Abstract
Purpose
The purpose of this paper is to provide a comprehensive, yet concise, overview of the considerations and metrics required for partial least squares structural equation modeling (PLS-SEM) analysis and result reporting. Preliminary considerations are summarized first, including reasons for choosing PLS-SEM, recommended sample size in selected contexts, distributional assumptions, use of secondary data, statistical power and the need for goodness-of-fit testing. Next, the metrics as well as the rules of thumb that should be applied to assess the PLS-SEM results are covered. Besides presenting established PLS-SEM evaluation criteria, the overview includes the following new guidelines: PLSpredict (i.e., a novel approach for assessing a model’s out-of-sample prediction), metrics for model comparisons, and several complementary methods for checking the results’ robustness.
Design/methodology/approach
This paper provides an overview of previously and recently proposed metrics as well as rules of thumb for evaluating the research results based on the application of PLS-SEM.
Findings
Most of the previously applied metrics for evaluating PLS-SEM results are still relevant. Nevertheless, scholars need to be knowledgeable about recently proposed metrics (e.g. model comparison criteria) and methods (e.g. endogeneity assessment, latent class analysis and PLSpredict), and when and how to apply them to extend their analyses.
Research limitations/implications
Methodological developments associated with PLS-SEM are rapidly emerging. The metrics reported in this paper are useful for current applications, but they must be kept up to date with the latest developments in the PLS-SEM method.
Originality/value
In light of more recent research and methodological developments in the PLS-SEM domain, guidelines for the method’s use need to be continuously extended and updated. This paper is the most current and comprehensive summary of the PLS-SEM method and the metrics applied to assess its solutions.
Keywords
Citation
Hair, J.F., Risher, J.J., Sarstedt, M. and Ringle, C.M. (2019), "When to use and how to report the results of PLS-SEM", European Business Review, Vol. 31 No. 1, pp. 2-24. https://doi.org/10.1108/EBR-11-2018-0203
Publisher: Emerald Publishing Limited
Copyright © 2019, Emerald Publishing Limited
Introduction
For many years, covariance-based structural equation modeling (CB-SEM) was the dominant method for analyzing complex interrelationships between observed and latent variables. Until around 2010, far more articles published in social science journals used CB-SEM than partial least squares structural equation modeling (PLS-SEM). In recent years, the number of published articles using PLS-SEM has increased significantly relative to CB-SEM (Hair et al., 2017b). PLS-SEM is now widely applied in many social science disciplines, including organizational management (Sosik et al., 2009), international management (Richter et al., 2015), human resource management (Ringle et al., 2019), management information systems (Ringle et al., 2012), operations management (Peng and Lai, 2012), marketing management (Hair et al., 2012b), management accounting (Nitzl, 2016), strategic management (Hair et al., 2012a), hospitality management (Ali et al., 2018b) and supply chain management (Kaufmann and Gaeckler, 2015). Several textbooks (e.g., Garson, 2016; Ramayah et al., 2016), edited volumes (e.g., Avkiran and Ringle, 2018; Ali et al., 2018a), and special issues of scholarly journals (e.g., Rasoolimanesh and Ali, 2018; Shiau et al., 2019) illustrate PLS-SEM or propose methodological extensions.
The PLS-SEM method is very appealing to many researchers as it enables them to estimate complex models with many constructs, indicator variables and structural paths without imposing distributional assumptions on the data. More importantly, however, PLS-SEM is a causal-predictive approach to SEM that emphasizes prediction in estimating statistical models, whose structures are designed to provide causal explanations (Wold, 1982; Sarstedt et al., 2017a). The technique thereby overcomes the apparent dichotomy between explanation – as typically emphasized in academic research – and prediction, which is the basis for developing managerial implications (Hair et al., 2019). Additionally, user-friendly software packages are available that generally require little technical knowledge about the method, such as PLS-Graph (Chin, 2003) and SmartPLS (Ringle et al., 2015; Ringle et al., 2005), while more complex packages for statistical computing software environments, such as R, can also execute PLS-SEM (e.g. semPLS; Monecke and Leisch, 2012). Authors such as Richter et al. (2016), Rigdon (2016) and Sarstedt et al. (2017a) provide more detailed arguments and discussions on when to use and not to use PLS-SEM.
The objective of this paper is to explain the procedures and metrics that are applied by editors and journal review boards to assess the reporting quality of PLS-SEM findings. We first summarize several initial considerations when choosing to use PLS-SEM and cover aspects such as sample sizes, distributional assumptions and goodness-of-fit testing. Then, we discuss model evaluation, including rules of thumb and introduce important advanced options that can be used. Our discussion also covers PLSpredict, a new method for assessing a model’s out-of-sample predictive power (Shmueli et al., 2016; Shmueli et al., 2019), which researchers should routinely apply, especially when drawing conclusions that affect business practices and have managerial implications. Next, we introduce several complementary methods for assessing the results’ robustness when it comes to measurement model specification, nonlinear structural model effects, endogeneity and unobserved heterogeneity (Hair et al., 2018; Latan, 2018). Figure 1 illustrates the various aspects that we discuss in the following sections.
Preliminary considerations
The Swedish econometrician Herman O. A. Wold (1975, 1982, 1985) developed the statistical underpinnings of PLS-SEM. The method was initially known as, and is sometimes still referred to as, PLS path modeling (Hair et al., 2011). PLS-SEM estimates partial model structures by combining principal components analysis with ordinary least squares regressions (Mateos-Aparicio, 2011). This method is typically viewed as an alternative to Jöreskog’s (1973) CB-SEM, which has numerous – typically very restrictive – assumptions (Hair et al., 2011).
Jöreskog’s (1973) CB-SEM, which is often executed by software packages such as LISREL or AMOS, uses the covariance matrix of the data and estimates the model parameters by only considering common variance. In contrast, PLS-SEM is referred to as variance-based, as it accounts for the total variance and uses the total variance to estimate parameters (Hair et al., 2017b).
In the past decade, there has been a considerable debate about which situations are more or less appropriate for using PLS-SEM (Goodhue et al., 2012; Marcoulides et al., 2012; Marcoulides and Saunders, 2006; Rigdon, 2014a; Henseler et al., 2014; Khan et al., 2019). In the following sections, we summarize several initial considerations on when to use PLS-SEM (Hair et al., 2013). Furthermore, we compare the differences between CB-SEM and PLS-SEM (Marcoulides and Chin, 2013; Rigdon, 2016). In doing so, we note that recent research has moved beyond the CB-SEM versus PLS-SEM debate (Rigdon et al., 2017; Rigdon, 2012), by establishing PLS-SEM as a distinct method for analyzing composite-based path models. Nevertheless, applied research is still confronted with the choice between the two SEM methods. Researchers should select PLS-SEM:
when the analysis is concerned with testing a theoretical framework from a prediction perspective;
when the structural model is complex and includes many constructs, indicators and/or model relationships;
when the research objective is to better understand increasing complexity by exploring theoretical extensions of established theories (exploratory research for theory development);
when the path model includes one or more formatively measured constructs;
when the research consists of financial ratios or similar types of data artifacts;
when the research is based on secondary/archival data, which may lack a comprehensive substantiation on the grounds of measurement theory;
when a small population restricts the sample size (e.g. business-to-business research); but PLS-SEM also works very well with large sample sizes;
when distribution issues are a concern, such as lack of normality; and
when research requires latent variable scores for follow-up analyses.
The above list provides an overview of points to consider when deciding whether PLS is an appropriate SEM method for a study.
Sample size
PLS-SEM offers solutions with small sample sizes when models comprise many constructs and a large number of items (Fornell and Bookstein, 1982; Willaby et al., 2015; Hair et al., 2017b). Technically, the PLS-SEM algorithm makes this possible by computing measurement and structural model relationships separately instead of simultaneously. In short, as its name implies, the algorithm computes partial regression relationships in the measurement and structural models by using separate ordinary least squares regressions. Reinartz et al. (2009), Henseler et al. (2014) and Sarstedt et al. (2016b) summarize how PLS-SEM provides solutions when methods such as CB-SEM develop inadmissible results or do not converge with complex models and small sample sizes, regardless of whether the data originates from a common or composite model population. Hair et al. (2013) indicate that certain scholars have falsely and misleadingly taken advantage of these characteristics to generate solutions with extremely small sample sizes, even when the population is large and accessible without much effort. This practice has unfortunately damaged the reputation of PLS-SEM to some extent (Marcoulides et al., 2009). Like other multivariate methods, PLS-SEM is not capable of turning a poor (e.g. non-representative) sample into a proper one to obtain valid model estimations.
PLS-SEM can certainly be used with smaller samples but the population’s nature determines the situations in which small sample sizes are acceptable (Rigdon, 2016). Assuming that other situational characteristics are equal, the more heterogeneous the population, the larger the sample size needed to achieve an acceptable sampling error (Cochran, 1977). If basic sampling theory guidelines are not considered (Sarstedt et al., 2018), questionable results are produced. To determine the required sample size, researchers should rely on power analyses that consider the model structure, the anticipated significance level and the expected effect sizes (Marcoulides and Chin, 2013). Alternatively, Hair et al. (2017a) have documented power tables indicating the required sample sizes for a variety of measurement and structural model characteristics. Finally, Kock and Hadaya (2018) suggest the inverse square root method and the gamma‐exponential method as two new approaches for minimum sample size calculations.
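To make the inverse square root method more concrete, the following minimal sketch assumes the commonly cited form of the rule, in which the minimum sample size is the square of a constant divided by the smallest path coefficient expected to be significant. The constant of roughly 2.486 (for a 5 per cent significance level and 80 per cent power), the function name and the example coefficients are assumptions for illustration and are not taken from this paper.

```python
import math


def inverse_square_root_n(min_path_coefficient, constant=2.486):
    """Approximate minimum sample size via the inverse square root method.

    Assumption (not from this paper): a 5% significance level and 80% power,
    for which the constant z_0.95 + z_0.80 is roughly 2.486.
    """
    if min_path_coefficient == 0:
        raise ValueError("the minimum path coefficient must be non-zero")
    # Round up because a sample size must be a whole number of observations.
    return math.ceil((constant / abs(min_path_coefficient)) ** 2)


# Hypothetical expected minimum path coefficients.
n_for_small_paths = inverse_square_root_n(0.10)
n_for_larger_paths = inverse_square_root_n(0.20)
```

The smaller the weakest path coefficient that should still be detected, the larger the required sample, which is consistent with the power considerations discussed above.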
Akter et al. (2017) note that most prior research on sample size requirements in PLS-SEM overlooked the fact that the method also proves valuable for analyzing large data quantities. In fact, PLS-SEM offers substantial potential for analyzing large data sets, including secondary data, which often do not include comprehensive substantiation on the grounds of measurement theory (Rigdon, 2013).
Distributional assumptions
Many scholars indicate that the absence of distributional assumptions is the main reason for choosing PLS-SEM (Hair et al., 2012b; Nitzl, 2016; do Valle and Assaker, 2016). While this is clearly an advantage of using PLS-SEM in social science studies, which almost always rely on nonnormal data, on its own, it is not a sufficient justification.
Scholars have noted that maximum likelihood estimation with CB-SEM is robust against violations of normality (Chou et al., 1991; Olsson et al., 2000), although it may require much larger sample sizes (Boomsma and Hoogland, 2001). If the size of the data set is limited, CB-SEM can produce abnormal results when data are nonnormal (Reinartz et al., 2009), while PLS-SEM shows a higher robustness in these situations (Sarstedt et al., 2016b).
It is noteworthy that in a limited number of situations, nonnormal data can also affect PLS-SEM results (Sarstedt et al., 2017a). For instance, bootstrapping with nonnormal data can produce peaked and skewed distributions. The use of the bias-corrected and accelerated (BCa) bootstrapping routine handles this issue to some extent, as it adjusts the confidence intervals for skewness (Efron, 1987). Only choosing PLS-SEM for data distribution reasons is, therefore, in most instances not sufficient, but it is definitely an advantage in combination with other reasons for using PLS-SEM.
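The skewness adjustment of the BCa routine can be sketched as follows. This is a simplified, standard-library illustration of the general BCa idea (bias correction from the bootstrap distribution, acceleration from jackknife estimates). It is not the implementation used in any PLS-SEM software. The statistic is a plain sample mean standing in for a model parameter, and the skewed data are simulated; all names are hypothetical.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, statistic, n_resamples=2000, level=0.95, seed=1):
    """Bias-corrected and accelerated (BCa) bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)
    # Bootstrap distribution: resample observations with replacement.
    boot = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    norm = NormalDist()
    # Bias correction: share of bootstrap estimates below the sample estimate.
    z0 = norm.inv_cdf(sum(b < theta_hat for b in boot) / n_resamples)
    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    accel = sum((jack_mean - j) ** 3 for j in jack) / (
        6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    )

    def adjusted_quantile(alpha):
        # Shift the nominal percentile to account for bias and skewness.
        z = norm.inv_cdf(alpha)
        adjusted = norm.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        index = min(max(int(adjusted * n_resamples), 0), n_resamples - 1)
        return boot[index]

    tail = (1 - level) / 2
    return adjusted_quantile(tail), adjusted_quantile(1 - tail)


# Simulated right-skewed data (an assumption for illustration only).
data_rng = random.Random(11)
sample = [data_rng.expovariate(1.0) for _ in range(80)]
lower, upper = bca_interval(sample, mean)
```

With right-skewed data, the adjusted percentiles typically shift both interval bounds upward relative to the plain percentile interval.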
Secondary data
Secondary (or archival) data are increasingly available to explore real-world phenomena (Avkiran and Ringle, 2018). Research that is based on secondary data typically focuses on a different objective than a standard CB-SEM analysis, which is strictly confirmatory in nature. More precisely, secondary data are mainly used in exploratory research to propose causal relationships in situations which have little clearly defined theory (Hair et al., 2017a, 2017b). Such settings require researchers to put greater emphasis on examining all possible relationships rather than achieving model fit (Nitzl, 2016). By its nature, this process creates large complex models that cannot be analyzed with the full information CB-SEM method. In contrast, the iterative approach of PLS-SEM uses limited information, making the method more robust and not constrained by the requirements of CB-SEM (Hair et al., 2014). Thus, PLS-SEM is very suitable for exploratory research with secondary data, because it offers the flexibility needed for the interplay between theory and data (Nitzl, 2016) or, as Wold (1982, p. 29) notes, “soft modeling is primarily designed for research contexts that are simultaneously data-rich and theory-skeletal.” Furthermore, the increasing popularity of secondary data analysis (e.g. by using data that stem from company databases, social media, customer tracking, national statistical bureaus or publicly available survey data) shifts the research focus from strictly confirmatory to predictive and causal-predictive modeling. Such research settings are a perfect fit for the prediction-oriented PLS-SEM approach.
PLS-SEM also proves valuable for analyzing secondary data from a measurement theory perspective. Unlike survey measures, which are usually crafted to confirm a well-developed theory, measures used in secondary data sources are typically not created and refined over time for confirmatory analyses (Sarstedt and Mooi, 2019). Thus, achieving model fit with secondary data measures is unlikely in most research situations when using CB-SEM. Furthermore, researchers who use secondary data do not have the opportunity to revise or refine the measurement model to achieve fit. Another major advantage of PLS-SEM in this context is that it permits the unrestricted use of single-item and formative measures (Hair et al., 2017a). This is extremely valuable for archival research, because many measures are actually artifacts found in corporate databases, such as financial ratios and other firm-fixed factors (Hair et al., 2014; Richter et al., 2016). Often, several types of financial data may be used to create an index as a measure of performance (Sarstedt et al., 2017a, 2017b). For instance, Ittner et al. (1997) operationalized strategy with four indicators as follows: the ratio of research and development to sales, the market-to-book ratio, the ratio of employees to sales and the number of new product or service introductions. Similarly, secondary data could be used to form an index of a company’s communication activities, covering aspects such as online advertising, sponsoring or product placement (Sarstedt and Mooi, 2019). PLS-SEM should always be the preferred approach in situations with formatively measured constructs, because a MIMIC approach in CB-SEM imposes constraints on the model that often contradict the theoretical assumptions (Sarstedt et al., 2016b).
Statistical power
When using PLS-SEM, researchers benefit from the method’s high degree of statistical power compared to CB-SEM (Reinartz et al., 2009; Hair et al., 2017b). This characteristic holds even when estimating common factor model data as assumed by CB-SEM (Sarstedt et al., 2016b). Greater statistical power means that PLS-SEM is more likely to identify relationships as significant when they are indeed present in the population (Sarstedt and Mooi, 2019).
The PLS-SEM characteristic of higher statistical power is quite useful for exploratory research that examines less developed or still developing theory. Wold (1985, p. 590) describes the use of PLS-SEM as “a dialogue between the investigator and the computer. Tentative improvements of the model–such as the introduction of a new latent variable, an indicator, or an inner relation, or the omission of such an element–are tested for predictive relevance […] and the various pilot studies are a speedy and low-cost matter.” Of particular importance, however, is that PLS-SEM is not only appropriate for exploratory research but also for confirmatory research (Hair et al., 2017a).
Goodness-of-fit
While CB-SEM strongly relies on the concept of model fit, this is much less the case with PLS-SEM (Hair et al., 2019). Consequently, some researchers incorrectly conclude that PLS-SEM is not useful for theory testing and confirmation (Westland, 2015). A couple of methodologists have endorsed model fit measures for PLS-SEM (Henseler et al., 2016a), but researchers should be very cautious when considering the applicability of these measures for PLS-SEM (Henseler and Sarstedt, 2013; Hair et al., 2019). First, a comprehensive assessment of these measures has not been conducted so far. Therefore, any thresholds (guidelines) advocated in the literature should be considered as very tentative. Second, as the algorithm for obtaining PLS-SEM solutions is not based on minimizing the divergence between observed and estimated covariance matrices, the concept of Chi-square-based model fit measures and their extensions – as used in CB-SEM – is not applicable. Hence, even bootstrap-based model fit assessments on the grounds of, for example, some distance measure or the SRMR (Henseler et al., 2016a; Henseler et al., 2017), which quantify the divergence between the observed and estimated covariance matrices, should be considered with extreme caution. Third, scholars have questioned whether the concept of model fit, as applied in the context of CB-SEM research, is of value to PLS-SEM applications in general (Hair et al., 2017a; Rigdon, 2012; Lohmöller, 1989).
PLS-SEM primarily focuses on the interplay between prediction and theory testing and results should be validated accordingly (Shmueli, 2010). In this context, scholars have recently proposed new evaluation procedures that are designed specifically for PLS-SEM’s prediction-oriented nature (Shmueli et al., 2016).
Evaluation of partial least squares-structural equation modeling results
The first step in evaluating PLS-SEM results involves examining the measurement models. The relevant criteria differ for reflective and formative constructs. If the measurement models meet all the required criteria, researchers then need to assess the structural model (Hair et al., 2017a). As with most statistical methods, PLS-SEM has rules of thumb that serve as guidelines to evaluate model results (Chin, 2010; Götz et al., 2010; Henseler et al., 2009; Chin, 1998; Tenenhaus et al., 2005; Roldán and Sánchez-Franco, 2012; Hair et al., 2017a). Rules of thumb – by their very nature – are broad guidelines that suggest how to interpret the results, and they typically vary depending on the context. As an example, reliability for exploratory research should be a minimum of 0.60, while reliability for research that depends on established measures should be 0.70 or higher. The final step in interpreting PLS-SEM results involves running one or more robustness checks to support the stability of results. The relevance of these robustness checks depends on the research context, such as the aim of the analysis and the availability of data.
Assessing reflective measurement models
The first step in reflective measurement model assessment involves examining the indicator loadings. Loadings above 0.708 are recommended, as they indicate that the construct explains more than 50 per cent of the indicator’s variance, thus providing acceptable item reliability.
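The 0.708 threshold follows from squaring a standardized loading, which gives the share of the indicator's variance explained by the construct (0.708² ≈ 0.50). A minimal sketch with hypothetical indicator names and loadings:

```python
def indicator_reliability(loading):
    """Share of an indicator's variance explained by its construct."""
    return loading ** 2


# Hypothetical standardized loadings for three indicators.
loadings = {"x1": 0.82, "x2": 0.708, "x3": 0.65}

# Flag indicators for which the construct explains 50 per cent or less.
flagged = [
    name for name, loading in loadings.items()
    if indicator_reliability(loading) <= 0.50
]
```

Here only `x3` is flagged, because 0.65² ≈ 0.42 falls short of the 50 per cent mark.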
The second step is assessing internal consistency reliability, most often using Jöreskog’s (1971) composite reliability. Higher values generally indicate higher levels of reliability. For example, reliability values between 0.60 and 0.70 are considered “acceptable in exploratory research,” values between 0.70 and 0.90 range from “satisfactory to good.” Values of 0.95 and higher are problematic, as they indicate that the items are redundant, thereby reducing construct validity (Diamantopoulos et al., 2012; Drolet and Morrison, 2001). Reliability values of 0.95 and above also suggest the possibility of undesirable response patterns (e.g. straight lining), thereby triggering inflated correlations among the indicators’ error terms. Cronbach’s alpha is another measure of internal consistency reliability that assumes similar thresholds, but produces lower values than composite reliability. Specifically, Cronbach’s alpha is a less precise measure of reliability, as the items are unweighted. In contrast, with composite reliability, the items are weighted based on the construct indicators’ individual loadings and, hence, this reliability is higher than Cronbach’s alpha. While Cronbach’s alpha may be too conservative, the composite reliability may be too liberal, and the construct’s true reliability is typically viewed as within these two extreme values. As an alternative, Dijkstra and Henseler (2015) proposed ρA as an approximately exact measure of construct reliability, which usually lies between Cronbach’s alpha and the composite reliability. Hence, ρA may represent a good compromise if one assumes that the factor model is correct.
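The relationship between the two coefficients can be illustrated with a short sketch. It assumes standardized indicators and a single-factor model, so that the correlation implied between two items is the product of their loadings. The composite reliability formula is the usual one based on loadings, the alpha shown is the standardized alpha implied by those loadings (not an alpha computed from raw data), and the loadings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def composite_reliability(loadings):
    """Composite reliability from standardized loadings."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


def implied_standardized_alpha(loadings):
    """Standardized Cronbach's alpha implied by a one-factor model.

    Assumption: the correlation between items i and j equals the product
    of their loadings, so every item receives the same (unit) weight.
    """
    k = len(loadings)
    mean_correlation = mean(a * b for a, b in combinations(loadings, 2))
    return k * mean_correlation / (1 + (k - 1) * mean_correlation)


loadings = [0.70, 0.80, 0.90]  # hypothetical values
rho_c = composite_reliability(loadings)
alpha = implied_standardized_alpha(loadings)
```

With unequal loadings, alpha comes out slightly below composite reliability (about 0.840 versus 0.845 here). With equal loadings the two coefficients coincide.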
In addition, researchers can use bootstrap confidence intervals to test if the construct reliability is significantly higher than the recommended minimum threshold (e.g. the lower bound of the 95 per cent confidence interval of the construct reliability is higher than 0.70). Similarly, they can test if construct reliability is significantly lower than the recommended maximum threshold (e.g. the upper bound of the 95 per cent confidence interval of the construct reliability is lower than 0.95). To obtain the bootstrap confidence intervals, in line with Aguirre-Urreta and Rönkkö (2018), researchers should generally use the percentile method. However, when the reliability coefficient’s bootstrap distribution is skewed, the BCa method should be preferred to obtain bootstrap confidence intervals.
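A percentile bootstrap interval for a reliability coefficient can be sketched as follows. To keep the example self-contained, it uses Cronbach's alpha computed from raw item scores rather than a PLS-SEM-based coefficient, and the item data are simulated. The function names, seeds and data-generating process are assumptions for illustration only.

```python
import random


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (rows of item scores)."""
    k = len(rows[0])
    item_vars = [sample_variance([row[j] for row in rows]) for j in range(k)]
    total_var = sample_variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def percentile_ci(rows, statistic, n_resamples=1000, level=0.95, seed=1):
    """Percentile bootstrap interval: resample respondents with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Simulated data: four items driven by one common factor plus noise.
data_rng = random.Random(7)
rows = []
for _ in range(200):
    factor = data_rng.gauss(0, 1)
    rows.append([factor + data_rng.gauss(0, 0.5) for _ in range(4)])

lower, upper = percentile_ci(rows, cronbach_alpha)
above_minimum = lower > 0.70   # reliability significantly above 0.70?
below_maximum = upper < 0.95   # reliability significantly below 0.95?
```

The two boolean checks mirror the lower-bound and upper-bound tests described above.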
The third step of the reflective measurement model assessment addresses the convergent validity of each construct measure. Convergent validity is the extent to which the construct converges to explain the variance of its items. The metric used for evaluating a construct’s convergent validity is the average variance extracted (AVE) for all items on each construct. To calculate the AVE, one squares the loading of each indicator on a construct and computes the mean of the squared loadings. An AVE of 0.50 or higher is acceptable, indicating that the construct explains at least 50 per cent of the variance of its items.
The fourth step is to assess discriminant validity, which is the extent to which a construct is empirically distinct from other constructs in the structural model. Fornell and Larcker (1981) proposed the traditional metric and suggested that each construct’s AVE should be compared to the squared inter-construct correlation (as a measure of shared variance) of that same construct and all other reflectively measured constructs in the structural model. The shared variance for all model constructs should not be larger than their AVEs. Recent research indicates, however, that this metric is not suitable for discriminant validity assessment. For example, Henseler et al. (2015) show that the Fornell-Larcker criterion does not perform well, particularly when the indicator loadings on a construct differ only slightly (e.g. all the indicator loadings are between 0.65 and 0.85).
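For illustration only, the traditional Fornell-Larcker comparison can be written in a few lines. The loadings and the construct correlation below are hypothetical values, and the sketch is included to make the comparison concrete, not to recommend the criterion, given the limitations noted above.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def fornell_larcker_ok(ave_a, ave_b, correlation):
    """True if the shared variance is not larger than either construct's AVE."""
    shared_variance = correlation ** 2
    return ave_a >= shared_variance and ave_b >= shared_variance


# Hypothetical loadings for two reflectively measured constructs.
ave_first = ave([0.82, 0.79, 0.85])
ave_second = ave([0.75, 0.71, 0.80])
print("AVEs:", round(ave_first, 3), round(ave_second, 3))
print("criterion met:", fornell_larcker_ok(ave_first, ave_second, correlation=0.65))
```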
As a replacement, Henseler et al. (2015) proposed the heterotrait-monotrait (HTMT) ratio of the correlations (Voorhees et al., 2016). The HTMT is defined as the mean value of the item correlations across constructs relative to the (geometric) mean of the average correlations for the items measuring the same construct. Discriminant validity problems are present when HTMT values are high. Henseler et al. (2015) propose a threshold value of 0.90 for structural models with constructs that are conceptually very similar, for instance cognitive satisfaction, affective satisfaction and loyalty. In such a setting, an HTMT value above 0.90 would suggest that discriminant validity is not present. But when constructs are conceptually more distinct, a lower, more conservative, threshold value is suggested, such as 0.85 (Henseler et al., 2015). In addition to these guidelines, bootstrapping can be applied to test whether the HTMT value is significantly different from 1.00 (Henseler et al., 2015) or a lower threshold value such as 0.85 or 0.90, which should be defined based on the study context (Franke and Sarstedt, 2019). More specifically, the researcher can examine if the upper bound of the 95 per cent confidence interval of HTMT is lower than 0.90 or 0.85.
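The HTMT definition above can be sketched directly from an item correlation matrix. The matrix and the item-to-construct assignment are made-up values; the sketch assumes at least two items per construct and does not include the bootstrap confidence interval.

```python
from math import sqrt


def mean_between(corr, items_a, items_b):
    """Mean correlation between items of two different constructs."""
    values = [corr[i][j] for i in items_a for j in items_b]
    return sum(values) / len(values)


def mean_within(corr, items):
    """Mean correlation among the items of one construct (each pair once)."""
    values = [corr[i][j] for i in items for j in items if i < j]
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio of correlations for two constructs."""
    heterotrait = mean_between(corr, items_a, items_b)
    monotrait = sqrt(mean_within(corr, items_a) * mean_within(corr, items_b))
    return heterotrait / monotrait


# Hypothetical correlations: items 0-1 measure construct A, items 2-3 construct B.
corr = [
    [1.00, 0.80, 0.40, 0.50],
    [0.80, 1.00, 0.45, 0.35],
    [0.40, 0.45, 1.00, 0.60],
    [0.50, 0.35, 0.60, 1.00],
]
value = htmt(corr, [0, 1], [2, 3])
print("HTMT:", round(value, 3))
print("below 0.85:", value < 0.85)
```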
Assessing formative measurement models
PLS-SEM is the preferred approach when formative constructs are included in the structural model (Hair et al., 2019). Formative measurement models are evaluated based on the following: convergent validity, indicator collinearity, and the statistical significance and relevance of the indicator weights (Hair et al., 2017a).
For formatively measured constructs, convergent validity is assessed by the correlation of the construct with an alternative measure of the same concept. Originally proposed by Chin (1998), the procedure is referred to as redundancy analysis. To execute this procedure for determining convergent validity, researchers must plan, as early as the research design stage, to include alternative reflectively measured indicators of the same concept in their questionnaire. Cheah et al. (2018) show that a single item, which captures the essence of the construct under consideration, is generally sufficient as an alternative measure, despite limitations with regard to criterion validity (Sarstedt et al., 2016a). When the model is based on secondary data, a variable measuring a similar concept would be used (Houston, 2004). Hair et al. (2017a) suggest that the correlation of the formatively measured construct with the single-item construct, measuring the same concept, should be 0.70 or higher.
The variance inflation factor (VIF) is often used to evaluate collinearity of the formative indicators. VIF values of 5 or above indicate critical collinearity issues among the indicators of formatively measured constructs. However, collinearity issues can also occur at lower VIF values of 3-5 (Mason and Perreault, 1991; Becker et al., 2015). Ideally, the VIF values should be close to 3 or lower.
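As a minimal sketch of the collinearity check, the VIF of each indicator can be read from the diagonal of the inverse of the indicators’ correlation matrix, which is equivalent to 1 / (1 − R²) from regressing each indicator on the others. The correlation matrix below is hypothetical, and the small matrix inversion routine is written out only to keep the example free of external libraries.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]  # zero here would mean perfect collinearity
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(corr):
    """VIF per indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Hypothetical correlations among three formative indicators.
corr = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.7],
    [0.5, 0.7, 1.0],
]
values = vif(corr)
for index, v in enumerate(values):
    status = "critical" if v >= 5 else "possible issue" if v >= 3 else "ok"
    print("indicator", index, "VIF", round(v, 2), status)
```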
In the third and final step, researchers need to assess the indicator weights’ statistical significance and relevance (i.e. size). PLS-SEM is a nonparametric method; therefore, bootstrapping is used to determine statistical significance (Chin, 1998). Hair et al. (2017a) suggest using BCa bootstrap confidence intervals for significance testing in case the bootstrap distribution of the indicator weights is skewed. Otherwise, researchers should use the percentile method to construct bootstrap-based confidence intervals (Aguirre-Urreta and Rönkkö, 2018). If the confidence interval of an indicator weight includes zero, the weight is not statistically significant and the indicator should be considered for removal from the measurement model. However, a nonsignificant indicator weight is not necessarily evidence of poor measurement model quality. Instead, the indicator’s absolute contribution to the construct is considered (Cenfetelli and Bassellier, 2009), as defined by its outer loading (i.e. the bivariate correlation between the indicator and its construct). According to Hair et al. (2017a), indicators with a nonsignificant weight should definitely be eliminated if the loading is also not significant. A low but significant loading of 0.50 and below suggests that one should consider deleting the indicator, unless there is strong support for its inclusion on the grounds of measurement theory.
When deciding whether to delete formative indicators based on statistical outcomes, researchers need to be cautious for the following reasons. First, formative indicator weights are a function of the number of indicators used to measure a construct. The greater the number of indicators, the lower their average weight. Formative measurement models are, therefore, inherently limited in the number of indicator weights that can be statistically significant (Cenfetelli and Bassellier, 2009). Second, indicators should seldom be removed from formative measurement models, as formative measurement theory requires the indicators to fully capture the entire domain of a construct, as defined by the researcher in the conceptualization stage. In contrast to reflective measurement models, formative indicators are not interchangeable, and removing even a single indicator can therefore reduce the measurement model’s content validity (Diamantopoulos and Winklhofer, 2001).
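The statistical part of the indicator retention rules can be summarized as a small decision function. This is a sketch of the rules as described in the text, with one added assumption: an indicator with a nonsignificant weight but a significant loading above 0.50 is retained because of its absolute contribution. The function name and return labels are hypothetical, and the outcome never replaces the content validity considerations discussed above.

```python
def formative_indicator_decision(weight_ci, loading, loading_significant):
    """Suggest how to treat a formative indicator from bootstrap results.

    weight_ci: (lower, upper) bootstrap confidence interval of the weight.
    loading: outer loading of the indicator on its construct.
    loading_significant: whether the loading is statistically significant.
    """
    lower, upper = weight_ci
    weight_significant = not (lower <= 0.0 <= upper)
    if weight_significant:
        return "retain"
    if not loading_significant:
        return "remove"
    if loading <= 0.50:
        return "consider removal unless measurement theory supports inclusion"
    # Assumption: a significant loading above 0.50 justifies retention.
    return "retain (absolute contribution)"


print(formative_indicator_decision((0.05, 0.40), 0.80, True))
print(formative_indicator_decision((-0.10, 0.30), 0.30, False))
print(formative_indicator_decision((-0.10, 0.30), 0.45, True))
print(formative_indicator_decision((-0.10, 0.30), 0.70, True))
```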
After assessing the statistical significance of the indicator weights, researchers need to examine each indicator’s relevance. The indicator weights are standardized to values between −1 and +1 but, in rare cases, can also take values lower or higher than this, which indicates an abnormal result (e.g. due to collinearity issues and/or small sample sizes). A weight close to 0 indicates a weak relationship, whereas weights close to +1 (or −1) indicate strong positive (or negative) relationships.
Assessing structural models
When the measurement model assessment is satisfactory, the next step in evaluating PLS-SEM results is assessing the structural model. Standard assessment criteria that should be considered include the coefficient of determination (R²), the blindfolding-based cross-validated redundancy measure Q², and the statistical significance and relevance of the path coefficients. In addition, researchers should assess their model’s out-of-sample predictive power by using the PLSpredict procedure (Shmueli et al., 2016).
Structural model coefficients for the relationships between the constructs are derived from estimating a series of regression equations. Before assessing the structural relationships, collinearity must be examined to make sure it does not bias the regression results. This process is similar to assessing formative measurement models, but the latent variable scores of the predictor constructs in a partial regression are used to calculate the VIF values. VIF values above 5 are indicative of probable collinearity issues among the predictor constructs, but collinearity problems can also occur at lower VIF values of 3-5 (Mason and Perreault, 1991; Becker et al., 2015). Ideally, the VIF values should be close to 3 or lower. If collinearity is a problem, a frequently used option is to create higher-order models that can be supported by theory (Hair et al., 2017a).
If collinearity is not an issue, the next step is examining the R² value of the endogenous construct(s). The R² measures the variance explained in each of the endogenous constructs and is therefore a measure of the model’s explanatory power (Shmueli and Koppius, 2011). The R² is also referred to as in-sample predictive power (Rigdon, 2012). The R² ranges from 0 to 1, with higher values indicating greater explanatory power. As a guideline, R² values of 0.75, 0.50 and 0.25 can be considered substantial, moderate and weak, respectively (Henseler et al., 2009; Hair et al., 2011). Acceptable R² values depend on the context, and in some disciplines an R² value as low as 0.10 is considered satisfactory, for example, when predicting stock returns (Raithel et al., 2012). More importantly, the R² is a function of the number of predictor constructs: the greater the number of predictor constructs, the higher the R². Therefore, the R² should always be interpreted in relation to the context of the study, based on the R² values from related studies and models of similar complexity. R² values can also be too high when the model overfits the data. That is, the partial regression model is too complex, which results in fitting the random noise inherent in the sample rather than reflecting the overall population. The same model would likely not fit another sample drawn from the same population (Sharma et al., 2019a). When measuring a concept that is inherently predictable, such as physical processes, R² values of 0.90 might be plausible. Similar R² values in a model that predicts human attitudes, perceptions and intentions likely indicate overfitting.
Researchers can also assess how the removal of a certain predictor construct affects an endogenous construct’s R² value. This metric is the f² effect size and is somewhat redundant to the size of the path coefficients. More precisely, the rank order of the predictor constructs’ relevance in explaining a dependent construct in the structural model is often the same when comparing the size of the path coefficients and the f² effect sizes. In such situations, the f² effect size should only be reported if requested by editors or reviewers. If the rank order of the constructs’ relevance in explaining a dependent construct differs between the path coefficients and the f² effect sizes, the researcher may report the f² effect size to explain the presence of, for example, partial or full mediation (Nitzl et al., 2016). As a rule of thumb, values higher than 0.02, 0.15 and 0.35 depict small, medium and large f² effect sizes, respectively (Cohen, 1988).
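The effect size calculation can be sketched as follows, using Cohen’s (1988) formula based on the R² values with and without the predictor construct. The R² values are illustrative, and the label for values at or below 0.02 is an assumption because the text does not name that range.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size of a predictor: change in R² relative to unexplained variance."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def effect_size_label(f2):
    """Rule-of-thumb label for an f² value."""
    if f2 > 0.35:
        return "large"
    if f2 > 0.15:
        return "medium"
    if f2 > 0.02:
        return "small"
    return "below the small threshold"  # assumed label, not from the text


# Hypothetical R² of an endogenous construct with and without one predictor.
f2 = f_squared(r2_included=0.50, r2_excluded=0.40)
print("f2:", round(f2, 3), effect_size_label(f2))
```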
Another means to assess the PLS path model’s predictive accuracy is by calculating the Q² value (Geisser, 1974; Stone, 1974). This metric is based on the blindfolding procedure that removes single points in the data matrix, imputes the removed points with the mean and estimates the model parameters (Rigdon, 2014b; Sarstedt et al., 2014). Using these estimates as input, the blindfolding procedure predicts the data points that were removed for all variables. As such, the Q² is not a measure of out-of-sample prediction, but rather combines aspects of out-of-sample prediction and in-sample explanatory power (Shmueli et al., 2016; Sarstedt et al., 2017a). Small differences between the predicted and the original values translate into a higher Q² value, thereby indicating a higher predictive accuracy. As a guideline, Q² values should be larger than zero for a specific endogenous construct to indicate predictive accuracy of the structural model for that construct. As a rule of thumb, Q² values higher than 0, 0.25 and 0.50 depict small, medium and large predictive relevance of the PLS path model, respectively. Similar to the f² effect sizes, it is possible to compute and interpret the q² effect sizes.
Many researchers interpret the R² statistic as a measure of their model’s predictive power. This interpretation is not entirely correct, however, as the R² only indicates the model’s in-sample explanatory power; it says nothing about the model’s out-of-sample predictive power (Shmueli, 2010; Shmueli and Koppius, 2011; Dolce et al., 2017). Addressing this concern, Shmueli et al. (2016) proposed a set of procedures for out-of-sample prediction that involves estimating the model on an analysis (i.e. training) sample and evaluating its predictive performance on data other than the analysis sample, referred to as a holdout sample. Their PLSpredict procedure generates holdout sample-based predictions in PLS-SEM and is an option in PLS-SEM software, such as SmartPLS (Ringle et al., 2015) and open source environments such as R (https://github.com/ISS-Analytics/pls-predict), so that researchers can easily apply the procedure.
PLSpredict executes k-fold cross-validation. A fold is a subgroup of the total sample and k is the number of subgroups. That is, the total data set is randomly split into k equally sized subsets of data. For example, a cross-validation based on k = 5 folds splits the sample into five equally sized data subsets (i.e. groups of data). PLSpredict then combines k − 1 subsets into a single analysis sample that is used to predict the remaining fifth data subset. The fifth data subset is the holdout sample for the first cross-validation run. This cross-validation process is then repeated k times (in this example, five times), with each of the five subsets used once as the holdout sample. Thus, each case in every holdout sample has a predicted value estimated with a sample in which that case was not used to estimate the model parameters. Shmueli et al. (2019) recommend setting k = 10, but researchers need to make sure the analysis sample for each subset (fold) meets minimum sample size guidelines. Also, other criteria to assess out-of-sample prediction without using a holdout sample are available, such as the Bayesian information criterion (BIC) and Geweke and Meese (GM) criterion (discussed later in this paper).
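The fold logic described above can be sketched in a few lines. This illustrates only the k-fold partitioning, not the PLSpredict procedure itself; the function name and the seed parameter are assumptions, and fold sizes differ by at most one case when the sample size is not divisible by k.

```python
import random


def k_fold_splits(n_cases, k, seed=0):
    """Yield (analysis, holdout) index lists for k-fold cross-validation."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)       # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k near-equal subsets
    for holdout in folds:
        analysis = [i for fold in folds if fold is not holdout for i in fold]
        yield analysis, holdout


splits = list(k_fold_splits(n_cases=23, k=5))
for run, (analysis, holdout) in enumerate(splits, start=1):
    print("run", run, "analysis cases:", len(analysis), "holdout cases:", len(holdout))
```

Each case appears in exactly one holdout sample, so every prediction comes from a model estimated without that case. Calling the function with different seed values gives different random partitions.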
The generation of the k subgroups is a random process and can sometimes result in extreme partitions that potentially lead to abnormal solutions. To avoid such abnormal solutions, researchers should run PLSpredict multiple times. Shmueli et al. (2019) generally recommend running the procedure ten times. However, when the objective is to duplicate how the PLS model will eventually be used to predict a new observation by using a single model (estimated from the entire data set), PLSpredict should be run only once (i.e. without repetitions).
For the PLSpredict-based assessment of a model’s predictive power, researchers can draw on several prediction statistics that quantify the amount of prediction error. For example, the mean absolute error (MAE) measures the average magnitude of the errors in a set of predictions without considering their direction (over or under). The MAE is thus the average absolute difference between the predictions and the actual observations, with all the individual differences having equal weight. Another popular prediction metric is the root mean squared error (RMSE), which is defined as the square root of the average of the squared differences between the predictions and the actual observations. As the RMSE squares the errors before averaging, the statistic assigns a greater weight to larger errors, which makes it particularly useful when large errors are undesirable, as is typically the case in business research applications.
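A minimal sketch of the two prediction statistics follows. The observed and predicted values are made up to show how the RMSE weights the two larger errors more heavily than the MAE does.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical holdout observations and predictions for one indicator.
actual = [4.0, 5.0, 3.0, 6.0]
predicted = [3.0, 6.0, 6.0, 3.0]  # errors of 1, -1, -3 and 3
print("MAE:", mae(actual, predicted))
print("RMSE:", round(rmse(actual, predicted), 3))
```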
When interpreting PLSpredict results, the focus should be on the model’s key endogenous construct, as opposed to examining the prediction errors for all endogenous constructs’ indicators. When the key target construct has been selected, the Q²predict statistic should be evaluated first to verify if the predictions outperform the most naïve benchmark, defined as the indicator means from the analysis sample (Shmueli et al., 2019). Then, researchers need to examine the prediction statistics. In most instances, researchers should use the RMSE. If the prediction error distribution is highly non-symmetric, the MAE is the more appropriate prediction statistic (Shmueli et al., 2019). The prediction statistics depend on the indicators’ measurement scales and their raw values do not carry much meaning. Therefore, researchers need to compare the RMSE (or MAE) values with a naïve benchmark. The recommended naïve benchmark (produced by the PLSpredict method) uses a linear regression model (LM) to generate predictions for the manifest variables, by running a linear regression of each of the dependent construct’s indicators on the indicators of the exogenous latent variables in the PLS path model (Danks and Ray, 2018). When comparing the RMSE (or MAE) values with the LM values, the following guidelines apply (Shmueli et al., 2019):
If the PLS-SEM analysis, compared to the naïve LM benchmark, yields higher prediction errors in terms of RMSE (or MAE) for all indicators, this indicates that the model lacks predictive power.
If the majority of the dependent construct indicators in the PLS-SEM analysis produce higher prediction errors compared to the naïve LM benchmark, this indicates that the model has a low predictive power.
If the minority (or the same number) of indicators in the PLS-SEM analysis yields higher prediction errors compared to the naïve LM benchmark, this indicates a medium predictive power.
If none of the indicators in the PLS-SEM analysis has higher RMSE (or MAE) values compared to the naïve LM benchmark, the model has high predictive power.
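The comparison with the two naïve benchmarks can be sketched as follows. The first function compares the model’s squared prediction errors with those of the indicator mean from the analysis sample; this formula is the commonly used form of the Q²predict statistic and is an assumption here, because the text does not state it. The second function applies the four guidelines above to per-indicator RMSE (or MAE) values. All numbers and return labels are hypothetical.

```python
def q2_predict(actual, predicted, analysis_mean):
    """1 minus the ratio of model error to indicator-mean benchmark error."""
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_mean = sum((a - analysis_mean) ** 2 for a in actual)
    return 1.0 - sse_model / sse_mean


def predictive_power(pls_errors, lm_errors):
    """Classify predictive power from per-indicator PLS and LM error values."""
    n = len(pls_errors)
    worse = sum(1 for pls, lm in zip(pls_errors, lm_errors) if pls > lm)
    if worse == 0:
        return "high"
    if worse == n:
        return "lacking"
    if worse > n / 2:
        return "low"
    return "medium"  # a minority, or exactly half, of the indicators


# Hypothetical holdout values for one indicator of the target construct.
q2 = q2_predict([4.0, 5.0, 3.0, 6.0], [4.5, 4.5, 3.5, 5.5], analysis_mean=4.5)
print("Q2_predict:", q2, "outperforms mean benchmark:", q2 > 0)

# Hypothetical RMSE values for four indicators of the target construct.
print(predictive_power([0.91, 1.02, 0.88, 1.10], [0.95, 1.00, 0.90, 1.12]))
```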
Having substantiated the model’s explanatory power and predictive power, the final step is to assess the statistical significance and relevance of the path coefficients. The interpretation of the path coefficients parallels that of the formative indicator weights. That is, researchers need to run bootstrapping to assess the path coefficients’ significance and evaluate their values, which typically fall between −1 and +1. In addition, researchers can interpret a construct’s indirect effect on a certain target construct via one or more intervening constructs. This effect type is particularly relevant in the assessment of mediating effects (Nitzl, 2016).
Similarly, researchers can interpret a construct’s total effect, defined as the sum of the direct and all indirect effects. A model’s total effects also serve as input for the importance-performance map analysis (IPMA) and extend the standard PLS-SEM results reporting of path coefficient estimates by adding a dimension to the analysis that considers the average values of the latent variable scores. More precisely, the IPMA compares the structural model’s total effects on a specific target construct with the average latent variable scores of this construct’s predecessors (Ringle and Sarstedt, 2016).
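The definition of a total effect as the sum of the direct and all indirect effects can be illustrated with a small path matrix. The sketch assumes a recursive model (no feedback loops), and the three-construct model and its coefficients are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def total_effects(direct):
    """Total effects in a recursive path model.

    direct[i][j] is the path coefficient from construct i to construct j.
    Indirect effects are products of coefficients along longer paths.
    """
    n = len(direct)
    total = [row[:] for row in direct]
    power = [row[:] for row in direct]
    for _ in range(n - 1):
        power = mat_mul(power, direct)  # effects via paths one step longer
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total


# Hypothetical model: X -> M (0.5), M -> Y (0.4), X -> Y (0.2).
direct = [
    [0.0, 0.5, 0.2],  # from X
    [0.0, 0.0, 0.4],  # from M
    [0.0, 0.0, 0.0],  # from Y
]
total = total_effects(direct)
print("total effect of X on Y:", round(total[0][2], 3))
print("indirect effect of X on Y:", round(total[0][2] - direct[0][2], 3))
```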
Finally, researchers may be interested in comparing different model configurations resulting from different theories or research contexts. Sharma et al. (2019a, 2019b) recently compared the efficacy of various metrics for model comparison tasks and found that Schwarz’s (1978) BIC and Geweke and Meese’s (1981) GM achieve a sound trade-off between model fit and predictive power in the estimation of PLS path models. Their research facilitates assessing out-of-sample prediction without using a holdout sample, and is particularly useful with PLS-SEM applications based on a sample that is too small to divide into useful analysis and holdout samples. Specifically, researchers should estimate each model separately and select the model that minimizes the BIC or GM value for a certain target construct. For example, a model that produces a BIC value of −270 should be preferred over a model that produces a BIC value of −150. Table I summarizes the metrics that need to be applied when interpreting and reporting PLS-SEM results.
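As a rough sketch of the model selection step, the BIC for the partial regression of a target construct can be computed from the sum of squared errors. The formula below is the standard form under normally distributed errors and is an assumption here, since the text does not give the formula; the sample size, error sums and parameter counts are hypothetical. The GM criterion is not shown.

```python
from math import log


def bic(sse, n, n_params):
    """BIC of a regression: n * ln(SSE / n) + n_params * ln(n).

    Assumption: Gaussian errors; n_params counts the predictors
    of the target construct plus one.
    """
    return n * log(sse / n) + n_params * log(n)


# Two hypothetical models for the same target construct and sample.
n = 100
bic_a = bic(sse=40.0, n=n, n_params=3)  # more parsimonious model
bic_b = bic(sse=38.0, n=n, n_params=6)  # slightly better fit, more predictors
preferred = "A" if bic_a < bic_b else "B"
print("BIC A:", round(bic_a, 2), "BIC B:", round(bic_b, 2), "preferred:", preferred)
```

The model with the lower value is preferred, so a small gain in fit does not offset the penalty for the additional predictors in this example.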
Robustness checks
Recent research has proposed complementary methods for assessing the robustness of PLS-SEM results (Hair et al., 2018; Latan, 2018). These methods address either the measurement model or the structural model (Table I).
In terms of measurement models, Gudergan et al. (2008) have proposed the confirmatory tetrad analysis (CTA-PLS), which enables empirically substantiating the specification of measurement models (i.e. reflective versus formative). The CTA-PLS relies on the concept of tetrads that describe the difference of the product of one pair of covariances and the product of another pair of covariances (Bollen and Ting, 2000). In a reflective measurement model, these tetrads should vanish (i.e. they become zero) as the indicators are assumed to stem from the same domain. If one of a construct’s tetrads is significantly different from 0, one rejects the null hypothesis and assumes a formative instead of a reflective measurement model specification. It should be noted, however, that CTA-PLS is an empirical test of measurement models and the primary method to determine reflective or formative model specification is theoretical reasoning (Hair et al., 2017a).
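The tetrad concept can be illustrated with four indicators. In the sketch below, the covariances are those implied by a hypothetical one-factor (reflective) model with made-up loadings, for which all three tetrads vanish. Testing whether sample tetrads differ significantly from zero, as CTA-PLS does, requires bootstrapping and is not shown.

```python
def tetrads(cov, i, j, k, l):
    """The three tetrads of indicators i, j, k, l from covariance matrix cov."""
    return (
        cov[i][j] * cov[k][l] - cov[i][k] * cov[j][l],
        cov[i][k] * cov[l][j] - cov[i][l] * cov[k][j],
        cov[i][l] * cov[j][k] - cov[i][j] * cov[l][k],
    )


# Covariances implied by a one-factor model: cov(a, b) = loading_a * loading_b.
loadings = [0.7, 0.8, 0.6, 0.9]  # hypothetical standardized loadings
cov = [[1.0 if a == b else loadings[a] * loadings[b] for b in range(4)]
       for a in range(4)]

values = tetrads(cov, 0, 1, 2, 3)
print([round(v, 10) for v in values])
```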
In terms of the structural model, Sarstedt et al. (2019) suggest that researchers should consider nonlinear effects, endogeneity and unobserved heterogeneity. First, to test whether relationships are nonlinear, researchers can run Ramsey’s (1969) regression equation specification error test (RESET) on the latent variable scores in the path model’s partial regressions. A significant test statistic in any of the partial regressions indicates a potential nonlinear effect. In addition, researchers can establish an interaction term to map a nonlinear effect in the model and test its statistical significance using bootstrapping (Svensson et al., 2018).
Second, when the research perspective is primarily explanatory in a PLS-SEM analysis, researchers should test for endogeneity. Endogeneity typically occurs when researchers have omitted a construct that correlates with one or more predictor constructs and the dependent construct in a partial regression of the PLS path model. To assess and treat endogeneity, researchers should follow Hult et al.’s (2018) systematic procedure, starting with the application of Park and Gupta’s (2012) Gaussian copula approach. If the approach indicates an endogeneity issue, researchers should implement instrumental variables that are highly correlated with the independent constructs, but are uncorrelated with the dependent construct’s error term to explain the sources of endogeneity (Bascle, 2008). Importantly, however, endogeneity assessment is only relevant when the researcher’s focus is on explanation, rather than when pursuing causal-predictive goals.
Third, unobserved heterogeneity occurs when subgroups of data exist that produce substantially different model estimates. If this is the case, estimating the model based on the entire data set is very likely to produce misleading results (Becker et al., 2013). Hence, any PLS-SEM analysis should include a routine check for unobserved heterogeneity to ascertain whether or not the analysis of the entire data set is reasonable or not. Sarstedt et al. (2017b) proposed a systematic procedure for identifying and treating unobserved heterogeneity. Using information criteria derived from a finite mixture PLS (Hahn et al., 2002; Sarstedt et al., 2011), researchers can identify the number of segments to be extracted from the data (if any) (Hair et al., 2016; Matthews et al., 2016). If heterogeneity is present at a critical level, the next step involves running the PLSprediction-oriented segmentation procedure (Becker et al., 2013) to disclose the data’s segment structure. Finally, researchers should attempt to identify suitable explanatory variables that characterize the uncovered segments (e.g. by using contingency table or exhaustive CHAID analyses; Ringle et al., 2010). If suitable explanatory variables are available, a moderator (Henseler and Fassott, 2010; Becker et al., 2018) or multigroup analysis (Chin and Dibbern, 2010; Matthews, 2017), in combination with a measurement invariance assessment (Henseler et al., 2016b), offers further particularized findings, conclusions and implications.
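The segment-retention step can be illustrated with a small sketch that compares information criteria across FIMIX-PLS solutions with different numbers of segments. The choice of AIC3, BIC and CAIC here is an assumption made for illustration (the text above only refers to information criteria in general), and the log-likelihood values, parameter counts and sample size are hypothetical rather than taken from any study.

```python
from math import log


def information_criteria(log_likelihood, n_parameters, n_observations):
    """Return AIC3, BIC and CAIC for one segment solution (smaller is better)."""
    minus_two_lnl = -2.0 * log_likelihood
    return {
        "AIC3": minus_two_lnl + 3.0 * n_parameters,
        "BIC": minus_two_lnl + n_parameters * log(n_observations),
        "CAIC": minus_two_lnl + n_parameters * (log(n_observations) + 1.0),
    }


def best_segment_count(solutions, n_observations):
    """For each criterion, return the number of segments with the smallest value."""
    table = {
        segments: information_criteria(lnl, k, n_observations)
        for segments, (lnl, k) in solutions.items()
    }
    criteria = ["AIC3", "BIC", "CAIC"]
    return {c: min(table, key=lambda s: table[s][c]) for c in criteria}


# Hypothetical FIMIX-PLS output: segments -> (log-likelihood, free parameters).
fimix_solutions = {1: (-1250.0, 7), 2: (-1180.0, 15), 3: (-1170.0, 23)}
best = best_segment_count(fimix_solutions, n_observations=300)
print(best)
```

When the criteria agree on a solution with more than one segment, as in this made-up example, that would point toward heterogeneity worth following up with the segmentation step. Disagreement among criteria calls for judgment, including whether the resulting segment sizes are large enough to estimate the model.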
Concluding observations
PLS-SEM is increasingly being applied to estimate structural equation models (Hair et al., 2014). Scholars need a comprehensive, yet concise, overview of the considerations and metrics needed to ensure that their analysis and reporting of PLS-SEM results are complete before submitting their article for review. Prior research has provided such reporting guidelines (Hair et al., 2011; Hair et al., 2013; Hair et al., 2012b; Chin, 2010; Tenenhaus et al., 2005; Henseler et al., 2009), which, in light of more recent research and methodological developments in the PLS-SEM domain, need to be continuously extended and updated. We hope this paper achieves this goal.
For researchers who have not used PLS-SEM in the past, this article is a good point of orientation when preparing and finalizing their manuscripts. Moreover, for researchers experienced in applying PLS-SEM, it is a good overview and reminder of how to prepare PLS-SEM manuscripts. This knowledge is also important for reviewers and journal editors to ensure the rigor of published PLS-SEM studies. We provide an overview of several recently proposed improvements (PLSpredict and model comparison metrics), as well as complementary methods for robustness checks (e.g. endogeneity assessment and latent class procedures), which we recommend applying, where appropriate, when using PLS-SEM. Finally, while a few researchers have published articles that are critical of the use of PLS-SEM, more recently several prominent researchers have acknowledged the value of PLS as an SEM technique (Petter, 2018). We believe that social science scholars would be remiss if they did not apply all statistical methods at their disposal to explore and better understand the phenomena they are researching.
Figures
Guidelines when using PLS-SEM
Reflective measurement models | |
Reflective indicator loadings | ≥0.708 |
Internal consistency reliability | Cronbach’s alpha is the lower bound, the composite reliability is the upper bound for internal consistency reliability. ρA usually lies between these bounds and may serve as a good representation of a construct’s internal consistency reliability, assuming that the factor model is correct. Minimum 0.70 (or 0.60 in exploratory research). Maximum of 0.95 to avoid indicator redundancy, which would compromise content validity. Recommended 0.70-0.90. Test if the internal consistency reliability is significantly higher (lower) than the recommended minimum (maximum) thresholds. Use the percentile method to construct the bootstrap-based confidence interval; in case of a skewed bootstrap distribution, use the BCa method |
Convergent validity | AVE ≥ 0.50 |
Discriminant validity | For conceptually similar constructs: HTMT < 0.90; for conceptually different constructs: HTMT < 0.85. Test if the HTMT is significantly lower than the threshold value |
Formative measurement models | |
Convergent validity (redundancy analysis) | ≥0.70 correlation |
Collinearity (VIF) | Probable (i.e. critical) collinearity issues when VIF ≥ 5; possible collinearity issues when VIF is between 3 and 5; ideally show that VIF < 3 |
Statistical significance of weights | p-value < 0.05 or the 95% confidence interval (based on the percentile method or, in case of a skewed bootstrap distribution, the BCa method) does not include zero |
Relevance of indicators with a significant weight | Larger significant weights are more relevant (contribute more) |
Relevance of indicators with a non-significant weight | Loadings of ≥0.50 that are statistically significant are considered relevant |
Structural model | |
Collinearity (VIF) | Probable (i.e. critical) collinearity issues when VIF ≥ 5; possible collinearity issues when VIF is between 3 and 5; ideally show that VIF < 3 |
R2 value | R2 values of 0.75, 0.50 and 0.25 are considered substantial, moderate and weak, respectively. R2 values of 0.90 and higher are typically indicative of overfitting |
Q2 value | Values larger than zero are meaningful. Values higher than 0, 0.25 and 0.50 depict small, medium and large predictive accuracy of the PLS path model, respectively |
PLSpredict | Set k = 10, assuming each subgroup meets the minimum required sample size. Use ten repetitions, assuming the sample size is large enough. Q2predict values > 0 indicate that the model outperforms the most naïve benchmark (i.e. the indicator means from the analysis sample). Compare the MAE (or the RMSE) value with the LM value of each indicator. Check if the PLS-SEM analysis (compared to the LM) yields higher prediction errors in terms of RMSE (or MAE) for all (no predictive power), the majority (low predictive power), the minority or the same number (medium predictive power) or none of the indicators (high predictive power) |
Model comparisons | Select the model that minimizes the value in BIC or GM compared to the other models in the set |
Robustness checks | |
Measurement models | CTA-PLS |
Structural model | Nonlinear effects Endogeneity Unobserved heterogeneity |
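The PLSpredict row of the guidelines can be read as a simple counting rule, sketched below under stated assumptions. The function name `predictive_power`, the indicator names and the error values are hypothetical. Treating ties (equal errors) as "not higher" is also an assumption, since the guideline does not address them. The sketch presumes the indicators have already passed the check that the model outperforms the naïve benchmark.

```python
def predictive_power(pls_errors, lm_errors):
    """Classify predictive power by counting indicators where PLS-SEM errs more than LM.

    Both arguments map indicator names to a prediction error (RMSE or MAE).
    """
    if not pls_errors or pls_errors.keys() != lm_errors.keys():
        raise ValueError("Both inputs must cover the same, non-empty set of indicators.")
    total = len(pls_errors)
    # Assumption: an indicator counts as worse only if the PLS-SEM error is strictly higher.
    worse = sum(1 for name in pls_errors if pls_errors[name] > lm_errors[name])
    if worse == 0:
        return "high"
    if worse == total:
        return "none"
    if worse > total - worse:
        return "low"  # PLS-SEM errs more for the majority of indicators
    return "medium"  # minority, or the same number, of indicators


# Hypothetical RMSE values for four indicators of a key endogenous construct.
pls_rmse = {"y1": 0.90, "y2": 1.10, "y3": 0.80, "y4": 1.00}
lm_rmse = {"y1": 0.95, "y2": 1.05, "y3": 0.85, "y4": 1.02}
result = predictive_power(pls_rmse, lm_rmse)
print(result)
```

In this made-up example, PLS-SEM has the higher error for only one of the four indicators, so the rule returns medium predictive power.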
References
Aguirre-Urreta, M.I. and Rönkkö, M. (2018), “Statistical inference with PLSc using bootstrap confidence intervals”, MIS Quarterly, Vol. 42 No. 3, pp. 1001-1020.
Akter, S., Fosso Wamba, S. and Dewan, S. (2017), “Why PLS-SEM is suitable for complex modelling? An empirical illustration in big data analytics quality”, Production Planning and Control, Vol. 28 Nos 11/12, pp. 1011-1021.
Ali, F., Rasoolimanesh, S.M. and Cobanoglu, C. (2018a), Applying Partial Least Squares in Tourism and Hospitality Research, Emerald, Bingley.
Ali, F., Rasoolimanesh, S.M., Sarstedt, M., Ringle, C.M. and Ryu, K. (2018b), “An assessment of the use of partial least squares structural equation modeling (PLS-SEM) in hospitality research”, International Journal of Contemporary Hospitality Management, Vol. 30 No. 1, pp. 514-538.
Avkiran, N.K. and Ringle, C.M. (2018), Partial Least Squares Structural Equation Modeling: Recent Advances in Banking and Finance, Springer International Publishing, Cham.
Bascle, G. (2008), “Controlling for endogeneity with instrumental variables in strategic management research”, Strategic Organization, Vol. 6 No. 3, pp. 285-327.
Becker, J.-M., Ringle, C.M. and Sarstedt, M. (2018), “Estimating moderating effects in PLS-SEM and PLSc-SEM: interaction term generation*data treatment”, Journal of Applied Structural Equation Modeling, Vol. 2 No. 2, pp. 1-21.
Becker, J.-M., Rai, A., Ringle, C.M. and Völckner, F. (2013), “Discovering unobserved heterogeneity in structural equation models to avert validity threats”, MIS Quarterly, Vol. 37 No. 3, pp. 665-694.
Becker, J.-M., Ringle, C.M., Sarstedt, M. and Völckner, F. (2015), “How collinearity affects mixture regression results”, Marketing Letters, Vol. 26 No. 4, pp. 643-659.
Bollen, K.A. and Ting, K.-F. (2000), “A tetrad test for causal indicators”, Psychological Methods, Vol. 5 No. 1, pp. 3-22.
Boomsma, A. and Hoogland, J.J. (2001), “The robustness of LISREL modeling revisited”, in Cudeck, R., du Toit, S. and Sörbom, D. (Eds), Structural Equation Modeling: Present and Future, Scientific Software International, Chicago, pp. 139-168.
Cenfetelli, R.T. and Bassellier, G. (2009), “Interpretation of formative measurement in information systems research”, MIS Quarterly, Vol. 33 No. 4, pp. 689-708.
Cheah, J.-H., Sarstedt, M., Ringle, C.M., Ramayah, T. and Ting, H. (2018), “Convergent validity assessment of formatively measured constructs in PLS-SEM: on using single-item versus multi-item measures in redundancy analyses”, International Journal of Contemporary Hospitality Management, Vol. 30 No. 11, pp. 3192-3210.
Chin, W.W. (1998), “The partial least squares approach to structural equation modeling”, in Marcoulides, G.A. (Ed.), Modern Methods for Business Research, Erlbaum, Mahwah, pp. 295-358.
Chin, W.W. (2003), PLS-Graph 3.0, Soft Modeling Inc, Houston.
Chin, W.W. (2010), “How to write up and report PLS analyses”, in Esposito Vinzi, V., Chin, W.W., Henseler, J., et al. (Eds), Handbook of Partial Least Squares: Concepts, Methods and Applications (Springer Handbooks of Computational Statistics Series), Springer, Heidelberg, Dordrecht, London, New York, NY, Vol. II, pp. 655-690.
Chin, W.W. and Dibbern, J. (2010), “A permutation based procedure for multi-group PLS analysis: results of tests of differences on simulated data and a cross cultural analysis of the sourcing of information system services between Germany and the USA”, in Esposito Vinzi, V., Chin, W.W., Henseler J. and Wang, H. (Eds), Handbook of Partial Least Squares: Concepts, Methods and Applications (Springer Handbooks of Computational Statistics Series), Springer, Heidelberg, Dordrecht, London, New York, NY, Vol. II, pp. 171-193.
Chou, C.-P., Bentler, P.M. and Satorra, A. (1991), “Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: a Monte Carlo study”, British Journal of Mathematical and Statistical Psychology, Vol. 44 No. 2, pp. 347-357.
Cochran, W.G. (1977), Sampling Techniques, Wiley, New York, NY.
Cohen, J. (1988), Statistical Power Analysis for the Behavioral Sciences, Lawrence Erlbaum Associates.
Danks, N. and Ray, S. (2018), “Predictions from partial least squares models”, in Ali, F., Rasoolimanesh, S.M. and Cobanoglu, C. (Eds), Applying Partial Least Squares in Tourism and Hospitality Research, Emerald, Bingley, pp. 35-52.
Diamantopoulos, A., Sarstedt, M., Fuchs, C., Wilczynski, P. and Kaiser, S. (2012), “Guidelines for choosing between multi-item and single-item scales for construct measurement: a predictive validity perspective”, Journal of the Academy of Marketing Science, Vol. 40 No. 3, pp. 434-449.
Diamantopoulos, A. and Winklhofer, H.M. (2001), “Index construction with formative indicators: an alternative to scale development”, Journal of Marketing Research, Vol. 38 No. 2, pp. 269-277.
Dijkstra, T.K. and Henseler, J. (2015), “Consistent partial least squares path modeling”, MIS Quarterly, Vol. 39 No. 2, pp. 297-316.
do Valle, P.O. and Assaker, G. (2016), “Using partial least squares structural equation modeling in tourism research: a review of past research and recommendations for future applications”, Journal of Travel Research, Vol. 55 No. 6, pp. 695-708.
Dolce, P., Esposito Vinzi, V. and Lauro, C. (2017), “Predictive path modeling through PLS and other component-based approaches: methodological issues and performance evaluation”, in Latan, H. and Noonan, R. (Eds), Partial Least Squares Path Modeling: Basic Concepts, Methodological Issues and Applications, Springer International Publishing, Cham, pp. 153-172.
Drolet, A.L. and Morrison, D.G. (2001), “Do we really need multiple-item measures in service research?”, Journal of Service Research, Vol. 3 No. 3, pp. 196-204.
Efron, B. (1987), “Better bootstrap confidence intervals”, Journal of the American Statistical Association, Vol. 82 No. 397, pp. 171-185.
Fornell, C.G. and Bookstein, F.L. (1982), “Two structural equation models: LISREL and PLS applied to consumer exit-voice theory”, Journal of Marketing Research, Vol. 19 No. 4, pp. 440-452.
Fornell, C.G. and Larcker, D.F. (1981), “Evaluating structural equation models with unobservable variables and measurement error”, Journal of Marketing Research, Vol. 18 No. 1, pp. 39-50.
Franke, G.R. and Sarstedt, M. (2019), “Heuristics versus statistics in discriminant validity testing: a comparison of four procedures”, Internet Research, Forthcoming.
Garson, G.D. (2016), Partial Least Squares Regression and Structural Equation Models, Statistical Associates, Asheboro.
Geisser, S. (1974), “A predictive approach to the random effects model”, Biometrika, Vol. 61 No. 1, pp. 101-107.
Geweke, J. and Meese, R. (1981), “Estimating regression models of finite but unknown order”, International Economic Review, Vol. 22 No. 1, pp. 55-70.
Goodhue, D.L., Lewis, W. and Thompson, R. (2012), “Does PLS have advantages for small sample size or non-normal data?”, MIS Quarterly, Vol. 36 No. 3, pp. 981-1001.
Götz, O., Liehr-Gobbers, K. and Krafft, M. (2010), “Evaluation of structural equation models using the partial least squares (PLS) Approach”, in Esposito Vinzi, V., Chin, W.W., Henseler, J., et al. (Eds), Handbook of Partial Least Squares: Concepts, Methods and Applications (Springer Handbooks of Computational Statistics Series), Springer, Heidelberg, Dordrecht, London, New York, NY, pp. 691-711.
Gudergan, S.P., Ringle, C.M., Wende, S. and Will, A. (2008), “Confirmatory tetrad analysis in PLS path modeling”, Journal of Business Research, Vol. 61 No. 12, pp. 1238-1249.
Hahn, C., Johnson, M.D., Herrmann, A. and Huber, F. (2002), “Capturing customer heterogeneity using a finite mixture PLS approach”, Schmalenbach Business Review, Vol. 54 No. 3, pp. 243-269.
Hair, J.F., Hult, G.T.M., Ringle, C.M. and Sarstedt, M. (2017a), A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), Sage, Thousand Oaks, CA.
Hair, J.F., Hult, G.T.M., Ringle, C.M., Sarstedt, M. and Thiele, K.O. (2017b), “Mirror, Mirror on the wall: a comparative evaluation of composite-based structural equation modeling methods”, Journal of the Academy of Marketing Science, Vol. 45 No. 5, pp. 616-632.
Hair, J.F., Ringle, C.M. and Sarstedt, M. (2011), “PLS-SEM: indeed a silver bullet”, Journal of Marketing Theory and Practice, Vol. 19 No. 2, pp. 139-151.
Hair, J.F., Ringle, C.M. and Sarstedt, M. (2013), “Partial least squares structural equation modeling: rigorous applications, better results and higher acceptance”, Long Range Planning, Vol. 46 Nos 1/2, pp. 1-12.
Hair, J.F., Sarstedt, M., Hopkins, L. and Kuppelwieser, V.G. (2014), “Partial least squares structural equation modeling (PLS-SEM): an emerging tool in business research”, European Business Review, Vol. 26 No. 2, pp. 106-121.
Hair, J.F., Sarstedt, M., Matthews, L. and Ringle, C.M. (2016), “Identifying and treating unobserved heterogeneity with FIMIX-PLS: part I – method”, European Business Review, Vol. 28 No. 1, pp. 63-76.
Hair, J.F., Sarstedt, M., Pieper, T.M. and Ringle, C.M. (2012a), “The use of partial least squares structural equation modeling in strategic management research: a review of past practices and recommendations for future applications”, Long Range Planning, Vol. 45 Nos 5/6, pp. 320-340.
Hair, J.F., Sarstedt, M. and Ringle, C.M. (2019), “Rethinking some of the rethinking of partial least squares”, European Journal of Marketing, Forthcoming.
Hair, J.F., Sarstedt, M., Ringle, C.M. and Gudergan, S.P. (2018), Advanced Issues in Partial Least Squares Structural Equation Modeling (PLS-SEM), Sage, Thousand Oaks, CA.
Hair, J.F., Sarstedt, M., Ringle, C.M., et al. (2012b), “An assessment of the use of partial least squares structural equation modeling in marketing research”, Journal of the Academy of Marketing Science, Vol. 40 No. 3, pp. 414-433.
Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M. and Calantone, R.J. (2014), “Common beliefs and reality about partial least squares: comments on Rönkkö and Evermann (2013)”, Organizational Research Methods, Vol. 17 No. 2, pp. 182-209.
Henseler, J. and Fassott, G. (2010), “Testing moderating effects in PLS path models: an illustration of available procedures”, in Esposito Vinzi, V., Chin, W.W., Henseler, J., et al. (Eds), Handbook of Partial Least Squares: Concepts, Methods and Applications (Springer Handbooks of Computational Statistics Series), Springer, Heidelberg, Dordrecht, London, New York, NY, Vol. II, pp. 713-735.
Henseler, J., Hubona, G.S. and Ray, P.A. (2016a), “Using PLS path modeling in new technology research: Updated guidelines”, Industrial Management and Data Systems, Vol. 116 No. 1, pp. 1-19.
Henseler, J., Hubona, G.S. and Ray, P.A. (2017), “Partial least squares path modeling: updated guidelines”, in Latan, H. and Noonan, R. (Eds), Partial Least Squares Structural Equation Modeling: Basic Concepts, Methodological Issues and Applications, Springer, Heidelberg, pp. 19-39.
Henseler, J., Ringle, C.M. and Sarstedt, M. (2015), “A new criterion for assessing discriminant validity in variance-based structural equation modeling”, Journal of the Academy of Marketing Science, Vol. 43 No. 1, pp. 115-135.
Henseler, J., Ringle, C.M. and Sarstedt, M. (2016b), “Testing measurement invariance of composites using partial least squares”, International Marketing Review, Vol. 33 No. 3, pp. 405-431.
Henseler, J., Ringle, C.M. and Sinkovics, R.R. (2009), “The use of partial least squares path modeling in international marketing”, in Sinkovics, R.R. and Ghauri, P.N. (Eds) Advances in International Marketing, Emerald, Bingley, pp. 277-320.
Henseler, J. and Sarstedt, M. (2013), “Goodness-of-fit indices for partial least squares path modeling”, Computational Statistics, Vol. 28 No. 2, pp. 565-580.
Houston, M.B. (2004), “Assessing the validity of secondary data proxies for marketing constructs”, Journal of Business Research, Vol. 57 No. 2, pp. 154-161.
Hult, G.T.M., Hair, J.F., Proksch, D., Sarstedt, M., Pinkwart, A. and Ringle, C.M. (2018), “Addressing endogeneity in international marketing applications of partial least squares structural equation modeling”, Journal of International Marketing, Vol. 26 No. 3, pp. 1-21.
Ittner, C.D., Larcker, D.F. and Rajan, M.V. (1997), “The choice of performance measures in annual bonus contracts”, Accounting Review, Vol. 72 No. 2, pp. 231-255.
Jöreskog, K.G. (1971), “Simultaneous factor analysis in several populations”, Psychometrika, Vol. 36 No. 4, pp. 409-426.
Jöreskog, K.G. (1973), “A general method for estimating a linear structural equation system”, in Goldberger, A.S. and Duncan, O.D. (Eds), Structural Equation Models in the Social Sciences, Seminar Press, New York, NY, pp. 255-284.
Kaufmann, L. and Gaeckler, J. (2015), “A structured review of partial least squares in supply chain management research”, Journal of Purchasing and Supply Management, Vol. 21 No. 4, pp. 259-272.
Khan, G.F., Sarstedt, M., Shiau, W.-L., Hair, J.F., Ringle, C.M. and Fritze, M. (2019), “Methodological research on partial least squares structural equation modeling (PLS-SEM): an analysis based on social network approaches”, Internet Research, Forthcoming.
Kock, N. and Hadaya, P. (2018), “Minimum sample size estimation in PLS-SEM: the inverse square root and gamma-exponential methods”, Information Systems Journal, Vol. 28 No. 1, pp. 227-261.
Latan, H. (2018), “PLS path modeling in hospitality and tourism research: the golden age and days of future Past”, in Ali, F., Rasoolimanesh, S.M. and Cobanoglu, C. (Eds), Applying Partial Least Squares in Tourism and Hospitality Research, Emerald, Bingley, pp. 53-84.
Lohmöller, J.-B. (1989), Latent Variable Path Modeling with Partial Least Squares, Physica, Heidelberg.
Marcoulides, G.A. and Chin, W.W. (2013), “You write, but others read: common methodological misunderstandings in PLS and related methods”, in Abdi, H., Chin, W.W., Esposito Vinzi, V., et al. (Eds), New Perspectives in Partial Least Squares and Related Methods, Springer, New York, NY, pp. 31-64.
Marcoulides, G.A., Chin, W.W. and Saunders, C. (2009), “Foreword: a critical look at partial least squares modeling”, MIS Quarterly, Vol. 33 No. 1, pp. 171-175.
Marcoulides, G.A., Chin, W.W. and Saunders, C. (2012), “When imprecise statistical statements become problematic: a response to Goodhue, Lewis, and Thompson”, MIS Quarterly, Vol. 36 No. 3, pp. 717-728.
Marcoulides, G.A. and Saunders, C. (2006), “PLS: a silver bullet?”, MIS Quarterly, Vol. 30 No. 2, pp. iii-ix.
Mason, C.H. and Perreault, W.D. (1991), “Collinearity, power, and interpretation of multiple regression analysis”, Journal of Marketing Research, Vol. 28 No. 3, pp. 268-280.
Mateos-Aparicio, G. (2011), “Partial least squares (PLS) methods: origins, evolution, and application to social sciences”, Communications in Statistics – Theory and Methods, Vol. 40 No. 13, pp. 2305-2317.
Matthews, L. (2017), “Applying Multi-group analysis in PLS-SEM: a step-by-step process”, in Latan, H. and Noonan, R. (Eds), Partial Least Squares Structural Equation Modeling: Basic Concepts, Methodological Issues and Applications, Springer, Heidelberg, pp. 219-243.
Matthews, L., Sarstedt, M., Hair, J.F. and Ringle, C.M. (2016), “Identifying and treating unobserved heterogeneity with FIMIX-PLS: part II – a case study”, European Business Review, Vol. 28 No. 2, pp. 208-224.
Monecke, A. and Leisch, F. (2012), “semPLS: structural equation modeling using partial least squares”, Journal of Statistical Software, Vol. 48 No. 3, pp. 1-32.
Nitzl, C. (2016), “The use of partial least squares structural equation modelling (PLS-SEM) in management accounting research: Directions for future theory development”, Journal of Accounting Literature, Vol. 37 No. December, pp. 19-35.
Nitzl, C., Roldán, J.L. and Cepeda, C.G. (2016), “Mediation analysis in partial least squares path modeling: Helping researchers discuss more sophisticated models”, Industrial Management and Data Systems, Vol. 116 No. 9, pp. 1849-1864.
Olsson, U.H., Foss, T., Troye, S.V. and Howell, R.D. (2000), “The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality”, Structural Equation Modeling: A Multidisciplinary Journal, Vol. 7 No. 4, pp. 557-595.
Park, S. and Gupta, S. (2012), “Handling endogenous regressors by joint estimation using copulas”, Marketing Science, Vol. 31 No. 4, pp. 567-586.
Peng, D.X. and Lai, F. (2012), “Using partial least squares in operations management research: a practical guideline and summary of past research”, Journal of Operations Management, Vol. 30 No. 6, pp. 467-480.
Petter, S. (2018), “‘Haters gonna hate’: PLS and information systems research”, ACM SIGMIS Database: The DATABASE for Advances in Information Systems, Vol. 49 No. 2, pp. 10-13.
Raithel, S., Sarstedt, M., Scharf, S. and Schwaiger, M. (2012), “On the value relevance of customer satisfaction. Multiple drivers and multiple markets”, Journal of the Academy of Marketing Science, Vol. 40 No. 4, pp. 509-525.
Ramayah, T., Cheah, J.-H., Chuah, F., Ting, H. and Memon, M.A. (2016), Partial Least Squares Structural Equation Modeling (PLS-SEM) Using SmartPLS 3.0: An Updated and Practical Guide to Statistical Analysis, Pearson, Singapore.
Ramsey, J.B. (1969), “Tests for specification errors in classical linear least-squares regression analysis”, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 31 No. 2, pp. 350-371.
Rasoolimanesh, S.M. and Ali, F. (2018), “Editorial: partial least squares (PLS) in hospitality and tourism research”, Journal of Hospitality and Tourism Technology, Vol. 9 No. 3, pp. 238-248.
Reinartz, W.J., Haenlein, M. and Henseler, J. (2009), “An empirical comparison of the efficacy of covariance-based and variance-based SEM”, International Journal of Research in Marketing, Vol. 26 No. 4, pp. 332-344.
Richter, N.F., Cepeda Carrión, G., Roldán, J.L. and Ringle, C.M. (2016), “European management research using partial least squares structural equation modeling (PLS-SEM): editorial”, European Management Journal, Vol. 34 No. 6, pp. 589-597.
Richter, N.F., Sinkovics, R.R., Ringle, C.M. and Schlägel, C.M. (2015), “A critical look at the use of SEM in international business research”, International Marketing Review, Vol. 33 No. 3, pp. 376-404.
Rigdon, E.E. (2012), “Rethinking partial least squares path modeling: in praise of simple methods”, Long Range Planning, Vol. 45 Nos 5/6, pp. 341-358.
Rigdon, E.E. (2013), “Partial least squares path modeling”, in Hancock, G.R. and Mueller, R.O. (Eds), Structural Equation Modeling. A Second Course, 2nd ed., Information Age Publishing, Charlotte, NC, pp. 81-116.
Rigdon, E.E. (2014a), “Comment on improper use of endogenous formative variables”, Journal of Business Research, Vol. 67 No. 1, pp. 2800-2802.
Rigdon, E.E. (2014b), “Rethinking partial least squares path modeling: breaking chains and forging ahead”, Long Range Planning, Vol. 47 No. 3, pp. 161-167.
Rigdon, E.E. (2016), “Choosing PLS path modeling as analytical method in European management research: a realist perspective”, European Management Journal, Vol. 34 No. 6, pp. 598-605.
Rigdon, E.E., Sarstedt, M. and Ringle, C.M. (2017), “On comparing results from CB-SEM and PLS-SEM. Five perspectives and five recommendations”, Marketing Zfp, Vol. 39 No. 3, pp. 4-16.
Ringle, C.M. and Sarstedt, M. (2016), “Gain more insight from your PLS-SEM results: the Importance-Performance map analysis”, Industrial Management and Data Systems, Vol. 116 No. 9, pp. 1865-1886.
Ringle, C.M., Sarstedt, M., Mitchell, R. and Gudergan, S.P. (2019), “Partial least squares structural equation modeling in HRM research”, The International Journal of Human Resource Management, Forthcoming.
Ringle, C.M., Sarstedt, M. and Mooi, E.A. (2010), “Response-based segmentation using finite mixture partial least squares: theoretical foundations and an application to American customer satisfaction index data”, Annals of Information Systems, Vol. 8, pp. 19-49.
Ringle, C.M., Sarstedt, M. and Straub, D.W. (2012), “A critical look at the use of PLS-SEM in MIS Quarterly”, MIS Quarterly, Vol. 36 No. 1, pp. iii-xiv.
Ringle, C.M., Wende, S. and Becker, J.-M. (2015), SmartPLS 3, SmartPLS, Bönningstedt.
Ringle, C.M., Wende, S. and Will, A. (2005), SmartPLS 2, SmartPLS, Hamburg.
Roldán, J.L. and Sánchez-Franco, M.J. (2012), “Variance-based structural equation modeling: guidelines for using partial least squares in information systems research”, in Mora, M., Gelman, O., Steenkamp, A.L., et al. (Eds), Research Methodologies, Innovations and Philosophies in Software Systems Engineering and Information Systems, IGI Global, Hershey, PA, pp. 193-221.
Sarstedt, M. and Mooi, E.A. (2019), A Concise Guide to Market Research: The Process, Data, and Methods Using IBM SPSS Statistics, Springer, Heidelberg.
Sarstedt, M., Ringle, C.M. and Hair, J.F. (2017a), “Partial least squares structural equation modeling”, in Homburg, C., Klarmann, M. and Vomberg, A. (Eds), Handbook of Market Research, Springer, Heidelberg.
Sarstedt, M., Ringle, C.M. and Hair, J.F. (2017b), “Treating unobserved heterogeneity in PLS-SEM: a multi-method approach”, in Noonan, R. and Latan, H. (Eds), Partial Least Squares Structural Equation Modeling: Basic Concepts, Methodological Issues and Applications, Springer International Publishing, Cham, pp. 197-217.
Sarstedt, M., Becker, J.-M., Ringle, C.M. and Schwaiger, M. (2011), “Uncovering and treating unobserved heterogeneity with FIMIX-PLS: which model selection criterion provides an appropriate number of segments?”, Schmalenbach Business Review, Vol. 63 No. 1, pp. 34-62.
Sarstedt, M., Bengart, P., Shaltoni, A.M. and Lehmann, S. (2018), “The use of sampling methods in advertising research: A gap between theory and practice”, International Journal of Advertising, Vol. 37 No. 4, pp. 650-663.
Sarstedt, M., Diamantopoulos, A., Salzberger, T. and Baumgartner, P. (2016a), “Selecting single items to measure doubly-concrete constructs: a cautionary tale”, Journal of Business Research, Vol. 69 No. 8, pp. 3159-3167.
Sarstedt, M., Ringle, C.M., Henseler, J. and Hair, J.F. (2014), “On the emancipation of PLS-SEM: a commentary on Rigdon (2012)”, Long Range Planning, Vol. 47 No. 3, pp. 154-160.
Sarstedt, M., Hair, J.F., Ringle, C.M., Thiele, K.O. and Gudergan, S.P. (2016b), “Estimation issues with PLS and CBSEM: where the bias lies!”, Journal of Business Research, Vol. 69 No. 10, pp. 3998-4010.
Sarstedt, M., Ringle, C.M., Cheah, J.-H., Ting, H., Moisescu, O.I. and Radomir, L. (2019), “Structural model robustness checks in PLS-SEM”, Tourism Economics, Forthcoming.
Schwarz, G. (1978), “Estimating the dimensions of a model”, The Annals of Statistics, Vol. 6 No. 2, pp. 461-464.
Sharma, P.N., Sarstedt, M., Shmueli, G., Kim, K.H. and Thiele, K.O. (2019a), “PLS-Based model selection: The role of alternative explanations in information systems research”, Journal of the Association for Information Systems, Forthcoming.
Sharma, P.N., Shmueli, G., Sarstedt, M., Danks, N. and Ray, S. (2019b), “Prediction-oriented model selection in partial least squares path modeling”, Decision Sciences, Forthcoming.
Shiau, W.-L., Sarstedt, M. and Hair, J.F. (2019), “Editorial: internet research using partial least squares structural equation modeling (PLS-SEM)”, Internet Research, Forthcoming.
Shmueli, G. (2010), “To explain or to predict?”, Statistical Science, Vol. 25 No. 3, pp. 289-310.
Shmueli, G. and Koppius, O.R. (2011), “Predictive analytics in information systems research”, MIS Quarterly, Vol. 35 No. 3, pp. 553-572.
Shmueli, G., Ray, S., Velasquez Estrada, J.M. and Shatla, S.B. (2016), “The elephant in the room: evaluating the predictive performance of PLS models”, Journal of Business Research, Vol. 69 No. 10, pp. 4552-4564.
Shmueli, G., Sarstedt, M., Hair, J.F., Cheah, J.-H., Ting, H., Vaithilingam, S. and Ringle, C.M. (2019), “Predictive model assessment in PLS-SEM: guidelines for using PLSpredict”, Working Paper.
Sosik, J.J., Kahai, S.S. and Piovoso, M.J. (2009), “Silver bullet or voodoo statistics? A primer for using the partial least squares data analytic technique in group and organization research”, Group and Organization Management, Vol. 34 No. 1, pp. 5-36.
Stone, M. (1974), “Cross-validatory choice and assessment of statistical predictions”, Journal of the Royal Statistical Society, Vol. 36 No. 2, pp. 111-147.
Svensson, G., Ferro, C., Høgevold, N., Padin, C., Sosa Varela, J.C. and Sarstedt, M. (2018), “Framing the triple bottom line approach: direct and mediation effects between economic, social and environmental elements”, Journal of Cleaner Production, Vol. 197, pp. 972-991.
Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M. and Lauro, C. (2005), “PLS path modeling”, Computational Statistics and Data Analysis, Vol. 48 No. 1, pp. 159-205.
Voorhees, C.M., Brady, M.K., Calantone, R. and Ramirez, E. (2016), “Discriminant validity testing in marketing: an analysis, causes for concern, and proposed remedies”, Journal of the Academy of Marketing Science, Vol. 44 No. 1, pp. 119-134.
Westland, J.C. (2015), “Partial least squares path analysis”, in Structural Equation Models: From Paths to Networks, Springer International Publishing, Cham, pp. 23-46.
Willaby, H.W., Costa, D.S.J., Burns, B.D., MacCann, C. and Roberts, R.D. (2015), “Testing complex models with small sample sizes: a historical overview and empirical demonstration of what partial least squares (PLS) can offer differential psychology”, Personality and Individual Differences, Vol. 84, pp. 73-78.
Wold, H.O.A. (1975), “Path models with latent variables: The NIPALS approach”, in Blalock, H.M., Aganbegian, A., Borodkin, F.M., et al. (Eds), Quantitative Sociology: International Perspectives on Mathematical and Statistical Modeling, Academic Press, New York, NY, pp. 307-357.
Wold, H.O.A. (1982), “Soft modeling: the basic design and some extensions”, in Jöreskog, K.G. and Wold, H.O.A. (Eds), Systems under Indirect Observations: Part II, North-Holland, Amsterdam, pp. 1-54.
Wold, H.O.A. (1985), “Partial least squares”, in Kotz, S. and Johnson, N.L. (Eds), Encyclopedia of Statistical Sciences, Wiley, New York, NY, pp. 581-591.