Article

A Tutorial on Assessing Statistical Power and Determining Sample Size for Structural Equation Models

Journal

PSYCHOLOGICAL METHODS
Volume 28, Issue 1, Pages 207-221

Publisher

AMER PSYCHOLOGICAL ASSOC
DOI: 10.1037/met0000423

Keywords

structural equation modeling; statistical power; sample size planning; goodness of fit


Structural equation modeling (SEM) is commonly used in psychology and other social sciences to test empirical hypotheses. However, most studies involving SEM neither conduct a power analysis for sample size planning nor evaluate the achieved power of the performed tests. This tutorial provides step-by-step instructions for a priori, post hoc, and compromise power analyses across a range of SEM applications. It emphasizes the importance of thoughtful sample size planning to ensure reliable and replicable results, particularly when small or medium-sized effects are expected.

Translational Abstract

Structural equation modeling (SEM) is a widespread approach to testing substantive hypotheses in psychology and other social sciences. Whenever hypothesis tests are performed, researchers should ensure that the sample size is sufficiently large to detect the hypothesized effect. Power analyses can be used to determine the sample size required to identify the effect of interest with a desired level of statistical power (i.e., the probability of rejecting an incorrect null hypothesis). Conversely, power analyses can also be used to determine the achieved power of a test, given an effect and a particular sample size. However, most studies involving SEM neither conduct a power analysis to inform sample size planning nor evaluate the achieved power of the performed tests. In this tutorial, we show and illustrate how power analyses can be used to identify the sample size required to detect a certain effect of interest, or to determine the probability that a conducted test detects a certain effect. These analyses are exemplified for the overall model as well as for individual model parameters, considering both models referring to a single group and models assessing differences between multiple groups.

Abstract

Structural equation modeling (SEM) is a widespread approach to testing substantive hypotheses in psychology and other social sciences. However, most studies involving structural equation models neither report a statistical power analysis as a criterion for sample size planning nor evaluate the achieved power of the performed tests. In this tutorial, we provide a step-by-step illustration of how a priori, post hoc, and compromise power analyses can be conducted for a range of different SEM applications. Using illustrative examples and the R package semPower, we demonstrate power analyses for hypotheses regarding overall model fit, global model comparisons, particular individual model parameters, and differences in multigroup contexts (such as in tests of measurement invariance). We encourage researchers to pursue reliable (and thus more replicable) results based on thoughtful sample size planning, especially if small or medium-sized effects are expected.
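To make the distinction between a priori and post hoc power analysis concrete, the sketch below implements the standard RMSEA-based noncentral chi-square approach to power for the test of exact overall model fit (MacCallum, Browne, & Sugawara, 1996), which underlies this style of SEM power analysis. It is a minimal Python illustration of that logic, not the semPower implementation; the function names `posthoc_power` and `apriori_n` are ours.

```python
# Power for the SEM chi-square test of exact overall fit, based on the
# RMSEA-implied noncentrality parameter (MacCallum, Browne, & Sugawara, 1996).
# Illustrative sketch only; semPower offers a full-featured R implementation.
from scipy.stats import chi2, ncx2

def posthoc_power(rmsea, df, n, alpha=0.05):
    """Post hoc: achieved power to reject exact fit, given true RMSEA, df, and n."""
    ncp = (n - 1) * df * rmsea ** 2       # noncentrality implied by the misfit
    crit = chi2.ppf(1 - alpha, df)        # critical value under H0: exact fit
    return 1 - ncx2.cdf(crit, df, ncp)    # P(reject H0 | true RMSEA)

def apriori_n(rmsea, df, power=0.80, alpha=0.05):
    """A priori: smallest n that reaches the target power for the given misfit."""
    n = df + 2                            # start from a minimally sized sample
    while posthoc_power(rmsea, df, n, alpha) < power:
        n += 1
    return n

# E.g., to detect misfit of RMSEA = .05 in a model with df = 100,
# apriori_n(0.05, 100) returns the required sample size, while
# posthoc_power(0.05, 100, 200) gives the achieved power for a fixed n of 200.
```

A compromise analysis follows the same machinery: instead of fixing alpha and solving for n or power, one fixes n and the ratio of alpha to beta and solves for the critical value.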

Authors



Recommended

Article Mathematics, Interdisciplinary Applications

A comparison of correlation and regression approaches for multinomial processing tree models

Lisa J. Jobst, Daniel W. Heck, Morten Moshagen

JOURNAL OF MATHEMATICAL PSYCHOLOGY (2020)

Article Psychology, Educational

The Effect of Latent and Error Non-Normality on Measures of Fit in Structural Equation Modeling

Lisa J. Jobst, Max Auerswald, Morten Moshagen

Summary: A study was conducted to investigate the effects of non-normality in structural equation modeling by manipulating the multivariate distribution in a Monte Carlo simulation. Results showed that all measures of fit were influenced by the source of non-normality, but with varying patterns depending on the estimation methods used.

EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT (2022)

Article Mathematics, Interdisciplinary Applications

Effects of Multivariate Non-Normality and Missing Data on the Root Mean Square Error of Approximation

Lisa J. Jobst, Christoph Heine, Max Auerswald, Morten Moshagen

Summary: The study analyzed the impact of different corrections on the root mean square error of approximation in structural equation modeling, finding that corrected values exhibit stronger bias under non-normality and that the extent of bias is also influenced by properties of the multivariate distribution.

STRUCTURAL EQUATION MODELING-A MULTIDISCIPLINARY JOURNAL (2021)

Article Psychology, Clinical

Themes of the Dark Core of Personality

Martina Bader, Johanna Hartung, Benjamin E. Hilbig, Ingo Zettler, Morten Moshagen, Oliver Wilhelm

Summary: This study examines the internal structure of the Dark Factor of Personality (D), finding that D consists of five specific factors - Callousness, Deceitfulness, Narcissistic Entitlement, Sadism, and Vindictiveness - which best describe the internal structure of D and its relation to aversive traits.

PSYCHOLOGICAL ASSESSMENT (2021)

Article Psychology, Clinical

Measuring the Dark Core of Personality in German: Psychometric Properties, Measurement Invariance, Predictive Validity, and Self-Other Agreement

Martina Bader, Luisa K. Horsten, Benjamin E. Hilbig, Ingo Zettler, Morten Moshagen

Summary: This study comprehensively evaluated the German version of D70 and its shorter versions, confirming their reliability and validity for psychometric assessment, with moderate self-other agreement.

JOURNAL OF PERSONALITY ASSESSMENT (2022)

Article Mathematics, Interdisciplinary Applications

Sample Size Requirements for Bifactor Models

Martina Bader, Lisa J. Jobst, Morten Moshagen

Summary: Despite the widespread application of bifactor models, little research has considered the sample sizes required for this type of model. In this study, we illustrate how to determine sample size requirements for bifactor models using Monte Carlo simulations in R. Results show that a sample size of 500 is often sufficient, but exact requirements depend on various model characteristics.

STRUCTURAL EQUATION MODELING-A MULTIDISCIPLINARY JOURNAL (2022)

Article Psychology, Multidisciplinary

Assessing the Fitting Propensity of Factor Models

Martina Bader, Morten Moshagen

Summary: Model selection is a common issue in structural equation modeling (SEM), and a trade-off between goodness-of-fit and model parsimony is often sought. This investigation assessed the fitting propensity of frequently used SEM models and evaluated the performance of fit indices and information criteria in controlling for fitting propensity. The results showed that differences in fitting propensity were mostly driven by the number of free parameters, and fit indices adjusting for the number of parameters adequately accounted for these differences.

PSYCHOLOGICAL METHODS (2022)

Article Psychology, Social

Rethinking aversive personality: Decomposing the Dark Triad traits into their common core and unique flavors

Martina Bader, Benjamin E. Hilbig, Ingo Zettler, Morten Moshagen

Summary: This study proposes a new theoretical view that conceptualizes the Dark Triad traits as specific manifestations of the common core of aversive traits, flavored by unique, essentially non-aversive characteristics. The empirical findings from two studies support this view and reveal a discrepancy between the current conceptualization and empirical structure of the Dark Triad traits.

JOURNAL OF PERSONALITY (2023)

Article Psychology, Clinical

Disentangling the Effects of Culture and Language on Measurement Noninvariance in Cross-Cultural Research: The Culture, Comprehension, and Translation Bias (CCT) Procedure

Martina Bader, Lisa J. Jobst, Ingo Zettler, Benjamin E. Hilbig, Morten Moshagen

Summary: Comparability of measurement across different cultural groups is crucial for cross-cultural assessment, but achieving cross-cultural measurement invariance can be challenging. Noninvariance in measurement may stem from translation bias, culture bias, or comprehension bias, depending on the language version used. The Culture, Comprehension, and Translation Bias (CCT) procedure outlines a method to distinguish these sources of item noninvariance and improve the accuracy of cross-cultural assessment through multiple pairwise comparisons.

PSYCHOLOGICAL ASSESSMENT (2021)
