Basic CFA Model

We begin with a simple example of confirmatory factor analysis (CFA), using the minorbsem() function to fit the model. The minorbsem package contains a built-in dataset called HS, which is part of the classic Holzinger-Swineford dataset. This dataset is used in many papers and books on CFA, and consists of mental ability test scores of seventh- and eighth-grade children from two different schools (Pasteur and Grant-White). In our version of the dataset (obtained from lavaan; Rosseel 2012), only 9 of the original 26 tests are included.

We begin by loading the package:

library(minorbsem)

## 
## ###############################################################################
## This is minorbsem 0.2.14
## 
## All users of R (or SEM) are invited to report bugs, submit functions or ideas
## for functions. An efficient way to do this is to open an issue on GitHub
## https://github.com/jamesuanhoro/minorbsem/issues/.
## ###############################################################################

The first six rows of the dataset:

head(HS)
##   id sex ageyr agemo  school grade       x1   x2    x3       x4   x5        x6       x7   x8
## 1  1   1    13     1 Pasteur     7 3.333333 7.75 0.375 2.333333 5.75 1.2857143 3.391304 5.75
## 2  2   2    13     7 Pasteur     7 5.333333 5.25 2.125 1.666667 3.00 1.2857143 3.782609 6.25
## 3  3   2    13     1 Pasteur     7 4.500000 5.25 1.875 1.000000 1.75 0.4285714 3.260870 3.90
## 4  4   1    13     2 Pasteur     7 5.333333 7.75 3.000 2.666667 4.50 2.4285714 3.000000 5.30
## 5  5   2    12     2 Pasteur     7 4.833333 4.75 0.875 2.666667 4.00 2.5714286 3.695652 6.30
## 6  6   2    14     1 Pasteur     7 5.333333 5.00 2.250 1.000000 3.00 0.8571429 4.347826 6.65
##         x9
## 1 6.361111
## 2 7.916667
## 3 4.416667
## 4 4.861111
## 5 5.916667
## 6 7.500000

Data preparation

The scale of data is important for setting priors on model parameters. The default priors for models fit with minorbsem are reasonable when variables have standard deviations close to 1. For this reason, we first check the standard deviations of the relevant variables for this analysis:

item_data <- HS[, paste0("x", 1:9)] # select columns x1:x9
head(item_data) # show first six rows
##         x1   x2    x3       x4   x5        x6       x7   x8       x9
## 1 3.333333 7.75 0.375 2.333333 5.75 1.2857143 3.391304 5.75 6.361111
## 2 5.333333 5.25 2.125 1.666667 3.00 1.2857143 3.782609 6.25 7.916667
## 3 4.500000 5.25 1.875 1.000000 1.75 0.4285714 3.260870 3.90 4.416667
## 4 5.333333 7.75 3.000 2.666667 4.50 2.4285714 3.000000 5.30 4.861111
## 5 4.833333 4.75 0.875 2.666667 4.00 2.5714286 3.695652 6.30 5.916667
## 6 5.333333 5.00 2.250 1.000000 3.00 0.8571429 4.347826 6.65 7.500000
apply(item_data, 2, sd) # compute SD of each variable
##       x1       x2       x3       x4       x5       x6       x7       x8       x9 
## 1.167432 1.177451 1.130979 1.164116 1.290472 1.095603 1.089534 1.012615 1.009152

All variables have standard deviations close to 1, so we can move forward with the data as they are. Otherwise, we would recommend re-scaling the variables.1
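Had the standard deviations been far from 1, a simple rescaling step using only base R could look like the following sketch (HS_scaled is a hypothetical name introduced here for illustration):

```r
# Hypothetical rescaling (not needed for these data): divide each item by its
# sample SD so every indicator has SD = 1 before fitting.
item_data <- HS[, paste0("x", 1:9)]
HS_scaled <- HS
HS_scaled[, paste0("x", 1:9)] <- sweep(item_data, 2, apply(item_data, 2, sd), "/")
apply(HS_scaled[, paste0("x", 1:9)], 2, sd) # all exactly 1
```

Rescaled data like HS_scaled could then be passed to minorbsem() in place of HS.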

Model syntax

The model syntax is lavaan-style:

syntax_basic <- "
Visual =~ x1 + x2 + x3
Verbal =~ x4 + x5 + x6
Speed =~ x7 + x8 + x9"

This model assumes three latent variables (or factors): visual, verbal, and speed. The visual factor has indicators x1, x2, and x3; the verbal factor has indicators x4, x5, and x6; and the speed factor has indicators x7, x8, and x9.

Fit the model

We run the analysis using the minorbsem() function. By default, the function assumes that minor factors influence the covariance between the variables. minorbsem then prints out the iterations from Stan – we show these iterations once so the reader knows what to expect.

fit_cfa <- minorbsem(model = syntax_basic, data = HS)
## Processing user input ...
## User input fully processed :)
##  Now to modeling.
## Fitting Stan model ...
## Init values were only set for a subset of parameters. 
## Missing init values for the following parameters:
##  - chain 1: gdp_alpha, loadings, res_sds_u, phi_mat_chol, res_cor_01, coefs, sigma_loadings_complex, gdp_loadings_complex, Sigma
##  - chain 2: gdp_alpha, loadings, res_sds_u, phi_mat_chol, res_cor_01, coefs, sigma_loadings_complex, gdp_loadings_complex, Sigma
##  - chain 3: gdp_alpha, loadings, res_sds_u, phi_mat_chol, res_cor_01, coefs, sigma_loadings_complex, gdp_loadings_complex, Sigma
## Running MCMC with 3 chains, at most 2 in parallel...
## 
## Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
## Chain 1 Exception: lkj_corr_cholesky_lpdf: Random variable[2] is 0, but must be positive! (in '/tmp/RtmpO4Ubya/model-22a964aae8b9.stan', line 324, column 2 to column 48)
## Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
## Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
## Chain 1
## Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
## Chain 1 Exception: cholesky_decompose: A is not symmetric. A[4,7] = inf, but A[7,4] = inf (in '/tmp/RtmpO4Ubya/model-22a964aae8b9.stan', line 432, column 6 to column 81)
## Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
## Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
## Chain 1
## Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
## Chain 1 Exception: cholesky_decompose: Matrix m is not positive definite (in '/tmp/RtmpO4Ubya/model-22a964aae8b9.stan', line 432, column 6 to column 81)
## Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
## Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
## Chain 1
## Chain 2 Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1 Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2 Iteration:  200 / 2000 [ 10%]  (Warmup) 
## Chain 1 Iteration:  400 / 2000 [ 20%]  (Warmup) 
## Chain 2 Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1 Iteration:  600 / 2000 [ 30%]  (Warmup) 
## Chain 1 Iteration:  800 / 2000 [ 40%]  (Warmup) 
## Chain 2 Iteration:  600 / 2000 [ 30%]  (Warmup) 
## Chain 2 Iteration:  800 / 2000 [ 40%]  (Warmup) 
## Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
## Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
## Chain 1 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
## Chain 2 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
## Chain 2 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
## Chain 2 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
## Chain 1 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
## Chain 2 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
## Chain 1 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
## Chain 2 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
## Chain 1 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
## Chain 2 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
## Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling) 
## Chain 1 finished in 2.8 seconds.
## Chain 2 Iteration: 2000 / 2000 [100%]  (Sampling) 
## Chain 3 Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2 finished in 3.0 seconds.
## Chain 3 Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3 Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3 Iteration:  600 / 2000 [ 30%]  (Warmup) 
## Chain 3 Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
## Chain 3 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
## Chain 3 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
## Chain 3 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
## Chain 3 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
## Chain 3 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
## Chain 3 Iteration: 2000 / 2000 [100%]  (Sampling) 
## Chain 3 finished in 3.5 seconds.
## 
## All 3 chains finished successfully.
## Mean chain execution time: 3.1 seconds.
## Total execution time: 6.5 seconds.
## 
##                     Parameter estimates (method = normal, sample size(s) = 301)                     
##              from     op   to        mean      sd   5.000%   95.000%    rhat   ess_bulk  
##            ──────────────────────────────────────────────────────────────────────────────
##              Goodness of fit                                                             
##            ──────────────────────────────────────────────────────────────────────────────
##              PPP                    0.417                              1.001       1668  
##              RMSE                   0.064   0.014    0.043     0.090   1.002        711  
##            ──────────────────────────────────────────────────────────────────────────────
##              Factor loadings                                                             
##            ──────────────────────────────────────────────────────────────────────────────
##              Visual   =~   x1       0.931   0.126    0.732     1.146   1.002       1121  
##              Visual   =~   x2       0.472   0.106    0.300     0.647   1.004       1797  
##              Visual   =~   x3       0.623   0.104    0.456     0.795   1.003       1500  
##              Verbal   =~   x4       0.991   0.088    0.846     1.139   1.001       1395  
##              Verbal   =~   x5       1.066   0.102    0.901     1.238   1.000       1948  
##              Verbal   =~   x6       0.935   0.090    0.788     1.088   1.001       1588  
##              Speed    =~   x7       0.560   0.102    0.396     0.728   1.001       1815  
##              Speed    =~   x8       0.682   0.109    0.510     0.866   1.001       1486  
##              Speed    =~   x9       0.786   0.116    0.596     0.988   1.001       1246  
##            ──────────────────────────────────────────────────────────────────────────────
##              Inter-factor correlations                                                   
##            ──────────────────────────────────────────────────────────────────────────────
##              Verbal   ~~   Visual   0.428   0.082    0.295     0.568   1.000       1990  
##              Speed    ~~   Visual   0.469   0.096    0.307     0.623   1.001       1812  
##              Speed    ~~   Verbal   0.274   0.078    0.144     0.399   1.000       2406  
##            ──────────────────────────────────────────────────────────────────────────────
##              Residual variances                                                          
##            ──────────────────────────────────────────────────────────────────────────────
##              x1       ~~   x1       0.476   0.219    0.050     0.809   1.005       1026  
##              x2       ~~   x2       1.158   0.124    0.958     1.365   1.001       2456  
##              x3       ~~   x3       0.881   0.127    0.668     1.086   1.004       1460  
##              x4       ~~   x4       0.363   0.145    0.105     0.591   1.001       1096  
##              x5       ~~   x5       0.512   0.180    0.187     0.790   1.000       1462  
##              x6       ~~   x6       0.322   0.138    0.062     0.534   1.003       1278  
##              x7       ~~   x7       0.872   0.114    0.687     1.054   1.001       1954  
##              x8       ~~   x8       0.568   0.141    0.327     0.784   1.001       1352  
##              x9       ~~   x9       0.407   0.168    0.082     0.660   1.001       1103  
##            ──────────────────────────────────────────────────────────────────────────────
##                                                                                          
## 
## Column names: from, op, to, mean, sd, 5%, 95%, rhat, ess_bulk

Output structure

At the top of the results table, method = normal indicates the approach to estimating the residual covariances between all items: the assumption is that the standardized residual covariances (SRCs), which reflect minor-factor influences, are normally distributed with mean zero. The table also prints the sample size of 301 – only complete rows are retained for analysis.

We now describe the column headers. The from, op and to combination describes the type of parameter being reported, following lavaan-style syntax. For example, the Visual =~ x1 row describes the loading from the visual factor to item x1. The mean, sd and percentage columns are descriptive statistics of the posterior distributions. The mean and sd function like the estimate and standard error in standard frequentist statistics. The percentage columns are credible intervals; by default, they are 90% credible intervals, i.e. given the prior and data, there is a 90% chance the parameter falls in this interval. The rhat (pronounced R-hat) and ess_bulk columns are the potential scale reduction factor (\(\widehat{R}\)) and bulk effective sample size (ESS) respectively (Vehtari et al. 2021) – they are useful for checking parameter convergence. For \(\widehat{R}\), values very close to 1 are preferable; for ESS, larger values are preferable. A final analysis in a manuscript would ideally have all parameters with \(\widehat{R}\) under 1.01 and ESS above 400 before one can be confident the parameter estimates have converged (Vehtari et al. 2021). An easy way to meet these expectations is to increase the number of requested samples when calling minorbsem() via the warmup = and sampling = arguments; see ?minorbsem.
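For instance, if some parameters showed \(\widehat{R}\) above 1.01 or ESS below 400, a refit with more iterations could look like the sketch below (the specific counts are illustrative, not a recommendation):

```r
# Sketch: raise the per-chain warmup and sampling iteration counts.
# 2000 each is an illustrative choice; see ?minorbsem for defaults.
fit_cfa_long <- minorbsem(
  model = syntax_basic, data = HS,
  warmup = 2000, sampling = 2000
)
```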

The parameter estimates are presented by the type of parameter.

Goodness of fit

PPP. The first section of results contains parameters that help assess global model fit. “PPP” is the posterior predictive p-value in the form described by Muthén and Asparouhov (2012), and is akin to a \(\chi^2\) test in standard SEMs. By convention, values under .05 or .10 signal poor fit. Here, PPP = .417, indicating a good-fitting model. Desirable PPP-values are to be expected by default in minorbsem since the package accounts for model misspecification – alternatively stated: a PPP-value above .05 does not imply an absence of misfit and is not all that informative by default. We report PPP since minorbsem is also able to fit Bayesian SEMs that do not account for misspecification, e.g. minorbsem(..., method = "none").

RMSE. This is the root mean square error of the standardized residual covariances (SRCs) and communicates the typical size of SRCs. One may also interpret this metric as the standard deviation of the SRCs, with 95% of SRCs lying within 2 RMSE values of 0. In this example, RMSE = 0.064, so we can expect some SRCs to be greater than 0.10, suggesting some large SRCs (Maydeu-Olivares 2017). Large SRCs challenge the notion that model misspecification is due to the influence of minor factors – if these influences are large, are these factors “minor”? It is possible that the hypothesized structure is incorrect, or that minor factors have non-trivial effects.
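To make the 2 × RMSE heuristic concrete with the estimate from the table above:

```r
# RMSE = 0.064 (from the results table); roughly 95% of SRCs lie within
# +/- 2 * RMSE of zero:
2 * 0.064 # 0.128, above the 0.10 rule of thumb for large SRCs
```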

Substantive parameters

The parameter estimates are reported by type of parameter: factor loadings, inter-factor correlations, and residual variances. For this model, all items load on their respective factors with intervals that clearly exclude 0. All factors are assumed standardized in minorbsem, so only their correlations are reported; and all factors are non-trivially correlated.

Residual plots

Given that the RMSE suggests large standardized residual covariances (SRCs), we can request a plot of SRCs using two options: a range-plot and a heat-map.

plot_residuals(fit_cfa, type = "range")

plot_residuals(fit_cfa, type = "matrix")

The heat-map is particularly useful for highlighting the largest SRCs. If these SRCs cluster in a non-random way, one may identify potential model modifications.
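As an illustration, suppose the heat-map showed a large SRC between x7 and x8 (a hypothetical pair chosen only for this sketch); a correlated-residual modification could then be written lavaan-style and the model refit:

```r
# Hypothetical modification: allow x7 and x8 a residual correlation
# (the ~~ operator), appended to the original syntax.
syntax_mod <- paste0(syntax_basic, "\nx7 ~~ x8")
fit_mod <- minorbsem(model = syntax_mod, data = HS)
```

Whether such a modification is warranted should rest on theory, not on the residuals alone.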

Bifactor model with orthogonal factors

To improve on the basic model, we consider the bifactor structure, which builds on the basic CFA model by specifying a general factor ‘G’ that is reflected in all nine indicators. All factors are assumed orthogonal.

Model syntax

syntax_bifactor <- paste0(
  "G =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9", "\n",
  syntax_basic
)
writeLines(syntax_bifactor)
## G =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
## 
## Visual =~ x1 + x2 + x3
## Verbal =~ x4 + x5 + x6
## Speed =~ x7 + x8 + x9

Fit the model

The call to minorbsem() needs to be of the form: minorbsem(..., orthogonal = TRUE) to ensure the factors are orthogonal:

fit_bifactor <- minorbsem(syntax_bifactor, data = HS, orthogonal = TRUE)
##                     Parameter estimates (method = normal, sample size(s) = 301)                     
##             from     op   to         mean      sd   5.000%   95.000%    rhat   ess_bulk  
##           ───────────────────────────────────────────────────────────────────────────────
##             Goodness of fit                                                              
##           ───────────────────────────────────────────────────────────────────────────────
##             PPP                     0.403                              1.000       1890  
##             RMSE                    0.027   0.012    0.009     0.049   1.003        604  
##           ───────────────────────────────────────────────────────────────────────────────
##             Factor loadings                                                              
##           ───────────────────────────────────────────────────────────────────────────────
##             G        =~   x1        0.952   0.110    0.765     1.133   1.002       1588  
##             G        =~   x2        0.494   0.105    0.327     0.668   1.001       2061  
##             G        =~   x3        0.630   0.095    0.475     0.781   1.000       1842  
##             G        =~   x4        0.471   0.084    0.340     0.612   1.001       1756  
##             G        =~   x5        0.425   0.095    0.269     0.583   0.999       1863  
##             G        =~   x6        0.460   0.080    0.331     0.596   1.002       1716  
##             G        =~   x7        0.107   0.083   -0.028     0.247   1.001       2525  
##             G        =~   x8        0.288   0.075    0.168     0.414   1.000       2332  
##             G        =~   x9        0.509   0.077    0.389     0.642   1.000       2072  
##             Visual   =~   x1        0.225   0.179    0.018     0.591   1.000        997  
##             Visual   =~   x2       -0.157   0.516   -0.991     0.773   1.002       1581  
##             Visual   =~   x3       -0.017   0.454   -0.736     0.876   1.003       1546  
##             Verbal   =~   x4        0.860   0.075    0.735     0.980   1.001       1923  
##             Verbal   =~   x5        1.051   0.086    0.907     1.190   1.002       1341  
##             Verbal   =~   x6        0.788   0.070    0.673     0.903   1.000       2032  
##             Speed    =~   x7        0.731   0.118    0.545     0.939   1.002       1518  
##             Speed    =~   x8        0.708   0.105    0.534     0.890   1.002       1434  
##             Speed    =~   x9        0.435   0.079    0.306     0.562   1.002       2381  
##           ───────────────────────────────────────────────────────────────────────────────
##             Inter-factor correlations                                                    
##           ───────────────────────────────────────────────────────────────────────────────
##             Visual   ~~   G         0.000   0.000    0.000     0.000                     
##             Verbal   ~~   G         0.000   0.000    0.000     0.000                     
##             Verbal   ~~   Visual    0.000   0.000    0.000     0.000                     
##             Speed    ~~   G         0.000   0.000    0.000     0.000                     
##             Speed    ~~   Visual    0.000   0.000    0.000     0.000                     
##             Speed    ~~   Verbal    0.000   0.000    0.000     0.000                     
##           ───────────────────────────────────────────────────────────────────────────────
##             Residual variances                                                           
##           ───────────────────────────────────────────────────────────────────────────────
##             x1       ~~   x1        0.395   0.219    0.017     0.726   1.001        790  
##             x2       ~~   x2        0.870   0.357    0.054     1.265   1.002        881  
##             x3       ~~   x3        0.696   0.264    0.083     0.994   1.002        851  
##             x4       ~~   x4        0.392   0.085    0.250     0.530   1.001       1892  
##             x5       ~~   x5        0.385   0.130    0.147     0.584   1.002        921  
##             x6       ~~   x6        0.376   0.074    0.254     0.494   1.000       2027  
##             x7       ~~   x7        0.639   0.159    0.347     0.865   1.004       1367  
##             x8       ~~   x8        0.449   0.131    0.207     0.645   1.003       1368  
##             x9       ~~   x9        0.577   0.075    0.450     0.700   1.003       2451  
##           ───────────────────────────────────────────────────────────────────────────────
##                                                                                          
## 
## Column names: from, op, to, mean, sd, 5%, 95%, rhat, ess_bulk

Compared to the basic CFA model, the RMSE drops from .063 to .028, suggesting a much better-fitting model. All items load on the general factor with 90% intervals excluding 0, except x7. However, x1 – x3 load weakly and imprecisely on their specific factor, while x4 – x9 load strongly on their specific factors, especially when compared to their general-factor loadings. This pattern of factor loadings provides little support for a bifactor structure.

Bifactor model with parameter constraints

We instead explore a more constrained bifactor structure in which the specific-factor loadings are constrained to be equal within each specific factor. Note that minorbsem uses the same parameter-constraint syntax as lavaan:

Model syntax

syntax_bifactor_cons <- paste0(
  "G =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9", "\n",
  "Visual =~ a * x1 + a * x2 + a * x3
  Verbal =~ b * x4 + b * x5 + b * x6
  Speed =~ c * x7 + c * x8 + c * x9"
)

Fit the model

fit_bifactor_cons <- minorbsem(
  syntax_bifactor_cons,
  data = HS, orthogonal = TRUE
)
##                     Parameter estimates (method = normal, sample size(s) = 301)                     
##              from     op   to        mean      sd   5.000%   95.000%    rhat   ess_bulk  
##            ──────────────────────────────────────────────────────────────────────────────
##              Goodness of fit                                                             
##            ──────────────────────────────────────────────────────────────────────────────
##              PPP                    0.397                              0.999       2216  
##              RMSE                   0.047   0.012    0.029     0.068   1.003       1038  
##            ──────────────────────────────────────────────────────────────────────────────
##              Factor loadings                                                             
##            ──────────────────────────────────────────────────────────────────────────────
##              G        =~   x1       0.864   0.113    0.674     1.049   1.001       1604  
##              G        =~   x2       0.433   0.117    0.229     0.617   1.001       1880  
##              G        =~   x3       0.575   0.114    0.385     0.761   1.000       1615  
##              G        =~   x4       0.499   0.101    0.338     0.669   1.001       1812  
##              G        =~   x5       0.496   0.114    0.319     0.687   1.002       1763  
##              G        =~   x6       0.474   0.095    0.322     0.633   1.001       1657  
##              G        =~   x7       0.132   0.097   -0.024     0.295   1.000       1943  
##              G        =~   x8       0.288   0.086    0.151     0.432   1.001       2261  
##              G        =~   x9       0.517   0.086    0.378     0.660   1.000       2507  
##              Visual   =~   x1       0.280   0.148    0.032     0.512   1.000       1606  
##              Visual   =~   x2       0.280   0.148    0.032     0.512   1.000       1606  
##              Visual   =~   x3       0.280   0.148    0.032     0.512   1.000       1606  
##              Verbal   =~   x4       0.864   0.061    0.764     0.961   1.002       1534  
##              Verbal   =~   x5       0.864   0.061    0.764     0.961   1.002       1534  
##              Verbal   =~   x6       0.864   0.061    0.764     0.961   1.002       1534  
##              Speed    =~   x7       0.605   0.054    0.515     0.689   1.000       2020  
##              Speed    =~   x8       0.605   0.054    0.515     0.689   1.000       2020  
##              Speed    =~   x9       0.605   0.054    0.515     0.689   1.000       2020  
##            ──────────────────────────────────────────────────────────────────────────────
##              Inter-factor correlations                                                   
##            ──────────────────────────────────────────────────────────────────────────────
##              Visual   ~~   G        0.000   0.000    0.000     0.000                     
##              Verbal   ~~   G        0.000   0.000    0.000     0.000                     
##              Verbal   ~~   Visual   0.000   0.000    0.000     0.000                     
##              Speed    ~~   G        0.000   0.000    0.000     0.000                     
##              Speed    ~~   Visual   0.000   0.000    0.000     0.000                     
##              Speed    ~~   Verbal   0.000   0.000    0.000     0.000                     
##            ──────────────────────────────────────────────────────────────────────────────
##              Residual variances                                                          
##            ──────────────────────────────────────────────────────────────────────────────
##              x1       ~~   x1       0.531   0.157    0.262     0.774   1.002       1539  
##              x2       ~~   x2       1.103   0.117    0.917     1.297   1.001       3244  
##              x3       ~~   x3       0.849   0.109    0.665     1.026   1.000       2363  
##              x4       ~~   x4       0.356   0.082    0.226     0.490   1.001       2262  
##              x5       ~~   x5       0.618   0.091    0.474     0.771   1.001       2220  
##              x6       ~~   x6       0.255   0.077    0.129     0.383   1.003       1883  
##              x7       ~~   x7       0.777   0.086    0.645     0.925   1.001       3312  
##              x8       ~~   x8       0.566   0.073    0.449     0.689   1.001       3029  
##              x9       ~~   x9       0.427   0.085    0.286     0.563   1.000       2152  
##            ──────────────────────────────────────────────────────────────────────────────
##                                                                                          
## 
## Column names: from, op, to, mean, sd, 5%, 95%, rhat, ess_bulk

The RMSE increased since the model is more constrained. The pattern of results with the parameter constraints imposed suggests that the general factor mostly reflects items x1 – x3, with the other items more strongly reflecting their specific factors. These results again point to limited applicability of the bifactor model for these data.

Non-Simple Structure Model

For our final model, we return to the original basic CFA and relax simple structure. Unlike Muthén and Asparouhov (2012), who do this using small-variance priors, minorbsem does this using a global-local prior (Uanhoro 2024). More precisely, this approach assumes that most cross-loadings are indeed zero, while allowing for a few outlying non-zero cross-loadings.
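As a rough illustration of the global-local idea (a sketch of the general principle, not minorbsem's exact prior), cross-loadings can be drawn as normal deviates with one shared global scale and heavy-tailed local scales, so most draws sit near zero while a few escape the shrinkage:

```r
set.seed(1)
n_cross <- 18               # cross-loadings in a 3-factor, 9-item simple-structure model
global <- 0.05              # shared (global) scale: shrinks every cross-loading toward 0
local <- abs(rcauchy(n_cross))  # heavy-tailed (local) scales: let a few loadings escape
lambda_cross <- rnorm(n_cross, mean = 0, sd = global * local)
round(lambda_cross, 2)
```

With a half-Cauchy local scale, most of these draws are tiny but the occasional draw is far from zero, which mirrors the assumption that only a handful of cross-loadings are truly non-zero.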

Fit the model

The call to minorbsem() needs to be of the form: minorbsem(..., simple_struc = FALSE)

fit_non_simple <- minorbsem(syntax_basic, data = HS, simple_struc = FALSE)
##                     Parameter estimates (method = normal, sample size(s) = 301)                     
##             from     op   to         mean      sd   5.000%   95.000%    rhat   ess_bulk  
##           ───────────────────────────────────────────────────────────────────────────────
##             Goodness of fit                                                              
##           ───────────────────────────────────────────────────────────────────────────────
##             PPP                     0.452                              1.000       2189  
##             RMSE                    0.024   0.012    0.006     0.047   1.006        433  
##           ───────────────────────────────────────────────────────────────────────────────
##             Factor loadings                                                              
##           ───────────────────────────────────────────────────────────────────────────────
##             Visual   =~   x1        0.761   0.119    0.573     0.963   1.001        826  
##             Visual   =~   x2        0.559   0.101    0.392     0.726   1.000       1938  
##             Visual   =~   x3        0.754   0.110    0.580     0.937   1.000       1394  
##             Visual   =~   x4        0.025   0.070   -0.078     0.154   1.004       1456  
##             Visual   =~   x5       -0.059   0.087   -0.216     0.062   1.004       1509  
##             Visual   =~   x6        0.057   0.074   -0.042     0.188   1.003       1184  
##             Visual   =~   x7       -0.139   0.145   -0.420     0.028   1.002        956  
##             Visual   =~   x8        0.044   0.108   -0.118     0.247   1.003       1074  
##             Visual   =~   x9        0.317   0.115    0.125     0.504   1.001       1137  
##             Verbal   =~   x1        0.139   0.107   -0.013     0.324   1.001        970  
##             Verbal   =~   x2        0.015   0.066   -0.093     0.131   1.002       1606  
##             Verbal   =~   x3       -0.070   0.091   -0.238     0.040   1.002       1070  
##             Verbal   =~   x4        0.986   0.075    0.864     1.111   1.001       1584  
##             Verbal   =~   x5        1.137   0.090    0.993     1.295   1.002       1219  
##             Verbal   =~   x6        0.892   0.072    0.775     1.011   1.002       1479  
##             Verbal   =~   x7        0.023   0.071   -0.080     0.145   1.002       1758  
##             Verbal   =~   x8       -0.030   0.071   -0.161     0.069   1.002       1626  
##             Verbal   =~   x9        0.014   0.060   -0.080     0.115   1.001       1965  
##             Speed    =~   x1        0.034   0.080   -0.080     0.181   1.003       1734  
##             Speed    =~   x2       -0.052   0.078   -0.199     0.047   1.002       1895  
##             Speed    =~   x3        0.028   0.078   -0.085     0.163   1.002       1725  
##             Speed    =~   x4        0.003   0.054   -0.084     0.094   1.000       2407  
##             Speed    =~   x5        0.003   0.062   -0.096     0.106   1.000       2006  
##             Speed    =~   x6        0.002   0.053   -0.083     0.086   1.001       2015  
##             Speed    =~   x7        0.763   0.120    0.577     0.970   1.001        978  
##             Speed    =~   x8        0.759   0.099    0.601     0.924   1.006        948  
##             Speed    =~   x9        0.497   0.091    0.353     0.645   1.001       1321  
##           ───────────────────────────────────────────────────────────────────────────────
##             Inter-factor correlations                                                    
##           ───────────────────────────────────────────────────────────────────────────────
##             Verbal   ~~   Visual    0.356   0.121    0.150     0.541   1.000       1040  
##             Speed    ~~   Visual    0.284   0.162    0.003     0.533   1.001       1064  
##             Speed    ~~   Verbal    0.227   0.116    0.031     0.412   1.002       1482  
##           ───────────────────────────────────────────────────────────────────────────────
##             Residual variances                                                           
##           ───────────────────────────────────────────────────────────────────────────────
##             x1       ~~   x1        0.668   0.131    0.439     0.866   1.002        991  
##             x2       ~~   x2        1.092   0.115    0.908     1.291   1.002       2262  
##             x3       ~~   x3        0.737   0.127    0.524     0.936   1.000       1246  
##             x4       ~~   x4        0.372   0.085    0.228     0.502   1.003       1254  
##             x5       ~~   x5        0.425   0.113    0.243     0.610   1.001        847  
##             x6       ~~   x6        0.374   0.069    0.261     0.489   1.004       1660  
##             x7       ~~   x7        0.656   0.136    0.426     0.866   1.003        912  
##             x8       ~~   x8        0.458   0.119    0.252     0.641   1.003        882  
##             x9       ~~   x9        0.574   0.074    0.458     0.692   1.001       2281  
##           ───────────────────────────────────────────────────────────────────────────────
##                                                                                          
## 
## Column names: from, op, to, mean, sd, 5%, 95%, rhat, ess_bulk

The effect of minor factors is small, RMSE = 0.024. The originally hypothesized loadings maintain their relation to their hypothesized factors. Of the cross-loadings, only the relation from the visual factor to x9 is non-trivial; most others are very close to 0. Additionally, the inter-factor correlations are all smaller than in the original basic CFA, suggesting that forcing cross-loadings to zero artificially inflated the inter-factor correlations (Ferrando and Lorenzo-Seva 2000).
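To keep the comparison across the four models in one place, the posterior-mean RMSE values reported above can be tabulated directly (values transcribed from the printed output; this is simple bookkeeping, not a formal model comparison):

```r
# Posterior-mean RMSE values transcribed from the fits above
rmse <- c(
  basic_cfa     = 0.063,  # original CFA with simple structure
  bifactor      = 0.028,  # unconstrained bifactor model
  bifactor_cons = 0.047,  # bifactor model with equality constraints
  non_simple    = 0.024   # CFA with simple structure relaxed
)
sort(rmse)   # the non-simple-structure model shows the smallest RMSE
```

Ordering the models this way makes plain that relaxing simple structure, rather than adding a general factor, yields the smallest residual misfit for these data.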

Works Cited

Ferrando, Pere J, and Urbano Lorenzo-Seva. 2000. “Unrestricted Versus Restricted Factor Analysis of Multidimensional Test Items: Some Aspects of the Problem and Some Suggestions.” Psicológica 21 (2): 301–23. https://www.uv.es/revispsi/articulos3.00/ferran7.pdf.
Maydeu-Olivares, Alberto. 2017. “Assessing the Size of Model Misfit in Structural Equation Models.” Psychometrika 82 (3): 533–58. https://doi.org/10.1007/s11336-016-9552-7.
Muthén, Bengt, and Tihomir Asparouhov. 2012. “Bayesian Structural Equation Modeling: A More Flexible Representation of Substantive Theory.” Psychological Methods 17 (3): 313–35. https://doi.org/10.1037/a0026802.
Rosseel, Yves. 2012. “Lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software 48 (2): 1–20. https://doi.org/10.18637/jss.v048.i02.
Uanhoro, James O. 2024. “A Comparison of Different Prior Choices for Estimating the Influence of Minor Factors in Bayesian Structural Equation Models.” In Annual Meeting of the American Educational Research Association. Philadelphia, PA.
Vehtari, Aki, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner. 2021. “Rank-Normalization, Folding, and Localization: An Improved \(\widehat{R}\) for Assessing Convergence of MCMC (with Discussion).” Bayesian Analysis 16 (2): 667–718. https://doi.org/10.1214/20-BA1221.