I'm trying to fit some path models (i.e. all variables are observed; no latent variables) using "lavaan" in R. I've been able to do this successfully for a model where the data are completely pooled (Model 1, below). However, the data are grouped, and I'd like to fit models that account for groups as fixed effects (Model 2, below) and as random effects (i.e. a random intercept by group; Model 3, below).
I've looked at the user manual and various other online resources, but I'm having trouble working out how to code the fixed- and random-effects models.
I'm hoping someone might be able to provide some advice on this.
I've included simplified versions of the data and the models I'm trying to fit below. (I'm using a path model because the real data include more predictors and indirect paths.)
Dataset: the variables are 4 predictors (P1-4), 1 outcome (Outcome), and 4 groups (each observation falls within one of four groups; G1-4 are dummy variables). All variables are observed (i.e. no latent variables).
Model 1: path model without accounting for groups (i.e. complete pooling)
This appears to work fine.
model1 <- "
#regression equations
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4
# variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit1 <- lavaan(model1, data=mydata)
Model 2: group fixed effects
I'm not sure how to do this…
Question: Is this done by including all but one of the group dummy variables as exogenous variables, specifying paths from each dummy variable to the outcome, as well as including a variance term for each dummy? That is:
model2 <- "
#regression equations
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4 + G2 + G3 + G4
#variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
G2 ~~ G2
G3 ~~ G3
G4 ~~ G4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit2 <- lavaan(model2, data=mydata)
Model 3: random intercept for groups
I see that you need to specify the level 1 (observation-level) and level 2 (group-level) equations. I'm not sure how to do it correctly, but my attempt is below.
Question: What is the correct way to specify a model that has random intercepts for groups? And, when fitting the model, how do I specify cluster correctly?
model3 <- "
#regression equations
level: 1
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4
level: 2
outcome ~ G2 + G3 + G4
# variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
G2 ~~ G2
G3 ~~ G3
G4 ~~ G4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit3 <- lavaan(model3, data=mydata, cluster = "????")
Means and (co)variances of exogenous predictors are by default (fixed.x=TRUE) taken as given, so there is no need to estimate them (i.e., you can leave them out of your model syntax).
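To illustrate, the group fixed-effects model (Model 2) can be reduced to its regression equations. This is only a sketch under stated assumptions: it swaps `lavaan()` for the `sem()` wrapper, which adds the residual variances of endogenous variables automatically, and it uses a simulated stand-in for `mydata` (made-up coefficients, not the asker's data) so that it runs on its own.

```r
library(lavaan)

# Simulated stand-in for the asker's data (hypothetical values; 4 groups,
# with G2-G4 as dummy variables and group 1 as the reference category).
set.seed(1)
n <- 400
g <- sample(1:4, n, replace = TRUE)
mydata <- data.frame(
  P1 = rnorm(n), P3 = rnorm(n), P4 = rnorm(n),
  G2 = as.numeric(g == 2),
  G3 = as.numeric(g == 3),
  G4 = as.numeric(g == 4)
)
mydata$P2 <- 0.5 * mydata$P1 + rnorm(n)
mydata$outcome <- with(mydata,
  0.3 * P1 + 0.4 * P2 + 0.2 * P3 + 0.1 * P4 +
  0.5 * G2 - 0.3 * G3 + 0.2 * G4) + rnorm(n)

# Only the regressions are written out. No variances or covariances of the
# exogenous predictors (P1, P3, P4, G2-G4) appear in the syntax.
model2 <- "
  P2 ~ P1
  outcome ~ P1 + P2 + P3 + P4 + G2 + G3 + G4
"

# fixed.x = TRUE is spelled out here for clarity; the exogenous predictors
# are then treated as given rather than estimated.
fit2 <- sem(model2, data = mydata, fixed.x = TRUE)
summary(fit2)
```

With this specification, `coef(fit2)` should list the regression slopes and the two residual variances, but no free parameters such as `G2~~G2`.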
In Model 3, leave the G dummy codes out of the model. Use the name of the original grouping variable (with 4 levels) as the cluster= argument, which will invoke random intercepts for all modeled variables. Or, if you only specify a single-level model, the cluster= argument triggers cluster-robust SEs and test statistics. That might be better than random intercepts because you only have $N=4$ at Level 2. Multilevel SEM (ML-SEM) gives highly biased estimates in small samples. But perhaps that is what your comparison of approaches is meant to demonstrate.
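The two uses of cluster= described above might look roughly like the following. This is a sketch, not a definitive specification: the column name `group`, the simulated data, and the saturated Level 2 block (variances and covariance of the two endogenous variables) are all assumptions made for illustration. The simulation uses 50 clusters so that the two-level model has something to estimate; with only four groups, the caution above applies.

```r
library(lavaan)

# Simulated stand-in data (hypothetical): 50 clusters of 20 observations,
# with random intercepts for both endogenous variables.
set.seed(2)
J <- 50
m <- 20
n <- J * m
group <- rep(seq_len(J), each = m)
u_p2  <- rnorm(J, sd = 0.5)[group]   # cluster-level shift in P2
u_out <- rnorm(J, sd = 0.5)[group]   # cluster-level shift in outcome
mydata <- data.frame(group = group,
                     P1 = rnorm(n), P3 = rnorm(n), P4 = rnorm(n))
mydata$P2 <- 0.5 * mydata$P1 + u_p2 + rnorm(n)
mydata$outcome <- with(mydata,
  0.3 * P1 + 0.4 * P2 + 0.2 * P3 + 0.1 * P4) + u_out + rnorm(n)

# Option A: two-level model. The group dummies are left out; the grouping
# variable is passed via cluster=. Level 2 is saturated here (an assumption).
model3 <- "
  level: 1
    P2 ~ P1
    outcome ~ P1 + P2 + P3 + P4
  level: 2
    P2 ~~ P2
    outcome ~~ outcome
    P2 ~~ outcome
"
fit3 <- sem(model3, data = mydata, cluster = "group")

# Option B: single-level model with cluster=, which requests
# cluster-robust standard errors and test statistics.
model1 <- "
  P2 ~ P1
  outcome ~ P1 + P2 + P3 + P4
"
fit3_robust <- sem(model1, data = mydata, cluster = "group")

summary(fit3)
summary(fit3_robust)
```

With the real data, `cluster = "group"` would be replaced by the name of the original four-level grouping column.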