Sampling is done for the incorrect model (has something to do with auto_write = TRUE?)

Hi folks,

I am trying to fit models using the Stan code produced from the blavaan package. The issue is that after I fit a first model and then do it for a second model, sampling is done not for the second model but for the first model. I do not understand why this is happening, so I would like to get some solutions from others. I first attach the code I used.

library(rstan)
options(mc.cores = 2)
rstan_options(auto_write = TRUE)
library(blavaan)

# 1. Data simulation
# True model
model.true <- '
# measurement equation
fac1 =~ x1 + 0.5*x2 + 0.5*x3
fac2 =~ x4 + 0.5*x5 + 0.5*x6
fac3 =~ x7 + 0.5*x8 + 0.5*x9

# structural equation
fac1 ~~ 0.5*fac2
fac1 ~~ 0.5*fac3
fac2 ~~ 0.5*fac3
'

set.seed(322)
sim.dat <- simulateData(model.true, sample.nobs = 500)

# 2. Models for comparison
model.1 <- '
# measurement equation
fac1 =~ x1 + x2 + x3
fac2 =~ x4 + x5 + x6
fac3 =~ x7 + x8 + x9

# structural equation
fac1 ~~ fac2
fac1 ~~ fac3
fac2 ~~ 0*fac3
'

model.2 <- '
# measurement equation
fac1 =~ x1 + x2 + x3
fac2 =~ x4 + x5 + x6
fac3 =~ x7 + x8 + x9

# structural equation
fac2 ~~ fac1
fac2 ~~ fac3
fac1 ~~ 0*fac3
'

With this simulated dataset and the two models, I fit models using Stan code produced from blavaan. I first get the Stan code from blavaan.

# 3. Fitting models using Stan code from blavaan
# Model 1
set.seed(322)
blav.fit.1 <- bsem(model.1, data=sim.dat, n.chains=2, mcmcfile="comparison", do.fit=FALSE)

Please note that, then, I renamed the two files (semstan.rda and sem.stan into semm1.rda and semm1.stan in the comparison folder). This procedure should be done to reproduce my issue!

load('comparison/semm1.rda')
set.seed(322)
stan.fit.1 <- stan('comparison/semm1.stan', data=stantrans$data, chains=2,
                    iter=1500, warmup=500, init=stantrans$inits)

The fitting is successfully completed. Then, for the second model,

# Model 2
set.seed(322)
blav.fit.2 <- bsem(model.2, data=sim.dat, n.chains=2, mcmcfile="comparison", do.fit=FALSE)

Please note that I again renamed the two files (semstan.rda and sem.stan into semm2.rda and semm2.stan in the comparison folder).

As soon as I run the following code…

load('comparison/semm2.rda')
set.seed(322)
stan.fit.2 <- stan('comparison/semm2.stan', data=stantrans$data, chains=2,
                   iter=1500, warmup=500, init=stantrans$inits)

In the ‘Viewer’, the message appears such that sampling happens for model ‘semm1’ not ‘semm2’!!!:

Click the Refresh button to see progress of the chains
starting worker pid=22092 on localhost:11616 at 21:53:37.407
starting worker pid=7916 on localhost:11616 at 21:53:37.785

SAMPLING FOR MODEL 'semm1' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.017 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 170 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 1500 [  0%]  (Warmup)

SAMPLING FOR MODEL 'semm1' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 0 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:    1 / 1500 [  0%]  (Warmup)
Chain 2: Iteration:  150 / 1500 [ 10%]  (Warmup)
Chain 1: Iteration:  150 / 1500 [ 10%]  (Warmup)
Chain 2: Iteration:  300 / 1500 [ 20%]  (Warmup)
Chain 1: Iteration:  300 / 1500 [ 20%]  (Warmup)
Chain 2: Iteration:  450 / 1500 [ 30%]  (Warmup)
Chain 1: Iteration:  450 / 1500 [ 30%]  (Warmup)
Chain 2: Iteration:  501 / 1500 [ 33%]  (Sampling)
Chain 1: Iteration:  501 / 1500 [ 33%]  (Sampling)
Chain 2: Iteration:  650 / 1500 [ 43%]  (Sampling)
Chain 1: Iteration:  650 / 1500 [ 43%]  (Sampling)
Chain 2: Iteration:  800 / 1500 [ 53%]  (Sampling)
Chain 1: Iteration:  800 / 1500 [ 53%]  (Sampling)
Chain 2: Iteration:  950 / 1500 [ 63%]  (Sampling)
Chain 1: Iteration:  950 / 1500 [ 63%]  (Sampling)
Chain 2: Iteration: 1100 / 1500 [ 73%]  (Sampling)
Chain 1: Iteration: 1100 / 1500 [ 73%]  (Sampling)
Chain 2: Iteration: 1250 / 1500 [ 83%]  (Sampling)
Chain 1: Iteration: 1250 / 1500 [ 83%]  (Sampling)
Chain 2: Iteration: 1400 / 1500 [ 93%]  (Sampling)
Chain 1: Iteration: 1400 / 1500 [ 93%]  (Sampling)
Chain 2: Iteration: 1500 / 1500 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 30.131 seconds (Warm-up)
Chain 2:                37.617 seconds (Sampling)
Chain 2:                67.748 seconds (Total)
Chain 2: 
Chain 1: Iteration: 1500 / 1500 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 33.04 seconds (Warm-up)
Chain 1:                37.227 seconds (Sampling)
Chain 1:                70.267 seconds (Total)
Chain 1: 

I do not know why this is happening. Is this okay and a naturally-happening process where the semm2 is being sampled actually? Or, is this really a problem? My guess is that the functionrstan_options(auto_write = TRUE) should have something to do with it, but I do not know.

So, could anyone help me in figuring out this problem?

Thanks!

I’ve never used blavaan, but this might be a bug in blavaan (certainly doesn’t happen with auto_write = TRUE for brms or pure Stan models)… Tagging @bgoodri as one of the authors to check…

1 Like

Hi @martinmodrak, thanks for your reply and referring @bgoodri.
It would be nice if I could know whether the model is fitted correctly regardless of this ‘message’ bug or the model has been incorrectly fitted so far. In either case, could you help me solve this issue?

Sorry for missing this, it seems @bgoodri is busy. Maybe consider filing an issue on the blavaan’s repo? In any case if the message IMHO indicates the name of the model being sampled when compiled so I would guess the wrong model is most likely being sampled, but I am not completely certain.