Hi, I’m having trouble with convergence in one of my models and wanted to ask if there’s something really obvious I’m doing wrong. I’m relatively new to Bayesian stats and haven’t posted here before so please forgive any errors and let me know if there’s something else I should include/do differently.
I have a dataset that looks like this:
>head(df)
group PC1 k_mean k_me
1 1.9759287 7.0 1.0
2 -1.9759287 3.5 1.5
3 -1.5508796 5.0 3.0
4 0.8288762 6.0 2.0
5 -1.1907181 5.0 3.0
The predictor is the first principal component from other continuous measures, and the outcome variable was originally scored as a single value from 1 to 8 for some groups or as ranges (e.g. 2-5, 3-4) or for others. I tried to incorporate the uncertainty originally captured by these ranges as measurement error (mostly following the implementation in Ch 15 of McElreath’s book where he models observed divorce rates across US states as distributed around the unknown true value with a standard deviation calculated from measurement error) (please also let me know if this is an obviously terrible way of doing this). I have tried running it without the measurement error and it works fine, so I assume it has something to do with my implementation there. I have tried it with simulated data where I knew the relationship between the predictor and outcome and it gave the same issues.
Here is my data list and model (using the rethinking package)
dlist <- list(
k_obs = df$k_mean, #outcome variable ranging from 1 to 8
k_me = df$k_me #error around outcome ranging from 0 to 3
k_sd = df$k_me/sd(df$k_mean),
p = standardize(df$PC1), #mean = 0, sd = 1
N = nrow(df)
)
mod1 <- ulam(
alist(
k_obs ~ dnorm(k_true , k_sd),
vector[N]:k_true ~ dnorm(mu, sigma),
mu <- a + bP*p,
a ~ dnorm(4.5, 1.2),
bP ~ dnorm(0, 1),
sigma ~ dexp(1)
), control = list( adapt_delta = 0.99,
max_treedepth = 13),
data=dlist, chains=4, cores=4, cmdstan=T)
I get the following messages:
"Chain 4 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 4 Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in ‘/var/folders/y1/jd6dgn9d12v5xfgbz7q5ttph0000gn/T/RtmpM5vJi8/model-ce86667fbf3.stan’, line 24, column 4 to column 34)
Chain 4 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 4 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 4
All 4 chains finished successfully.
Mean chain execution time: 6.4 seconds.
Total execution time: 6.6 seconds.
2000 of 2000 (100.0%) transitions hit the maximum treedepth limit of 10 or 2^10-1 leapfrog steps.
Trajectories that are prematurely terminated due to this limit will result in slow exploration.
Increasing the max_treedepth limit can avoid this at the expense of more computation.
If increasing max_treedepth does not remove warnings, try to reparameterize the model."
And these ESS and Rhats, which…don’t seem good:
> precis(mod1, depth=2)
mean sd 5.5% 94.5% n_eff Rhat4
k_true[1] -0.37 0.55 -0.94 0.22 2 1738654012688586
k_true[2] 0.27 1.15 -1.27 1.97 2 767807965779033
k_true[3] 0.19 1.15 -1.33 1.84 2 974503176613239
k_true[4] 0.08 0.94 -1.50 0.93 2 2752906058414171
…
Again, I’m new here so if I should have added additional/different (or less) information, or should have posted this elsewhere please do let me know. I did find some previous threads about measurement error but couldn’t figure out how to use the suggestions to fix my own model (partly because I’m not familiar with base Stan). I assume I’m not incorporating the uncertainty in my outcome measure in the model particularly well but am not sure how else to do it. Any help is much appreciated. Thanks!