Issue when trying to fit a linear mixed-effects model using RStan

These priors seem like they might be trouble; do they truly encode your domain expertise? For example, the inverse gamma priors have vanishingly small mass for values less than about 2000 and have rather heavy tails such that 10% of the mass is above 10,000. And an SD of 10^4 for the normal prior conveys that your uncertainty spans 4 orders of magnitude, which seems extreme.

Maybe these are proper encodings of your domain expertise, in which case I’d suggest that you might get some benefit from rescaling the data so that the scales of the parameters are in the ±10 range, as the adaptation parts of Stan do better when parameters are in this range.
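
For example, a minimal standardization sketch (shown in Python for illustration, with made-up values; in R, `scale()` does the same thing):

```python
import statistics

# Hypothetical outcome values on a raw scale of ~10^4; standardizing puts the
# corresponding parameters on roughly unit scale, well inside the +/- 10 range.
y = [12500.0, 9800.0, 15200.0, 11000.0, 13400.0]
mu = statistics.mean(y)
sd = statistics.stdev(y)
y_std = [(v - mu) / sd for v in y]  # mean 0, sd 1 after standardizing
```

Priors can then be stated on the standardized scale, and the fitted parameters back-transformed afterwards if needed.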

1 Like

Thanks for your response. I tried what you advised. Now the chains run to completion quickly, but I get the following error/warning messages.

Warning messages:
1: There were 229 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems
 
3: The largest R-hat is Inf, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

What did you set your priors to? As @mike-lawrence pointed out, you will want weakly to moderately informative priors. I would also set your iterations to 2000 to start with.

@Ara_Winter
Now the priors look like this:

// Prior part of Bayesian inference
sigma ~ inv_gamma(1, 10);
tau ~ inv_gamma(1, 10);
alpha ~ normal(0, 10);
H ~ uniform(0, 1);
kappa ~ inv_gamma(1, 1);
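
As a quick sanity check on inv_gamma(1, 10): for shape 1, the inverse-gamma CDF has the closed form F(x) = exp(-scale/x), so its tail mass can be computed directly. A small illustration in Python (the numbers are just this formula evaluated, not Stan output); note the mean is infinite for shape <= 1, so this prior is still quite heavy-tailed:

```python
import math

# For shape a = 1, the inverse-gamma CDF has the closed form
#   F(x) = exp(-scale / x)
# so tail probabilities of inv_gamma(1, 10) can be computed directly.
def invgamma1_cdf(x, scale):
    return math.exp(-scale / x)

median = 10 / math.log(2)          # F(median) = 0.5  -> ~14.4
tail = 1 - invgamma1_cdf(100, 10)  # P(X > 100)       -> ~0.095
print(median, tail)
```

So roughly 10% of the prior mass still sits above 100, which may matter if the rescaled parameters are expected to be of order 1.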

I also decreased the iterations to 2000. I still get those warning messages with “Divergent transitions after warmup”.

1 Like

I tried to run this code for one chain, with just 2000 iterations, and without passing any initial values for the parameters. I got the following error. Now I’m not sure whether my model is right. I’m pretty new to Stan, so I don’t know what this means. Any help would be much appreciated.

[1] "Error in sampler$call_sampler(args_list[[i]]) : " "  c++ exception (unknown reason)"                
error occurred during calling the sampler; sampling not done

Can you tell me the machine and OS (with version) you are running on?

1 Like

@Ara_Winter
I’m using macOS Catalina, Version 10.15.1 (19B88).
Processor: 1.6 GHz Dual-Core Intel Core i5
Memory: 8 GB 1600 MHz DDR3

Thanks!

Ah. That error comes up with Catalina, though it might also be caused by something else.

I think @bgoodri had a fix. See this thread here:

What is your version of R and rstan?

1 Like

@Ara_Winter Thanks, I’ll check those links/fixes out on my Mac.

I’m using R version 3.6.1, and the rstan version is 2.19.1.

BTW, I started running this same code on a Windows machine after seeing @mike-lawrence’s reply about priors.

Click the Refresh button to see progress of the chains
starting worker pid=1340 on localhost:11532 at 19:10:16.821
starting worker pid=10276 on localhost:11532 at 19:10:17.702
starting worker pid=6988 on localhost:11532 at 19:10:18.571
starting worker pid=2124 on localhost:11532 at 19:10:19.512

SAMPLING FOR MODEL 'init' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.063 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 630 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 5000 [  0%]  (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 0 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:    1 / 5000 [  0%]  (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 0.081 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 810 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:    1 / 5000 [  0%]  (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 0.11 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1100 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:    1 / 5000 [  0%]  (Warmup)
Chain 1: Iteration:  500 / 5000 [ 10%]  (Warmup)
Chain 3: Iteration:  500 / 5000 [ 10%]  (Warmup)
Chain 4: Iteration:  500 / 5000 [ 10%]  (Warmup)
Chain 3: Iteration: 1000 / 5000 [ 20%]  (Warmup)
Chain 2: Iteration:  500 / 5000 [ 10%]  (Warmup)
Chain 3: Iteration: 1500 / 5000 [ 30%]  (Warmup)
Chain 1: Iteration: 1000 / 5000 [ 20%]  (Warmup)
Chain 4: Iteration: 1000 / 5000 [ 20%]  (Warmup)
Chain 4: Iteration: 1500 / 5000 [ 30%]  (Warmup)
Chain 3: Iteration: 2000 / 5000 [ 40%]  (Warmup)
Chain 4: Iteration: 2000 / 5000 [ 40%]  (Warmup)
Chain 3: Iteration: 2500 / 5000 [ 50%]  (Warmup)
Chain 3: Iteration: 2501 / 5000 [ 50%]  (Sampling)
Chain 4: Iteration: 2500 / 5000 [ 50%]  (Warmup)
Chain 4: Iteration: 2501 / 5000 [ 50%]  (Sampling)
Chain 4: Iteration: 3000 / 5000 [ 60%]  (Sampling)
Chain 4: Iteration: 3500 / 5000 [ 70%]  (Sampling)
Chain 4: Iteration: 4000 / 5000 [ 80%]  (Sampling)
Chain 4: Iteration: 4500 / 5000 [ 90%]  (Sampling)
Chain 4: Iteration: 5000 / 5000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 1551.54 seconds (Warm-up)
Chain 4:                2991.6 seconds (Sampling)
Chain 4:                4543.14 seconds (Total)
Chain 4: 
Chain 2: Iteration: 1000 / 5000 [ 20%]  (Warmup)
Chain 1: Iteration: 1500 / 5000 [ 30%]  (Warmup)
Chain 3: Iteration: 3000 / 5000 [ 60%]  (Sampling)
Chain 3: Iteration: 3500 / 5000 [ 70%]  (Sampling)

And the last four steps started yesterday and are still running. Maybe there’s something wrong with my model…?

Hmmm, there might be! Let me take a look at the model. What does your data look like? Or at least a simulation of your data, if you can’t share the actual data.

Thanks to you and @bgoodri I managed to install Rcpp correctly. I don’t get the compilation error anymore.

I’m not at liberty to post my original dataset, so a simulated dataset like mine, simulated_data.csv (31.2 KB), is attached here. I also attached the rstan code for_sim_data.stan (2.0 KB) and the R code for_sim_data.R (1.3 KB) I use to run the model. I still get divergent transitions when I run this!

Thanks for all your help! :)

2 Likes

Ok cool. Glad Rcpp is up and going. Have you run the pairs plots? And do you have tidybayes installed, so we can look at the diagnostic plots?

Thanks for the sim data, code, and R script. I’ll pull those down and take a look.

So the brute-force method is just to crank up the number of iterations. However, this is not very elegant and may not solve the problem ;)

2 Likes

:) I have not used tidybayes before, but I installed the package now and will check it out! Thanks for the suggestion. I started running the model just now and will upload the pairs() plot after it completes! Not sure it will, though, because I tried running 4 chains with 2000 iterations and initial values: 2 chains are already complete, one is 90% complete, but the last one is still only 10% complete. That chain seems to be lagging/stuck! If I run this model without passing initial values, it throws errors (like the one below) for the initial values rstan selects.

Rejecting initial value:
Chain 2:   Error evaluating the log probability at the initial value.
Chain 2: Exception: Exception: cholesky_decompose: Matrix m is not positive definite
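
This usually means the covariance matrix built from the randomly initialized parameters is not numerically positive definite. A toy illustration of the failure mode in Python, with made-up 2×2 matrices (not the actual model’s covariance), showing how a rank-deficient matrix breaks the Cholesky factorization while a tiny diagonal jitter restores it:

```python
import math

# Cholesky factorization of a 2x2 symmetric matrix m = [[a, b], [b, c]].
# It fails exactly when the matrix is not positive definite.
def cholesky_2x2(m):
    l11 = math.sqrt(m[0][0])
    l21 = m[1][0] / l11
    d = m[1][1] - l21 * l21
    if d <= 0:
        raise ValueError("matrix is not positive definite")
    return [[l11, 0.0], [l21, math.sqrt(d)]]

bad = [[1.0, 1.0], [1.0, 1.0]]               # correlation exactly 1: fails
ok = [[1.0 + 1e-8, 1.0], [1.0, 1.0 + 1e-8]]  # tiny diagonal jitter: succeeds
```

In Stan, the usual remedies are passing sensible initial values (as here) or building the covariance through the constrained `cholesky_factor_corr`/`cholesky_factor_cov` types rather than decomposing an unconstrained matrix.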

The trace plots look like this!

This is the inference.

Inference for Stan model: init.
4 chains, each with iter=2000; warmup=1000; thin=10; 
post-warmup draws per chain=100, total post-warmup draws=400.

         mean  se_mean       sd    2.5%     25%     50%      75%    97.5% n_eff    Rhat
alpha[1]     0.47     0.02     0.03    0.42    0.46    0.47     0.47     0.50     2    7.05
alpha[2]    -0.12     0.03     0.04   -0.19   -0.17   -0.11    -0.09    -0.08     2    6.24
alpha[3]    -0.07     0.05     0.07   -0.19   -0.07   -0.04    -0.02    -0.02     2    9.60
alpha[4]     0.11     0.04     0.07    0.05    0.06    0.08     0.11     0.24     2    3.91
alpha[5]    -0.21     0.25     0.38   -0.97   -0.11   -0.02     0.03     0.03     2    3.12
alpha[6]    -0.02     0.01     0.03   -0.07   -0.04   -0.02     0.00     0.04     3    1.88
sigma    25165.72 26602.77 37716.41  925.58 1951.94 4724.39 27855.82 90313.34     2 2030.32
tau       1695.27   175.49   249.58 1245.50 1585.56 1745.62  1875.29  1947.82     2   15.06
kappa        0.38     0.25     0.36    0.00    0.18    0.29     0.41     1.00     2    9.08
H            0.03     0.03     0.05    0.00    0.00    0.00     0.02     0.16     2    6.04

Samples were drawn using NUTS(diag_e) at Mon May 25 19:47:18 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
Warning message:
In UseMethod("depth") :
  no applicable method for 'depth' applied to an object of class "NULL"

And the pairs() plot. I don’t know much about how to examine a pairs plot, but I know that this couldn’t look worse :(

I will change the initial values and run again.

Yeah, looks like something is going sideways. I likely won’t get to this until tomorrow (Nuevo México, United States time).

1 Like

No Problem. I appreciate how much you’ve helped me already. :)

Have you plotted your data? I know you said you can’t share it but does anything look out of sorts?

This is a plot of the outcome vs. time. The trajectories are for just a few subjects.

And the summary stats don’t look out of sorts to me. I ran the random-intercepts model without a problem. The trace plots got out of shape when I added the stochastic process.

Thanks for looking into this! :)

1 Like

I am not as familiar with cholesky_decompose errors. I know this came up over in the thread “My model with individual level random effects and group effects is really slow”, which might be of some help.

Maybe @mike-lawrence knows more about this or knows someone that does? Also tagging in @bbbales2

1 Like

Starting to get out of my expertise here, but sigma and tau still seem to have very large values in the posterior, and very large variance between chains for each. Is there a way to rescale the data so that these parameters are approximately on the ±10 scale?

1 Like