What did you set your priors to? As @mike-lawrence pointed out, you will want slightly informative priors. I would also set your iterations to 2000 to start with.
Now the priors look like this:
// Prior part of Bayesian inference
sigma ~ inv_gamma(1, 10);
tau ~ inv_gamma(1, 10);
alpha ~ normal(0, 10);
H ~ uniform(0, 1);
kappa ~ inv_gamma(1, 1);
I also decreased the iterations to 2000. I still get those warning messages with “Divergent transitions after warmup”.
I tried to run this code for one chain, with just 2000 iterations, and without passing any initial values for the parameters. I got the following error. Now I’m not sure whether my model is right. I’m pretty new to Stan. So I don’t know what this means. Any help would be much appreciated.
Error in sampler$call_sampler(args_list[[i]]) :
  c++ exception (unknown reason)
error occurred during calling the sampler; sampling not done
Can you tell me the machine and OS (with version) you are running on?
I’m using a MacOS Catalina, Version 10.15.1 (19B88).
Processor: 1.6 GHz Dual-Core Intel Core i5
Memory: 8 GB 1600 MHz DDR3
Ah. That error comes up with Catalina. It might also be caused by something else, though.
I think @bgoodri had a fix. See this thread here:
What is your version of R and rstan?
@Ara_Winter Thanks, I’ll check those links/fixes out on my Mac.
The version of R I’m using is, R version 3.6.1 and the rstan version is 2.19.1.
BTW, I started running this same code on a Windows machine after seeing @mike-lawrence’s reply about priors.
Click the Refresh button to see progress of the chains
starting worker pid=1340 on localhost:11532 at 19:10:16.821
starting worker pid=10276 on localhost:11532 at 19:10:17.702
starting worker pid=6988 on localhost:11532 at 19:10:18.571
starting worker pid=2124 on localhost:11532 at 19:10:19.512

SAMPLING FOR MODEL 'init' NOW (CHAIN 1).
Chain 1: Gradient evaluation took 0.063 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 630 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: Iteration: 1 / 5000 [ 0%] (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 2).
Chain 2: Gradient evaluation took 0 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: Iteration: 1 / 5000 [ 0%] (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 3).
Chain 3: Gradient evaluation took 0.081 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 810 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: Iteration: 1 / 5000 [ 0%] (Warmup)

SAMPLING FOR MODEL 'init' NOW (CHAIN 4).
Chain 4: Gradient evaluation took 0.11 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1100 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: Iteration: 1 / 5000 [ 0%] (Warmup)

Chain 1: Iteration: 500 / 5000 [ 10%] (Warmup)
Chain 3: Iteration: 500 / 5000 [ 10%] (Warmup)
Chain 4: Iteration: 500 / 5000 [ 10%] (Warmup)
Chain 3: Iteration: 1000 / 5000 [ 20%] (Warmup)
Chain 2: Iteration: 500 / 5000 [ 10%] (Warmup)
Chain 3: Iteration: 1500 / 5000 [ 30%] (Warmup)
Chain 1: Iteration: 1000 / 5000 [ 20%] (Warmup)
Chain 4: Iteration: 1000 / 5000 [ 20%] (Warmup)
Chain 4: Iteration: 1500 / 5000 [ 30%] (Warmup)
Chain 3: Iteration: 2000 / 5000 [ 40%] (Warmup)
Chain 4: Iteration: 2000 / 5000 [ 40%] (Warmup)
Chain 3: Iteration: 2500 / 5000 [ 50%] (Warmup)
Chain 3: Iteration: 2501 / 5000 [ 50%] (Sampling)
Chain 4: Iteration: 2500 / 5000 [ 50%] (Warmup)
Chain 4: Iteration: 2501 / 5000 [ 50%] (Sampling)
Chain 4: Iteration: 3000 / 5000 [ 60%] (Sampling)
Chain 4: Iteration: 3500 / 5000 [ 70%] (Sampling)
Chain 4: Iteration: 4000 / 5000 [ 80%] (Sampling)
Chain 4: Iteration: 4500 / 5000 [ 90%] (Sampling)
Chain 4: Iteration: 5000 / 5000 [100%] (Sampling)
Chain 4: Elapsed Time: 1551.54 seconds (Warm-up)
Chain 4:               2991.6 seconds (Sampling)
Chain 4:               4543.14 seconds (Total)
Chain 2: Iteration: 1000 / 5000 [ 20%] (Warmup)
Chain 1: Iteration: 1500 / 5000 [ 30%] (Warmup)
Chain 3: Iteration: 3000 / 5000 [ 60%] (Sampling)
Chain 3: Iteration: 3500 / 5000 [ 70%] (Sampling)
And the last four steps started yesterday and are still running. Maybe there’s something wrong with my model…???
Hmmm, there might be! Let me take a look at the model. What does your data look like? Or at least a simulation of your data, if you can’t share the actual data.
Thanks to you and @bgoodri I managed to install Rcpp correctly. I don’t get the compilation error anymore.
I’m not at liberty to post my original dataset, so a simulated dataset like mine, simulated_data.csv (31.2 KB), is attached here. I also attached the rstan code for_sim_data.stan (2.0 KB) and the R code for_sim_data.R (1.3 KB) I use to run the model. I still get divergent transitions when I run this!
Thanks for all your help! :)
Ok cool. Glad Rcpp is up and going. Have you run the pairs plots? And do you have tidybayes installed, so we can look at the diagnostic plots?
Thanks for the sim data, code, and R script. I’ll pull those down and take a look.
So the brute force method is just to crank up the number of iterations. However, this is not very elegant and may not solve the problem ;)
:) I have not used tidybayes before, but I installed the package now and will check it out! Thanks for the suggestion. I started running the model just now and will upload the pairs() plot after it completes! Not sure it will, though: I tried running 4 chains, 2000 iterations, with initial values. Two chains are already completed, and one is 90% complete, but the last one is still only 10% complete. That chain seems to lag/get stuck! If I run this model without passing initial values, it throws errors (like the one below) for the initial values rstan selects.
Rejecting initial value:
Chain 2: Error evaluating the log probability at the initial value.
Chain 2: Exception: Exception: cholesky_decompose: Matrix m is not positive definite
The trace plots look like this!
This is the inference.
Inference for Stan model: init.
4 chains, each with iter=2000; warmup=1000; thin=10;
post-warmup draws per chain=100, total post-warmup draws=400.

          mean  se_mean       sd    2.5%     25%     50%      75%    97.5% n_eff    Rhat
alpha     0.47     0.02     0.03    0.42    0.46    0.47     0.47     0.50     2    7.05
alpha    -0.12     0.03     0.04   -0.19   -0.17   -0.11    -0.09    -0.08     2    6.24
alpha    -0.07     0.05     0.07   -0.19   -0.07   -0.04    -0.02    -0.02     2    9.60
alpha     0.11     0.04     0.07    0.05    0.06    0.08     0.11     0.24     2    3.91
alpha    -0.21     0.25     0.38   -0.97   -0.11   -0.02     0.03     0.03     2    3.12
alpha    -0.02     0.01     0.03   -0.07   -0.04   -0.02     0.00     0.04     3    1.88
sigma 25165.72 26602.77 37716.41  925.58 1951.94 4724.39 27855.82 90313.34     2 2030.32
tau    1695.27   175.49   249.58 1245.50 1585.56 1745.62  1875.29  1947.82     2   15.06
kappa     0.38     0.25     0.36    0.00    0.18    0.29     0.41     1.00     2    9.08
H         0.03     0.03     0.05    0.00    0.00    0.00     0.02     0.16     2    6.04

Samples were drawn using NUTS(diag_e) at Mon May 25 19:47:18 2020.
For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).

Warning message:
In UseMethod("depth") : no applicable method for 'depth' applied to an object of class "NULL"
and the pairs(). I don’t know much about how to examine a pairs plot. But I know that this couldn’t look worse :(
I will change the initial values and run again.
Yeah, looks like something is going sideways. I likely won’t get to this until tomorrow (Nuevo México, United States time).
No Problem. I appreciate how much you’ve helped me already. :)
Have you plotted your data? I know you said you can’t share it but does anything look out of sorts?
This is the plot of the outcome vs. time variables. The trajectories are for just a few subjects.
And the summary stats don’t look out of sorts to me. I ran the random-intercepts model without a problem. The trace plots got out of shape when I added the stochastic process.
Thanks for looking into this! :)
I am not as familiar with cholesky_decompose errors. I know this came up in the thread “My model with individual level random effects and group effects is really slow”, which might be of some help.
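One generic mitigation, not specific to your model but a common numerical trick, is to add a small jitter to the diagonal of the covariance matrix before calling cholesky_decompose, so rounding error can’t make it non-positive-definite. A rough Stan sketch, where K stands in for whatever covariance matrix your model builds (an assumption on my part, not your actual code):

```stan
transformed parameters {
  matrix[N, N] L;
  {
    matrix[N, N] K_jittered = K;  // K: the model's covariance matrix (assumed name)
    for (n in 1:N)
      K_jittered[n, n] += 1e-8;   // tiny jitter keeps the Cholesky numerically stable
    L = cholesky_decompose(K_jittered);
  }
}
```

If the matrix only fails at the random initial values, tighter inits or priors can also avoid the rejection.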
Starting to get out of my expertise here, but sigma and tau still seem to have very large values in the posterior, and very large variance between chains on each. Is there a way to rescale the data so that these parameters are approximately on the ±10 scale?
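For instance (a sketch with generic names, not your actual variables), standardizing the outcome in transformed data keeps scale parameters like sigma and tau near unit scale, and you can transform estimates back afterwards:

```stan
transformed data {
  // standardize the outcome so scale parameters end up O(1)
  // (y and N are assumed to be declared in the data block)
  real y_mean = mean(y);
  real y_sd = sd(y);
  vector[N] y_std = (y - y_mean) / y_sd;
}
```

Then model y_std instead of y, and multiply posterior scale estimates by y_sd to recover the original units.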
Actually started looking at the data, and I think maybe I can provide some expertise in querying your model choice (rather than attempting to diagnose why your current choice isn’t working). The data seem to be structured such that you have multiple individuals, each with a unique set of predictors (ppdlco), and multiple measures of the outcome FEV1 obtained at time intervals specified by timeSince_t0. You model the influence of the predictors as a simple linear effect on the mean outcome, and you model the influence of time as a Brownian stochastic process. I’m not familiar with the latter; is it common in your field? Does it encode deep knowledge of the data generating process? If not, then I’d suggest maybe taking a look at the more common (to my admittedly limited exposure) approach of a Gaussian process (indeed, I suspect that there is probably a mathematical connection between a Brownian stochastic process and a Gaussian process). GPs have been decently explored by Stan folks, and you may be able to leverage more expertise by using that approach than with something you coded by hand yourself.
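For reference, a minimal GP-over-time sketch in Stan using the built-in cov_exp_quad kernel (variable names here are placeholders, not taken from your files, and the priors are just illustrative):

```stan
data {
  int<lower=1> N;
  real time[N];     // observation times
  vector[N] y;      // outcome
}
parameters {
  real<lower=0> rho;       // GP length-scale
  real<lower=0> alpha_gp;  // GP marginal standard deviation
  real<lower=0> sigma;     // observation noise
}
model {
  matrix[N, N] K = cov_exp_quad(time, alpha_gp, rho);
  for (n in 1:N)
    K[n, n] += square(sigma);  // add noise variance on the diagonal

  rho ~ inv_gamma(5, 5);
  alpha_gp ~ normal(0, 1);
  sigma ~ normal(0, 1);
  y ~ multi_normal_cholesky(rep_vector(0, N), cholesky_decompose(K));
}
```

A real model would add your linear predictor to the mean vector and handle multiple subjects, but this is the basic shape the Stan documentation explores in depth.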
Oh, and note that the high correlation between the menopause variables might induce inefficient sampling, which you could avoid by using the QR trick.
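The QR trick looks roughly like this (a sketch with a generic N x K design matrix X and outcome y; adapt the names to your model):

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;   // correlated predictors
  vector[N] y;
}
transformed data {
  // thin QR decomposition; the sqrt(N - 1) scaling keeps Q_ast columns on a convenient scale
  matrix[N, K] Q_ast = qr_thin_Q(X) * sqrt(N - 1);
  matrix[K, K] R_ast = qr_thin_R(X) / sqrt(N - 1);
  matrix[K, K] R_ast_inverse = inverse(R_ast);
}
parameters {
  real alpha;
  vector[K] theta;          // coefficients on the orthogonal Q_ast scale
  real<lower=0> sigma;
}
model {
  y ~ normal(alpha + Q_ast * theta, sigma);  // sampling on the decorrelated scale
}
generated quantities {
  vector[K] beta = R_ast_inverse * theta;    // recover coefficients on the original X scale
}
```

Because the columns of Q_ast are orthogonal, the posterior geometry for theta is much friendlier than for beta on correlated predictors.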
I actually thought about re-scaling/centering the data, but I can only center age, ppfev1 and ppdlco. I will scale those and see whether it changes anything.
Yes, you are correct. My model is a random-intercept model with a Brownian motion stochastic process. The random intercept is used to describe between-subject variation, and Brownian motion is used to capture the evolution of the outcome variable through time within each subject.
In biostatistics we commonly use BM and related stochastic processes to capture within-subject variation. According to a preliminary study I did using the R package “lmenssp”, the integrated Ornstein-Uhlenbeck (IOU) process, which is related to scaled BM, captures within-subject variation well. So my plan was to use a more generalized version of BM, fractional BM, for this data.
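For concreteness, the standard fBM covariance with Hurst index H is Cov(B_H(s), B_H(t)) = 0.5 * (s^(2H) + t^(2H) - |s - t|^(2H)). As a Stan function it could be sketched like this (tau here is an assumed scale parameter; this is the textbook kernel, not necessarily how my code parameterizes it):

```stan
functions {
  // fractional Brownian motion covariance at times t[1..N], Hurst index H, scale tau
  matrix fbm_cov(real[] t, real H, real tau, int N) {
    matrix[N, N] K;
    for (i in 1:N)
      for (j in 1:N)
        K[i, j] = 0.5 * square(tau)
                  * (pow(t[i], 2 * H) + pow(t[j], 2 * H) - pow(fabs(t[i] - t[j]), 2 * H));
    return K;
  }
}
```

With H = 0.5 this reduces to ordinary Brownian motion, which is one way to sanity-check the implementation.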
Yes, you are correct. BM and fractional BM are both (or rather, can be treated as) Gaussian processes. I will try to use the links you attached for my model. Thank you! :)