Problem running a 3 population Lotka Volterra model (rejecting initial and max number of iterations exceeded)

fitting-issues

#1

Hi, I’m sorry if the answer to this question is pretty simple - I’m a Stan novice and a high school student, so I’m still figuring this all out.

I’m trying to generalize Bob Carpenter’s Lotka Volterra case study for three populations and I keep getting the same error, regardless of what I change.

Here’s some information about my computer, if it’s helpful:

  • Operating system: macOS Mojave, 10.4.2
  • RStan Version: 2.18.2
  • Computer: 2017 MacBook Air, 8GB RAM

The error I keep getting:

SAMPLING FOR MODEL 'wolfelkcoyote' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model14865b3e324_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model14865b3e324_wolfelkcoyote' at line 53)

Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: Max number of iterations exceeded (1000).  (in 'model14865b3e324_wolfelkcoyote' at line 39)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                        
[2] "  Exception: Max number of iterations exceeded (1000).  (in 'model14865b3e324_wolfelkcoyote' at line 39)"
error occurred during calling the sampler; sampling not done

Here’s my Stan code:

functions {
  real[] dz_dt(real t,       // time
           real[] z,     // system state {prey, predator}
           real[] theta, // parameters
           real[] x_r,   // unused data
           int[] x_i) {
    real l = z[1];           // elk (as e isn't a valid variable name)  
    real c = z[2];           // coyotes
    real w = z[3];           // wolves

    real kpo = theta[1];     // this and below are coefficents used in equations
    real pec = theta[2];
    real pew = theta[3];
    real kdo = theta[4];
    real kptw = theta[5];
    real kdtw = theta[6];
    real kdth = theta[7];
    real kpth = theta[8];

    real dl_dt = (kpo - (pec * c) - (pew * w)) * l;   // diff eq for elk pop
    real dc_dt = ((kptw * l) - (kdtw * w) - kdo) * c; // diff eq for coyote pop
    real dw_dt = ((kpth * l) - kdth) * w;             // diff eq for wolf pop

    return { dl_dt, dw_dt, dc_dt };                   // dz_dt gives results of all 3 diff eqs (e, w, c)
  }
}
data {
  int<lower = 0> N;          // number of measurement times
  real ts[N];                // measurement times > 0
  real y_init[3];            // initial measured populations
  real<lower = 0> y[N, 3];   // measured populations
}
parameters {
  real<lower = 0> theta[8];   // { kpo, pec... }
  real<lower = 0> z_init[3];  // initial population
  real<lower = 0> sigma[3];   // measurement errors
}
transformed parameters {
  real z[N, 3]
    = integrate_ode_rk45(dz_dt, z_init, 0, ts, theta,
                     rep_array(0.0, 0), rep_array(0, 0),
                     1e-5, 1e-3, 1e3);
}
model {
  theta[{1,3,5}] ~ normal(1, 0.5);
  theta[{2,4,6,7,8}] ~ normal(0, 0.5);

  sigma ~ lognormal(-1, 1);
  z_init ~ lognormal(10, 1);

  for (k in 1:3) {
    y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
    y[ , k] ~ lognormal(log(z[, k]), sigma[k]);
  }
}
generated quantities {
  real y_init_rep[3];
  real y_rep[N, 3];

  for (k in 1:3) {
    y_init_rep[k] = lognormal_rng(log(z_init[k]), sigma[k]);
  for (n in 1:N)
    y_rep[n, k] = lognormal_rng(log(z[n, k]), sigma[k]);
  }
}

Here’s my R code:
rstan_options(auto_write=TRUE)

setwd("/Users/username/desktop/rfolder")

elk_wolf_coyote <- read.csv("incorrectFinalProjData.csv", comment.char="#")

N <- length(elk_wolf_coyote$Year) - 1    # number of observations after the first
ts <- 1:N                                # observation times (years)
y_init <- c(elk_wolf_coyote$Elk[1], elk_wolf_coyote$Coyotes[1], elk_wolf_coyote$Wolves[1])    # first observation
y <- as.matrix(elk_wolf_coyote[2:(N+1), 2:4])              # observations after the first
y <- cbind(y[ ,3],y[ ,2], y[ , 1]);                        # wolf, coyote, elk order (REVERSES the order of above line)
elk_wolf_coyote_data <- list(N, ts, y_init, y)

model <- stan_model("wolfelkcoyote.stan")

fit <- sampling(model, data = elk_wolf_coyote_data, seed=123)

print(fit, pars=c("theta", "sigma", "z_init"), 
  probs=c(0.1, 0.5, 0.9), digits = 3)

Here’s the session info for my computer as well:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14.2

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
L.APACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.18.2         StanHeaders_2.18.0-1 ggplot2_3.1.0       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         pillar_1.3.0       compiler_3.5.1     plyr_1.8.4         bindr_0.1.1        prettyunits_1.0.2 
 [7] tools_3.5.1        pkgbuild_1.0.2     tibble_1.4.2       gtable_0.2.0       pkgconfig_2.0.2    rlang_0.3.0.1     
[13] cli_1.0.1          rstudioapi_0.8     parallel_3.5.1     yaml_2.2.0         loo_2.0.0          bindrcpp_0.2.2    
[19] gridExtra_2.3      withr_2.1.2        dplyr_0.7.8        stats4_3.5.1       grid_3.5.1         tidyselect_0.2.5  
[25] glue_1.3.0         inline_0.3.15      R6_2.3.0           processx_3.2.1     purrr_0.2.5        callr_3.1.0       
[31] magrittr_1.5       codetools_0.2-15   scales_1.0.0       ps_1.2.1           matrixStats_0.54.0 assertthat_0.2.0  
[37] colorspace_1.3-2   lazyeval_0.2.1     munsell_0.5.0      crayon_1.3.4      

I don’t know how to fix this error. Do any of you have suggestions for possible fixes?

Thank you very much! Again, I apologize if any of my code is silly, I’m still really inexperienced with Stan.

Edit: I originally had a typo (wrote our instead of out somewhere) so fixed it for clarity.


#2

Hi !! I am also a Stan novice and a student who leaves school before graduation !!

Please change your seed =123 to ,e.g., seed =1234 in your code fit <- sampling(model, data = elk_wolf_coyote_data, seed=123)


#3

Thank you!

I switched the seed from 123 to 1234 in my R code. Doing that and running it again gets me this:

SAMPLING FOR MODEL 'wolfelkcoyote' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[3] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[2] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[2] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: 
Chain 1: Gradient evaluation took 0.000492 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 4.92 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 

After it gave me that, it took a while to progress:

Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 529.591 seconds (Warm-up)
Chain 1:                540.573 seconds (Sampling)
Chain 1:                1070.16 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL 'wolfelkcoyote' NOW (CHAIN 2).
Chain 2: Unrecoverable error evaluating the log probability at the initial value.
Chain 2: Exception: Max number of iterations exceeded (1000).  (in 'model1289467a0eb6_wolfelkcoyote' at line 39)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                         
[2] "  Exception: Max number of iterations exceeded (1000).  (in 'model1289467a0eb6_wolfelkcoyote' at line 39)"
error occurred during calling the sampler; sampling not done

Are these problems likely due to the data I’m working with or something with my code? If necessary, I can post the data - I just originally thought it had to be my code itself.


#4

Please try other seeds, and if it cannot fix the error, then … what we have to do next is check your stan file.


#5

Trying 12345 as the seed gives me this result:

SAMPLING FOR MODEL 'wolfelkcoyote' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[2] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[3] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1289467a0eb6_wolfelkcoyote' at line 53)

Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: Max number of iterations exceeded (1000).  (in 'model1289467a0eb6_wolfelkcoyote' at line 39)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                         
[2] "  Exception: Max number of iterations exceeded (1000).  (in 'model1289467a0eb6_wolfelkcoyote' at line 39)"
error occurred during calling the sampler; sampling not done

Edit: I tried some other seeds just to see (1243, 124, 12, 1000) and got that output each time.


#7

What parameters should I restrict with <lower=0> ?


#8

I am sorry, I miss understand in the previous comments. When I built my original model, I also encountered the same errors. Then I add ,e.g., <lower= ,upper=> to ensure the compatibility with model and parameters. I am sorry I cannot fix it…


#9

It’s alright, it’s not your fault my code has errors right now.


#11

Oh!! I overlooked that there is a chain from your output. You already get one chain !! Congratulation !! So, what you need is only reduce your generating chains to only one chain !! Then you get estimates !! I think the default of the parameter chains is 4. So, what you need is to set the variable chains=1 with seed= 1234. So, good night !! That is, … you should extract the following from your above result.

Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 529.591 seconds (Warm-up)
Chain 1:                540.573 seconds (Sampling)
Chain 1:                1070.16 seconds (Total)
Chain 1:

#12

first off, I’m really excited to see that high school students using Stan - congratulations to both of you!

however, I’m sorry to say that the advice here is wrong - it is absolutely not OK to find a random seed which “works” and then use the output from a single chain to do estimation - you’re ignoring every sign, and there are lots of signs here, that your model is ill-specified with respect to the data.

here are some tutorials and case studies that you should read:

https://mc-stan.org/users/documentation/case-studies/rstan_workflow.html