I’m running into some weird behaviour when fitting a relatively simple truncated normal distribution to a large dataset (n = 87154). I wonder whether this is related to issue #2375 on the stan-dev GitHub (https://github.com/stan-dev/stan/issues/2375), because of the zero lower bound.

Here is a stripped-down version of the model:

`foo_model <- "
data {
  int<lower=0> Nsites;
  int<lower=0> N;
  real<lower=0> y[N];
  int site[N];
  int month[N];
  int day[N];
}
parameters {
  real betaINT[Nsites];
  matrix[Nsites,12] betaSITEMTH;
  matrix[Nsites,366] betaSITEDAY;
  real<lower=0> varcomp1[Nsites];
  real<lower=0> varcomp2[Nsites];
  real<lower=0> varcomp3[Nsites];
  real<lower=0> int_sd;
  real int_mu;
}
transformed parameters {
  real mu[N];
  for (i in 1:N)
    mu[i] = int_mu + int_sd*betaINT[site[i]]
            + betaSITEDAY[site[i],day[i]]*varcomp1[site[i]]
            + betaSITEMTH[site[i],month[i]]*varcomp2[site[i]];
}
model {
  varcomp1 ~ normal(0,1);
  varcomp2 ~ normal(0,1);
  varcomp3 ~ normal(0,1);
  int_mu ~ normal(0,1);
  int_sd ~ normal(0,1);
  betaINT ~ normal(0,1);
  to_vector(betaSITEMTH) ~ normal(0,1);
  to_vector(betaSITEDAY) ~ normal(0,1);
  for (i in 1:N)
    y[i] ~ normal(mu[i], varcomp3[site[i]]) T[0,];
}
"`

I can fit the above model with the ‘betaSITEDAY’ term in mu commented out, or with the betaSITEDAY term included but without the truncation and the lower bound on y. However, when the model is fitted exactly as shown above, stepsize__ shrinks to very small values (e.g. 1e-9) and the chains effectively get stuck within about 30 iterations.
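
In case it helps with diagnosis, here is my understanding of what the `T[0,]` truncation adds to each observation's log density: the usual normal log density minus log P(Y > 0 | mu, sigma). This is only a sketch of the math in plain R, not Stan's internal implementation, and the function names are ones I made up for illustration. I don't know whether this term is what causes the problem, but it does show how the normalising constant can blow up if mu wanders far below zero relative to sigma.

```r
# Log density of y ~ normal(mu, sigma) truncated below at 0, for y >= 0.
# Hypothetical helper, written only to illustrate the math.
trunc_normal_lpdf <- function(y, mu, sigma) {
  # log P(Y > 0), computed directly on the log scale via the upper tail
  log_upper_tail <- pnorm(0, mean = mu, sd = sigma,
                          lower.tail = FALSE, log.p = TRUE)
  dnorm(y, mean = mu, sd = sigma, log = TRUE) - log_upper_tail
}

# Naive version of the same normalising constant, for comparison.
naive_log_upper_tail <- function(mu, sigma) {
  log(1 - pnorm(0, mean = mu, sd = sigma))
}

# Ordinary case: both approaches agree.
stable_ok <- pnorm(0, mean = 1, sd = 1, lower.tail = FALSE, log.p = TRUE)
naive_ok  <- naive_log_upper_tail(mu = 1, sigma = 1)

# Extreme case: mu is 40 sd below the truncation point.
stable_far <- pnorm(0, mean = -40, sd = 1, lower.tail = FALSE, log.p = TRUE)
naive_far  <- naive_log_upper_tail(mu = -40, sigma = 1)  # 1 - 1 = 0, so log(0)

print(c(stable_ok = stable_ok, naive_ok = naive_ok))
print(c(stable_far = stable_far, naive_far = naive_far))
```

In the extreme case the naive constant is `-Inf` while the log-scale version stays finite but very large in magnitude, so the truncated log density changes very steeply in that region.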

I have tried changing the initial values and increasing adapt_delta to 0.9999 and max_treedepth to 15. Those sampler settings might work, but they slow everything down far too much to be practical because I’m hitting max_treedepth.

Is there something obvious that I’m missing? The only additional solution that I could think of was to turn off adaptation and fix the stepsize to something from before it goes small, but this seems like a bad idea… I’m on RStan 2.16.2 in R 3.4.2. Data are available here.
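
For concreteness, the two sampler configurations I mean look roughly like this. It is a sketch rather than my actual script: `fit_foo` and its `stan_data` argument are placeholder names, the chain and iteration counts are arbitrary, and the fixed stepsize of 0.01 is just an illustrative value, not one I have checked.

```r
# Settings I have already tried: very high adapt_delta, deeper trees.
tried_control <- list(adapt_delta = 0.9999, max_treedepth = 15)

# The "bad idea": switch adaptation off and fix the stepsize by hand.
# 0.01 is a placeholder, not a value taken from my runs.
fixed_step_control <- list(adapt_engaged = FALSE, stepsize = 0.01)

# Hypothetical wrapper; stan_data is a list matching the model's data block
# (Nsites, N, y, site, month, day) and model_code is the foo_model string.
fit_foo <- function(stan_data, model_code, control = tried_control) {
  rstan::stan(
    model_code = model_code,
    data = stan_data,
    chains = 4,      # arbitrary choice for the sketch
    iter = 2000,     # arbitrary choice for the sketch
    control = control
  )
}
```

Calling `fit_foo(my_data, foo_model)` would use the first configuration, and passing `control = fixed_step_control` would use the second.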

Thanks for any thoughts you can offer.

Andrew