Help with Poisson model

Very jealous…

MATHEMATICA+IMartinez@M116 MINGW64 /c/Users/IMartinez/Documents/BayesianSynthesis/deviants/bug/cmdstan (develop)
$ make model.exe
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/cvodes-3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe  -c -O3 -o bin/cmdstan/stanc.o src/cmdstan/stanc.cpp
cc1plus.exe: error: unrecognized command line option '-std=c++1y'
cc1plus.exe: warning: unrecognized command line option "-Wno-unused-local-typedefs" [enabled by default]
make: *** [bin/cmdstan/stanc.o] Error 1

Yegads, well, if you want to figure out how to fix that C++ error there’s a very small chance that we’d see some interesting behavior :P. Long shot, but do you have the latest RStan?

I’m honestly stumped on this. Without the specific data/model causing the problem, it’s hard to do anything other than make undirected guesses.

If it really is commenting and uncommenting this block that causes the problem:

for (it in 1:I * T) {
  y_sim[it] = poisson_log_rng(y_hat[it]);
}

then I’m stumped! It doesn’t look like you’re using any fixed seeds or anything?

(The C++ error sounds like your g++ is too old, but I dunno how to upgrade that on Windows)

That is what causes the problem. I know it is weird, which is why I asked @shira to check.

My hope is that next week I will be able to share some code and data to reproduce this problem. Thanks a lot for all your help @bbbales2!

In another thread (Recommendations for what to do when k exceeds 0.5 in the loo package? - #11 by avehtari) I wrote that I think this is a problem with initialization. I copy that comment here and add a bit more at the end.

See ?stan and the seed option. Try running stan(..., seed=1337) with different seed values.

See ?stan and the init option. The default is to randomly generate initial values between -2 and 2 on the unconstrained support. If you transform -2 and 2 from unconstrained to constrained space, are these initial values sensible?
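To make that check concrete, here is a minimal Python sketch of what the endpoints -2 and 2 become on the constrained scale. It assumes the usual transforms (exp for a parameter with lower bound 0, inverse logit for a parameter bounded to (0, 1)); the function names are made up for illustration and are not part of any Stan interface.

```python
import math

def positive_constrain(u):
    """Map an unconstrained value to (0, inf), as for a lower=0 parameter."""
    return math.exp(u)

def unit_interval_constrain(u):
    """Map an unconstrained value to (0, 1) with the inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))

INIT_RADIUS = 2.0  # default range of random inits on the unconstrained scale

for u in (-INIT_RADIUS, INIT_RADIUS):
    print(
        f"unconstrained {u:+.1f} -> "
        f"positive {positive_constrain(u):.3f}, "
        f"unit interval {unit_interval_constrain(u):.3f}"
    )
```

A scale parameter can therefore start anywhere from roughly 0.14 to 7.4, which is worth comparing with the scale of your data before trusting the default inits.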

If you are unlucky and the initial values are all 2, and if n=1 and x=1, then
log(n) + x*2 + 2 + exp(2)*2 + exp(2)*2*4
is about 79, which, when exponentiated to give the Poisson parameter, is about 2e34. That is far too large. You probably need to initialize log(a_std) and log(b_std) with a much smaller range than [-2, 2], or scale them in the Stan code by dividing them by a big number. For example, try dividing ‘a’ and ‘b’ by 100. Then the maximum Poisson parameter value (with n=1 and x=1) is about 311, which should work fine.
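The same arithmetic can be sketched in Python to see how much the scaling helps. This is only an illustration under assumptions: every unconstrained parameter is initialized at 2, the scale parameters are therefore exp(2), the trailing factor 4 is read as a time value, and the function name and arguments are hypothetical.

```python
import math

def worst_case_log_rate(n=1.0, x=1.0, init=2.0, time=4.0, scale=1.0):
    """Linear predictor on the log scale when every init hits `init`.

    `scale` is the number that a and b are divided by (1.0 = no scaling).
    """
    sigma = math.exp(init)        # positive-constrained scale, started at exp(init)
    a = sigma * init / scale      # a = sigma_a * a_std / scale
    b = sigma * init / scale      # b = sigma_b * b_std / scale
    return math.log(n) + x * init + init + a + b * time

unscaled = worst_case_log_rate()
scaled = worst_case_log_rate(scale=100.0)

print(f"unscaled: log rate {unscaled:.2f}, Poisson rate {math.exp(unscaled):.3g}")
print(f"scaled:   log rate {scaled:.2f}, Poisson rate {math.exp(scaled):.3g}")
```

Evaluated literally under these assumptions, the unscaled log rate comes out close to 78 and the scaled Poisson rate on the order of 100. The 79 and 311 quoted in the thread are in the same ballpark, and any gap presumably comes from terms not spelled out in the expression. Either way, the unscaled rate is dozens of orders of magnitude beyond anything a Poisson likelihood can handle sensibly.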

Since initial values are randomly chosen, one chain can work, but running several can fail. Adding _rng to generated quantities changes the behavior as _rng is called between sampling iterations and it changes the state of the random number generator.
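A toy Python sketch of that last point: if extra random draws are interleaved with the sampler's own draws, every later draw shifts even though the seed is identical. This uses Python's random module purely as a stand-in; it does not reproduce Stan's actual generator or sampler, and run_chain is a hypothetical helper.

```python
import random

def run_chain(seed, n_iter, draw_generated_quantities):
    """Return the 'sampler' draws of a toy chain driven by one shared RNG."""
    rng = random.Random(seed)
    sampler_draws = []
    for _ in range(n_iter):
        # Stand-in for the randomness the sampler itself consumes.
        sampler_draws.append(rng.random())
        if draw_generated_quantities:
            # Stand-in for an _rng call in generated quantities:
            # it advances the shared RNG state between iterations.
            rng.random()
    return sampler_draws

without_rng = run_chain(seed=1337, n_iter=5, draw_generated_quantities=False)
with_rng = run_chain(seed=1337, n_iter=5, draw_generated_quantities=True)

print("first draw identical:", without_rng[0] == with_rng[0])
print("later draws identical:", without_rng[1:] == with_rng[1:])
```

With the same seed, the first draw matches but the rest of the sequence does not, which is why a model can appear to break or recover when an _rng line is commented in or out.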

I recommend testing with different seeds, and with more specific inits or scaled a and b. To check the effect of reasonable inits/scaling, you could add a print statement in the Stan code that prints the value of
log(n) + x_beta + gamma[index_time] + a[practice] + b[practice] .* time

I’ve seen this same problem before (I just didn’t realize that this one is likely to be the same). The common factor is a distribution with a positive-constrained parameter that is itself a function of positive-constrained parameters, so that effectively there is something like exp(a + b*exp(c)). If b and c are both initialized close to 2, that expression takes a very high value.

Here are the data and code that can reproduce this problem.

Doing this does indeed solve the problem:

transformed parameters {
  vector[I] a; 
  vector[I] b; 
  a = (sigma_a * a_std)/100 ;            // Matt trick
  b = (sigma_b * b_std)/100 ;            // Matt trick
}

Thanks everyone!