Let me know if there is anything that I could do to help.

At this point we’d be looking to build a minimum reproducible example, with like generated data and stuff. I’ll try a couple things in the afternoon if I get a chance, but if you wanted to look at making one yourself that’d be super helpful.

I’m not sure if this is relevant, but just in case. I started playing with the `loo`

package to try to compare different models. I added the following code to my generated quantities block:

```
generated quantities {
//vector[I * T] y_sim;
vector[I * T] log_lik;
vector[I * T] y_hat = log(n) + x_beta + gamma[index_time] + a[practice] + b[practice] .* time;
// for (it in 1:I * T) {
// y_sim[it] = poisson_log_rng(y_hat[it]);
// }
for(it in 1:I*T){
log_lik[it] = poisson_lpmf(y[it] | log(n[it]) + x_beta[it] + gamma[index_time[it]] + a[practice[it]] + b[practice[it]] .* time[it]);
}
}
```

I’m not sure if this is the correct way of implementing this or not, but once I added that code this error came back:

```
Stan model 'stan-1b301a6a38d0' does not contain samples.
```

That should be on log level: poisson_log_lmpf.

the `_log`

in the suffix describes the scale of the parameter, whereas the `l`

in `_lpdf`

indicates the return will be on the log scale.

Thanks @andre.pfeuffer, changing to `poisson_log_lmpf`

made that problem go away. Which I guess indicates that the problem is in `poisson_log_rng`

I work with @ignacio and have access to the data and can verify that this is happening. Let me know what you think! Thanks for taking a look. :)

I had a look at this and didn’t get any unexpected behavior. Had a quick look at the C++ of the first vs. the second model and didn’t find anything weird either.

Any chance you could try this in the latest Cmdstan and see if the error exists there? If you’ve never used it before, clone it from git:

```
git clone --recursive https://github.com/stan-dev/cmdstan.git
```

cd into the directory and then make your model:

```
cd cmdstan
cp ../path/to/model.stan ./model.stan
make model
```

Notice on the `make model`

you leave off the .stan. Run with something like:

```
./model sample data file=mydata.Rdata
```

You can generate mydata.Rdata for cmdstan from R with something like

```
stan_rdump(data, file = "mydata.Rdata")
```

Also, in Cmdstan at least, generated quantities doesn’t throw errors in warmup (https://github.com/stan-dev/stan/issues/2355#issuecomment-315166499 – I double checked this is true).

I’m struggling with `stan_rdump`

```
> stan_rdump(practices_dat, file = "mydata.Rdata")
Error in FUN(X[[i]], ...) : invalid first argument
```

where `practices_dat`

is the list that I pass to the `sampling`

function in R. That is:

```
fit2 <- sampling(poisson_bug, data = practices_dat,
chains = 4, show_messages = T,
cores= 4,
control = list(max_treedepth = 18),
iter = 3000,
warmup = 1500)
```

Hmm, is practices_dat a list? Check the stan_rdump doc: `?stan_rdump`

. I think this should work :/.

edit:

Something like:

```
practices_data = list(X = c(1, 2, 3), N = 3)
```

Yes, `practices_dat`

is a list. I looked at `?stan_rdump`

and it looks like i should not use a list. It also looks like i should use `.R`

instead of `.Rdata`

(I tried `.Rdata and it did not work)

```
library(rstan)
# This does not work ------------------------------------------------------
practices_data = list(X = c(1, 2, 3), N = 3)
stan_rdump(practices_data, file = "mydata.Rdata")
# This works --------------------------------------------------------------
X = c(1, 2, 3); N = 3
stan_rdump(c('X', 'N'), file = "mydata.R")
```

So, I endup up running the following to create my data file for `cmdstan`

`stan_rdump(c('I', 'T', 'y', 'n', 'K', 'x', 'practice', 'time', 'index_time'), file = "mydata.R")`

Now i’m having troubles compiling the model.

```
$ make model
make/models:13: target `model' doesn't match the target pattern
--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/cvodes-3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe -O3 -o model src/cmdstan/main.cpp -include stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a
g++.exe: error: stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a: No such file or directory
make: *** [model] Error 1
```

Oh whoops, my apologies. That’s on me haha. That’s what I get for not double checking.

```
$ make model
make/models:13: target `model' doesn't match the target pattern
```

Is model.stan in the cmdstan folder? The error makes it sound like make isn’t finding the file. Like:

```
bbales2@frog:~/cmdstan-deletethis$ ls
bin Jenkinsfile make model.stan ...
```

If it’s in another directory, you can do:

```
make ../path/to/the/model
```

But then the binary goes there I think.

It is:

```
MATHEMATICA+IMartinez@M116 MINGW64 /c/Users/IMartinez/Documents/BayesianSynthesis/deviants/bug/cmdstan (develop)
$ ls
examples/ Jenkinsfile LICENSE make/ makefile model.stan mydata.R README.md runCmdStanTests.py* src/ stan/ test-all.sh*
```

in case is relevant, I’m running all this stuff on a Windows computer (I can do it on windows server 2008 or windows 10…)

Oh no, haha. I use Linux here, turns out the cmd is different for WIndows:

Try:

`make model.exe`

(from https://github.com/stan-dev/cmdstan/wiki/Getting-Started-with-CmdStan)

Very jealous…

```
MATHEMATICA+IMartinez@M116 MINGW64 /c/Users/IMartinez/Documents/BayesianSynthesis/deviants/bug/cmdstan (develop)
$ make model.exe
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/cvodes-3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe -c -O3 -o bin/cmdstan/stanc.o src/cmdstan/stanc.cpp
cc1plus.exe: error: unrecognized command line option '-std=c++1y'
cc1plus.exe: warning: unrecognized command line option "-Wno-unused-local-typedefs" [enabled by default]
make: *** [bin/cmdstan/stanc.o] Error 1
```

Yegads, well, if you want to figure out how to fix that C++ there’s a very small chance that we’d see an interesting behavior :P. Long shot, but do you have the latest Rstan?

I’m honestly stumped on this. Without the specific data/model causing the problem, it’s hard to do anything other than make undirected guesses.

If it’s absolutely commenting and uncommenting this that causes the problem:

```
for (it in 1:I * T) {
y_sim[it] = poisson_log_rng(y_hat[it]);
}
```

then I’m stumped! It doesn’t look like you’re using any fixed seeds or anything?

(The c++ error sounds like your g++ is too old, but I dunno how to upgrade that in Windows)

That is what causes the problem. I know is weird, that is why I asked @shira to check.

My hope is that next week I will be able to share some code and data to reproduce this problem. Thanks a lot for all your help @bbbales2!

In another thread I wrote Recommendations for what to do when k exceeds 0.5 in the loo package?

that I think this is a problem with initialization. I copy that comment here and add a bit more in the end.

see `?stan`

and option `seed`

. Try runinng `stan(...., seed=1337)`

with different seed values.

see `?stan`

and option `init`

. The default is to randomly generate initial values between -2 and 2 on the unconstrained support. If you transform -2 and 2 from unconstrained to constrained space, are these initial values sensible?

If you are unlucky initial values are all 2, and if n=1 and x=1, then

`log(n) + x*2 + 2 + exp(2)*2 + epx(2)*2*4`

is about 79 which when exponentiated to Poisson parameter is about 2e34, which is a lot. You probably need to initialize log(a_std) and log(b_std) with much smaller range than [-2, 2] or scale them in Stan code by dividing them with a big number. For example, try dividing ‘a’ and ‘b’ by 100. Then maximum Poisson parameter value (with n=1 and x=1) is about 311, which should work fine.

Since initial values are randomly chosen, one chain can work, but running several can fail. Adding _rng to generated quantities changes the behavior as _rng is called between sampling iterations and it changes the state of the random number generator.

I recommend to test with different seeds, and with more specific inits or scaled a and b. For checking the effect of reasnoable inits/scaling ,you could add a print statement in Stan code which would print the value of

`log(n) + x_beta + gamma[index_time] + a[practice] + b[practice] .* time`

I’ve seen this same problem before (I just didn’t realize that this one is likely to be the same), and the common thing is a distribution with positive constrained parameter which is a function of positive constrained parameters so that effectively there is something like `exp(a + b*exp(c))`

and if `b`

and `c`

are both initialized to be close to 2, then that expression has very high value.

Here are the data and code that can reproduce this problem.

Doing this does indeed solve the problem:

```
transformed parameters {
vector[I] a;
vector[I] b;
a = (sigma_a * a_std)/100 ; // Matt trick
b = (sigma_b * b_std)/100 ; // Matt trick
}
```

Thanks everyone!