I don’t see anything particularly strange when doing this. As I said before, max(y_hat)
is not that high
y_hat ← as.data.frame(fit, pars=“y_hat”)
max(y_hat)
[1] 7.514862
I don’t see anything particularly strange when doing this. As I said before, max(y_hat)
is not that high
y_hat ← as.data.frame(fit, pars=“y_hat”)
max(y_hat)
[1] 7.514862
Can you check warm-up iterations too?
I’m not sure how to check the warm-up iterations. I tried this:
fit <- sampling(model_poisson, data = practices_dat,
chains = 1, show_messages = T,
cores=1,
control = list(max_treedepth = 18),
iter = iter,
warmup = warmup,
save_warmup=TRUE)
y_hat <- as.data.frame(extract(fit, pars="y_hat", inc_warmup = TRUE))
max(y_hat)
[1] 7.505011
Hmm, that should include them. Can you post an R script to run this example? Seems like something fishy is happening…
Sadly, this is one of those datasets that I cannot share :_( . I can definitely share the code that I’m running, in fact, that already in this thread. If it is helpful, I might be able to share a file with summary(fit)
but I need to ask for permission first.
In case it helps I tried using y_hat
to draw from a Poisson with R instead of stan:
library(rstan)
library(bayesplot)
fit <- sampling(model_poisson, data = practices_dat,
chains = 1, show_messages = T,
cores=1,
control = list(max_treedepth = 18),
iter = iter,
warmup = warmup,
save_warmup=TRUE)
y_hat <- as.data.frame(fit, pars="y_hat")
poisson_rng <- function(lambda){
lambda <- y_hat[,lambda]
y <- sapply(exp(lambda), rpois,n=1)
return(y)
}
y <- sapply(colnames(y_hat), poisson_rng)
y <- as.data.frame(y)
ppc_dens_overlay(y = bsdata$y,
yrep = as.matrix(y[sample(x = 1:nrow(y), size = 50),]))
It just seems impossible that y_hat is always less than 7.5 when you don’t have the poisson_log_rng in there, but then when you add the rng, the number gets bigger :/. Seems like a bug in Stan, and since I was the last person to touch the _rngs I’m a little unnerved :D.
Let me know if there is anything that I could do to help.
At this point we’d be looking to build a minimum reproducible example, with like generated data and stuff. I’ll try a couple things in the afternoon if I get a chance, but if you wanted to look at making one yourself that’d be super helpful.
I’m not sure if this is relevant, but just in case. I started playing with the loo
package to try to compare different models. I added the following code to my generated quantities block:
generated quantities {
//vector[I * T] y_sim;
vector[I * T] log_lik;
vector[I * T] y_hat = log(n) + x_beta + gamma[index_time] + a[practice] + b[practice] .* time;
// for (it in 1:I * T) {
// y_sim[it] = poisson_log_rng(y_hat[it]);
// }
for(it in 1:I*T){
log_lik[it] = poisson_lpmf(y[it] | log(n[it]) + x_beta[it] + gamma[index_time[it]] + a[practice[it]] + b[practice[it]] .* time[it]);
}
}
I’m not sure if this is the correct way of implementing this or not, but once I added that code this error came back:
Stan model 'stan-1b301a6a38d0' does not contain samples.
That should be on log level: poisson_log_lmpf.
the _log
in the suffix describes the scale of the parameter, whereas the l
in _lpdf
indicates the return will be on the log scale.
Thanks @andre.pfeuffer, changing to poisson_log_lmpf
made that problem go away. Which I guess indicates that the problem is in poisson_log_rng
I work with @ignacio and have access to the data and can verify that this is happening. Let me know what you think! Thanks for taking a look. :)
I had a look at this and didn’t get any unexpected behavior. Had a quick look at the C++ of the first vs. the second model and didn’t find anything weird either.
Any chance you could try this in the latest Cmdstan and see if the error exists there? If you’ve never used it before, clone it from git:
git clone --recursive https://github.com/stan-dev/cmdstan.git
cd into the directory and then make your model:
cd cmdstan
cp ../path/to/model.stan ./model.stan
make model
Notice on the make model
you leave off the .stan. Run with something like:
./model sample data file=mydata.Rdata
You can generate mydata.Rdata for cmdstan from R with something like
stan_rdump(data, file = "mydata.Rdata")
Also, in Cmdstan at least, generated quantities doesn’t throw errors in warmup (https://github.com/stan-dev/stan/issues/2355#issuecomment-315166499 – I double checked this is true).
I’m struggling with stan_rdump
> stan_rdump(practices_dat, file = "mydata.Rdata")
Error in FUN(X[[i]], ...) : invalid first argument
where practices_dat
is the list that I pass to the sampling
function in R. That is:
fit2 <- sampling(poisson_bug, data = practices_dat,
chains = 4, show_messages = T,
cores= 4,
control = list(max_treedepth = 18),
iter = 3000,
warmup = 1500)
Hmm, is practices_dat a list? Check the stan_rdump doc: ?stan_rdump
. I think this should work :/.
edit:
Something like:
practices_data = list(X = c(1, 2, 3), N = 3)
Yes, practices_dat
is a list. I looked at ?stan_rdump
and it looks like i should not use a list. It also looks like i should use .R
instead of .Rdata
(I tried `.Rdata and it did not work)
library(rstan)
# This does not work ------------------------------------------------------
practices_data = list(X = c(1, 2, 3), N = 3)
stan_rdump(practices_data, file = "mydata.Rdata")
# This works --------------------------------------------------------------
X = c(1, 2, 3); N = 3
stan_rdump(c('X', 'N'), file = "mydata.R")
So, I endup up running the following to create my data file for cmdstan
stan_rdump(c('I', 'T', 'y', 'n', 'K', 'x', 'practice', 'time', 'index_time'), file = "mydata.R")
Now i’m having troubles compiling the model.
$ make model
make/models:13: target `model' doesn't match the target pattern
--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/cvodes-3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe -O3 -o model src/cmdstan/main.cpp -include stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a
g++.exe: error: stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a: No such file or directory
make: *** [model] Error 1
Oh whoops, my apologies. That’s on me haha. That’s what I get for not double checking.
$ make model
make/models:13: target `model' doesn't match the target pattern
Is model.stan in the cmdstan folder? The error makes it sound like make isn’t finding the file. Like:
bbales2@frog:~/cmdstan-deletethis$ ls
bin Jenkinsfile make model.stan ...
If it’s in another directory, you can do:
make ../path/to/the/model
But then the binary goes there I think.
It is:
MATHEMATICA+IMartinez@M116 MINGW64 /c/Users/IMartinez/Documents/BayesianSynthesis/deviants/bug/cmdstan (develop)
$ ls
examples/ Jenkinsfile LICENSE make/ makefile model.stan mydata.R README.md runCmdStanTests.py* src/ stan/ test-all.sh*
in case is relevant, I’m running all this stuff on a Windows computer (I can do it on windows server 2008 or windows 10…)
Oh no, haha. I use Linux here, turns out the cmd is different for WIndows:
Try:
make model.exe
(from https://github.com/stan-dev/cmdstan/wiki/Getting-Started-with-CmdStan)