Help with Poisson model

ignacio · February 25, 2018, 10:45am

I don’t see anything particularly strange when doing this. As I said before, max(y_hat) is not that high

y_hat ← as.data.frame(fit, pars=“y_hat”)
max(y_hat)
[1] 7.514862

syclik · February 25, 2018, 1:02pm

Can you check warm-up iterations too?

ignacio · February 25, 2018, 3:06pm

I’m not sure how to check the warm-up iterations. I tried this:

fit <- sampling(model_poisson, data = practices_dat,
                chains = 1, show_messages = T,
                cores=1,
                control = list(max_treedepth = 18),
                iter = iter,
                warmup = warmup,
                save_warmup=TRUE)
y_hat <- as.data.frame(extract(fit, pars="y_hat", inc_warmup = TRUE))
max(y_hat)
[1] 7.505011

bbbales2 · February 25, 2018, 3:23pm

Hmm, that should include them. Can you post an R script to run this example? Seems like something fishy is happening…

ignacio · February 25, 2018, 4:34pm

Sadly, this is one of those datasets that I cannot share :_( . I can definitely share the code that I’m running, in fact, that already in this thread. If it is helpful, I might be able to share a file with summary(fit) but I need to ask for permission first.

In case it helps I tried using y_hat to draw from a Poisson with R instead of stan:

library(rstan)
library(bayesplot)
fit <- sampling(model_poisson, data = practices_dat,
                chains = 1, show_messages = T,
                cores=1,
                control = list(max_treedepth = 18),
                iter = iter,
                warmup = warmup,
                save_warmup=TRUE)
y_hat <- as.data.frame(fit, pars="y_hat") 
poisson_rng <- function(lambda){
  lambda <- y_hat[,lambda]
  y <- sapply(exp(lambda), rpois,n=1)
  return(y)
}

y <- sapply(colnames(y_hat), poisson_rng)
y <- as.data.frame(y)

ppc_dens_overlay(y = bsdata$y,
                 yrep = as.matrix(y[sample(x = 1:nrow(y), size = 50),]))

bbbales2 · February 25, 2018, 4:41pm

It just seems impossible that y_hat is always less than 7.5 when you don’t have the poisson_log_rng in there, but then when you add the rng, the number gets bigger :/. Seems like a bug in Stan, and since I was the last person to touch the _rngs I’m a little unnerved :D.

ignacio · February 25, 2018, 4:49pm

Let me know if there is anything that I could do to help.

bbbales2 · February 25, 2018, 5:08pm

At this point we’d be looking to build a minimum reproducible example, with like generated data and stuff. I’ll try a couple things in the afternoon if I get a chance, but if you wanted to look at making one yourself that’d be super helpful.

ignacio · February 26, 2018, 4:55pm

I’m not sure if this is relevant, but just in case. I started playing with the loo package to try to compare different models. I added the following code to my generated quantities block:

generated quantities {
  //vector[I * T] y_sim;
  vector[I * T] log_lik;
  vector[I * T] y_hat = log(n) + x_beta + gamma[index_time] + a[practice] + b[practice] .* time;
 // for (it in 1:I * T) {
  //  y_sim[it] = poisson_log_rng(y_hat[it]);
 // }
 for(it in 1:I*T){
    log_lik[it] = poisson_lpmf(y[it] | log(n[it]) + x_beta[it] + gamma[index_time[it]] + a[practice[it]] + b[practice[it]] .* time[it]);
 }
}

I’m not sure if this is the correct way of implementing this or not, but once I added that code this error came back:

Stan model 'stan-1b301a6a38d0' does not contain samples.

andre.pfeuffer · February 26, 2018, 11:06pm

That should be on log level: poisson_log_lmpf.

Bob_Carpenter · February 27, 2018, 6:58am

the _log in the suffix describes the scale of the parameter, whereas the l in _lpdf indicates the return will be on the log scale.

ignacio · February 27, 2018, 2:15pm

Thanks @andre.pfeuffer, changing to poisson_log_lmpf made that problem go away. Which I guess indicates that the problem is in poisson_log_rng

shira · March 1, 2018, 4:16pm

I work with @ignacio and have access to the data and can verify that this is happening. Let me know what you think! Thanks for taking a look. :)

bbbales2 · March 1, 2018, 6:16pm

I had a look at this and didn’t get any unexpected behavior. Had a quick look at the C++ of the first vs. the second model and didn’t find anything weird either.

Any chance you could try this in the latest Cmdstan and see if the error exists there? If you’ve never used it before, clone it from git:

git clone --recursive https://github.com/stan-dev/cmdstan.git

cd into the directory and then make your model:

cd cmdstan
cp ../path/to/model.stan ./model.stan
make model

Notice on the make model you leave off the .stan. Run with something like:

./model sample data file=mydata.Rdata

You can generate mydata.Rdata for cmdstan from R with something like

stan_rdump(data, file = "mydata.Rdata")

Also, in Cmdstan at least, generated quantities doesn’t throw errors in warmup (https://github.com/stan-dev/stan/issues/2355#issuecomment-315166499 – I double checked this is true).

ignacio · March 1, 2018, 6:47pm

I’m struggling with stan_rdump

> stan_rdump(practices_dat, file = "mydata.Rdata")
Error in FUN(X[[i]], ...) : invalid first argument

where practices_dat is the list that I pass to the sampling function in R. That is:

fit2 <- sampling(poisson_bug, data = practices_dat,
                 chains = 4, show_messages = T,
                 cores= 4,
                 control = list(max_treedepth = 18),
                 iter = 3000,
                 warmup = 1500)

bbbales2 · March 1, 2018, 6:50pm

Hmm, is practices_dat a list? Check the stan_rdump doc: ?stan_rdump. I think this should work :/.

edit:

Something like:

practices_data = list(X = c(1, 2, 3), N = 3)

ignacio · March 1, 2018, 7:43pm

Yes, practices_dat is a list. I looked at ?stan_rdump and it looks like i should not use a list. It also looks like i should use .R instead of .Rdata (I tried `.Rdata and it did not work)

library(rstan)

# This does not work ------------------------------------------------------
practices_data = list(X = c(1, 2, 3), N = 3)
stan_rdump(practices_data, file = "mydata.Rdata")


# This works --------------------------------------------------------------
X = c(1, 2, 3); N = 3
stan_rdump(c('X', 'N'), file = "mydata.R")

So, I endup up running the following to create my data file for cmdstan

stan_rdump(c('I', 'T', 'y', 'n', 'K', 'x', 'practice', 'time', 'index_time'), file = "mydata.R")

Now i’m having troubles compiling the model.

$ make model
make/models:13: target `model' doesn't match the target pattern

--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/cvodes-3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe    -O3 -o model src/cmdstan/main.cpp -include  stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a
g++.exe: error: stan/lib/stan_math/lib/cvodes-3.1.0/lib/libsundials_cvodes.a: No such file or directory
make: *** [model] Error 1

bbbales2 · March 1, 2018, 7:47pm

Oh whoops, my apologies. That’s on me haha. That’s what I get for not double checking.

$ make model
make/models:13: target `model' doesn't match the target pattern

Is model.stan in the cmdstan folder? The error makes it sound like make isn’t finding the file. Like:

bbales2@frog:~/cmdstan-deletethis$ ls
bin       Jenkinsfile  make      model.stan ...

If it’s in another directory, you can do:

make ../path/to/the/model

But then the binary goes there I think.

ignacio · March 1, 2018, 7:51pm

It is:

MATHEMATICA+IMartinez@M116 MINGW64 /c/Users/IMartinez/Documents/BayesianSynthesis/deviants/bug/cmdstan (develop)
$ ls
examples/  Jenkinsfile  LICENSE  make/  makefile  model.stan  mydata.R  README.md  runCmdStanTests.py*  src/  stan/  test-all.sh*

in case is relevant, I’m running all this stuff on a Windows computer (I can do it on windows server 2008 or windows 10…)

bbbales2 · March 1, 2018, 7:55pm

Oh no, haha. I use Linux here, turns out the cmd is different for WIndows:

Try:

make model.exe

(from https://github.com/stan-dev/cmdstan/wiki/Getting-Started-with-CmdStan)

Topic		Replies	Views
How to implement a poisson count model into stan? Modeling rstan , fitting-issues , specification , poisson	1	485	January 25, 2023
Random Poisson variable is a negative but must be nonnegative! Modeling fitting-issues , poisson	6	400	February 13, 2024
Issues with modelling Poisson Data General fitting-issues	5	879	February 18, 2022
Looking for a stan program critique Modeling techniques	14	1666	April 17, 2018
Quasipoisson model using stan Modeling	2	1473	February 4, 2018

Help with Poisson model

Related topics