Poisson regression error

Hi all,

Can someone tell me what this error (at the bottom) is about. The error continues until the program seizes up.

Thanks,

David


title: “Models for Count Data”
subtitle: “EP 763: Regression Models in Education”
author: “David Kaplan”
output: pdf_document

knitr::opts_chunk$set(echo = TRUE)
library(rstan)
library(bayesplot)

Poisson Regression

School administrators study the attendance behavior of high school juniors
at two schools. Predictors of the number of days of absence include gender
of the student and standardized test scores in math and language arts.

poissonex <- read.csv("~/desktop/poissonreg.csv",header=T)
data.list <- with(poissonex, list(daysabs=daysabs, math=math, langarts=langarts, male=male, N = nrow(poissonex)))

Plot histograms of daily absentee data

attach(poissonex)
hist(daysabs,breaks=150,xlim=c(0,50), col="gray")
modelString = "
data {
     int<lower=0> N;
     real math[N];
     real langarts[N];                      
     real male[N];
     int<lower=0>  daysabs[N];             
}             
 
parameters {                          // Cute trick to give all of the parameters the same prior
     vector[4] b;
}
 
transformed parameters {
     real mu[N];
          for (n in 1:N)
          mu[n] = exp(b[1] + b[2]*math[n] + b[3]*langarts[n] +     
                  b[4]*male[n]); 
} 
 
model {
     for (n in 1:N)
          daysabs[n] ~ poisson(mu[n]);       
          b ~ normal(0, 1); 
}         
 
generated quantities {                 
     int<lower=0> daysabs_pred[N];
     real log_lik[N];
     for (n in 1:N) {
         daysabs_pred[n] = poisson_rng(mu[n]);
          log_lik[n] = poisson_lpmf(daysabs[n] | mu[n]);
     }
}
"

Start estimation

nChains = 2
nIter= 3000
thinSteps = 10
burnInSteps = floor(nIter/2)

daysabs = data.list$daysabs

PoissRegfit = stan(data=data.list,model_code=modelString,chains=nChains,
              iter=nIter,warmup=burnInSteps,thin=thinSteps)

SAMPLING FOR MODEL ‘89098a1c1b53b80a32e60b991eef32bb’ NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 8.7e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 3000 [ 0%] (Warmup)
Chain 1: Exception: poisson_rng: Rate parameter is 9.73758e+38, but must be less than 1.07374e+09 (in ‘model1fba7e50c325_89098a1c1b53b80a32e60b991eef32bb’ at line 29)

The error:

Exception: poisson_rng: Rate parameter is 9.73758e+38, but must be less than 1.07374e+09

Is saying that your mu parameter is a massively large number (9.738 * 10^{38}), and larger than the poisson_rng function is able to generate numbers for. It could be that your covariate values are too large to be exponentiating, and you may need to scale them first.

An alternative is to use the poisson_log distribution and rng functions, which do not require exponentiating mu: https://mc-stan.org/docs/2_24/functions-reference/poisson-distribution-log-parameterization.html.

Additionally, you may also want to use the poisson_log_glm distribution, which will be more efficient (and can also be GPU-accelerated if you have large enough data): https://mc-stan.org/docs/2_24/functions-reference/poisson-log-glm.html

More a cosmetic recommendation than anything, you can also simplify your model by using Stan’s vectorisation:

data {
  int<lower=0> N;
  vector[N] math;
  vector[N] langarts;
  vector[N] male;
  int<lower=0>  daysabs[N];
}

parameters {
  vector[4] b;
}
 
transformed parameters {
 vector[N] mu = exp(b[1] + b[2]*math + b[3]*langarts + b[4]*male);
} 

model {
  b ~ std_normal();
  daysabs ~ poisson(mu);
}

generated quantities {
  int<lower=0> daysabs_pred[N] = poisson_rng(mu);
  vector[N] log_lik;

  for (n in 1:N) {
    log_lik[n] = poisson_lpmf(daysabs[n] | mu[n]);
  }
}

Thanks! I made the following changes: I scaled two predictors that were large and I removed observations that had over 10 days absent (this is just a demo and not a real analysis).

poissonex <- read.csv("~/desktop/poissonreg.csv",header=T)
attach(poissonex)
poissonex <- subset(poissonex, daysabs < 10)
math = math/100
langarts = langarts/100
data.list <- with(poissonex, list(daysabs=daysabs, math=math, langarts=langarts, male=male, N = nrow(poissonex)))

Plot histograms of daily absentee data

attach(poissonex)
hist(daysabs,breaks=150,xlim=c(0,50), col="gray")
modelString = "
data {
     int<lower=0> N;
     real math[N];
     real langarts[N];                      
     real male[N];
     int<lower=0>  daysabs[N];             
}             
 
parameters {                          // Cute trick to give all of the parameters the same prior
     vector[4] b;
}
 
transformed parameters {
     real mu[N];
          for (n in 1:N)
          mu[n] = b[1] + b[2]*math[n] + b[3]*langarts[n] +     
                  b[4]*male[n]; 
} 
 
model {
     for (n in 1:N)
          daysabs[n] ~ poisson_log(mu[n]);       
          b ~ normal(0, 1); 
}         
 
generated quantities {                 
     int<lower=0> daysabs_pred[N];
     real log_lik[N];
     for (n in 1:N) {
         daysabs_pred[n] = poisson_log_rng(mu[n]);
          log_lik[n] = poisson_log_lpmf(daysabs[n] | mu[n]);
     }
}
"

The algorithm goes through the first two chains fine but then for the chain 3 i get

SAMPLING FOR MODEL ‘c50f98cb2f34cf63786804b61abc5660’ NOW (CHAIN 3).
Chain 3:
Chain 3: Gradient evaluation took 5.3e-05 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.53 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3:
Chain 3:
Chain 3: Iteration: 1 / 30000 [ 0%] (Warmup)
Chain 3: Exception: poisson_log_rng: Log rate parameter is 34.8941, but must be less than 20.7944 (in ‘model43b219315bf_c50f98cb2f34cf63786804b61abc5660’ at line 31)

Chain 3: Exception: poisson_log_rng: Log rate parameter is 34.8941, but must be less than 20.7944 (in ‘model43b219315bf_c50f98cb2f34cf63786804b61abc5660’ at line 31)

It is overflowing again (but on the log scale). Initialising with init = 0 or init_r = 0.01 should help (put these options inside of your call to stan()).

That did the trick. I could even go back and keep everything in the original scale. Thanks.

1 Like