Issues with modelling Poisson Data

I am trying to run the following model

Y_i|\alpha,\beta, x_i \sim \text{Poisson}(x_i e^{\alpha+\beta z_i})

Where Y_i, x_i z_i are known observations and \alpha and \beta come from non-informative prior distributions.

I am still new to Stan and I am running into a syntax issue for running this regression.

My code is:

data {
  int<lower=0> n; 
  vector[n] y;
  vector[n] x;
  vector[n] z;
}


parameters {
  real beta; 
  real alpha;
}


model {
  // Non-informative priors
  alpha ~ normal(0,1000);
  beta ~ normal(0,1000);
  
  // Model
  y ~ poisson(x*exp(alpha+beta*z));
  
}

I get the following error:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for: 

  vector * vector

Available argument signatures for operator*:

  real * real
  vector * real
  row_vector * real
  matrix * real
  row_vector * vector
  vector * row_vector
  matrix * vector
  row_vector * matrix
  matrix * matrix
  real * vector
  real * row_vector
  real * matrix

Expression is ill formed.
 error in 'model1a08157f26f0_A2Q3' at line 21, column 33
  -------------------------------------------------
    19:   beta ~ normal(0,1000);
    20:   
    21:   y ~ poisson(x*exp(alpha+beta*z));
                                        ^
    22:   
  -------------------------------------------------

I tried troubleshooting it with some googling and looking around here but I haven’t found a solution yet. Any help or pointers is appreciated! Thank you!

It looks like you want to do element-wise multiplication? If so, you can change your model to:

y ~ poisson(x .* exp(alpha+beta*z));

I believe you can specify this in a more stable and efficient way (that exp is likely to cause issues with overflow).

Your likelihood can be equivalently specified as:

Y_i \sim \text{Poisson}(x_i e^{\alpha+\beta z_i}) \\ Y_i \sim \text{Poisson}(e^{\log(x_i)} e^{\alpha+\beta z_i}) \\ Y_i \sim \text{Poisson}(e^{\log(x_i) + \alpha+\beta z_i})

Stan has the built-in distribution poisson_log_glm which handles the exponentiation internally in a more numerically robust way, and is also specialised for regression models.

So your Stan likelihood would be specified as:

data {
  int<lower=0> n; 
  vector[n] y;
  vector[n] x;
  vector[n] z;
}

transformed data {
  row_vector[n] log_x = log(x)';
}


parameters {
  real beta; 
  real alpha;
}


model {
  // Non-informative priors
  alpha ~ normal(0,1000);
  beta ~ normal(0,1000);
  
  // Model
  y ~ poisson_log_glm(z, log_x + alpha, beta);
 
}

Additionally, your priors for alpha and beta are extremely wide, and may cause sampling issues

2 Likes

Hey there! :)

Two more suggestions:

  1. I think you can use poisson_log(log(x) + alpha + beta*z), which should be a bit faster. You can also pre-compute log(x) in the transformed data block, which would be even a bit more efficient.
  2. I’d suggest to use much tighter priors here. The linear predictor is on a log scale, so SD’s of 1000 are huuuuge. Personally, I’d say even SD’s of 10 are not even weakly informative.

Cheers,
Max

Edit: @andrjohns beat me to it with an even better suggestion! :D

1 Like

So I checked the syntax and I get the following error:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for: 

  vector ~ poisson_log_glm_lpmf(vector, row_vector, real)

Available argument signatures for poisson_log_glm_lpmf:

  int[ ] ~ poisson_log_glm_lpmf(matrix, real, vector)
  int[ ] ~ poisson_log_glm_lpmf(matrix, vector, vector)

Real return type required for probability function.
 error in 'model19a81acd2756_A2Q3' at line 25, column 46
  -------------------------------------------------
    23:   
    24:   // Model
    25:   y ~ poisson_log_glm(z, log_x + alpha, beta);
                                                     ^
    26:  
  -------------------------------------------------

Error in stanc(filename, allow_undefined = TRUE) : 
  failed to parse Stan model 'A2Q3' due to the above error.
In addition: Warning message:
In readLines(file, warn = TRUE) :
  incomplete final line found on 'D:/Math6635/A2Q3.stan'

Ah right, there aren’t as many signatures for it. Try this version:

data {
  int<lower=0> n; 
  int y[n];
  vector[n] x;
  matrix[1,n] z;
}

transformed data {
  vector[n] log_x = log(x);
}


parameters {
  vector[1] beta; 
  real alpha;
}


model {
  // Non-informative priors
  alpha ~ normal(0,1000);
  beta ~ normal(0,1000);
  
  // Model
  y ~ poisson_log_glm(z, log_x + alpha, beta);
 
}
1 Like