Modeling a hierarchical poisson model

yanmin · November 25, 2024, 3:49am

i want to fitting a model developed by Zainab Al-kaabawi , Yinghui Wei & Rana Moyeed(Bayesian hierarchical models for linear networks,Journal of Applied Statistics).
Write a model: counts y1,…,yn distributed according to a Poisson distribution with mean Lλ, in terms of the logarithm of the mean,θ
= logλ, Complete the model by assigning a
N(μ,σ) prior to the log mean parameter θ. Now, i have two variables:death denoting death cases due to accident in a province, and length denoting the length of roads in a province.
my Stan is as follows,

data {
  int<lower=0> N;
  vector[N] length;
  vector[N] death;
}

parameters {
  vector[N] lambda; 
}
model {
  lambda ~ normal(0,100);
   death ~ poisson(exp(lambda)*length);
}
"

the result show that death ~ poisson(exp(lambda)*length) is wrong.
The message is “Ill-typed arguments supplied to infix operator *”

how to resolve the issue? thank you in advance.

sonicking · November 25, 2024, 11:05am

Make it a loop:

For (n in 1:N) death[n] ~ poisson(exp(lambda[i]*length[i]))

yanmin · November 25, 2024, 12:05pm

It is a pity that your advice does not work.
my codes are as follows,


death <- c(1001,  927, 2181, 2226,  784, 1805,
           1482,  819,  811, 3645, 2678, 2356,
           1594, 1474, 3009, 2652, 4287, 3270,
           7246, 2677,  604,  936, 2273, 2655,
           1908,  278,  917, 1307,  463,  447,1316)
length <- c(22363,  15230, 209209, 145469,
            216176, 131065, 109761, 168958,
            13005, 158000, 122918, 237967,
            112878, 210711, 291759, 277482,
            302178, 242420, 223081, 172391,
            41687, 186137, 405390, 209617,
            316091, 120852, 185607, 157243,
            87726,  38347, 223118)

library(rstan)

accident <- "
data {
  int<lower=0> N;
  vector[N] length;
  vector[N] death;
}

parameters {
  vector[N] lambda; 
}
model {
  vector[N] theta;
  lambda ~ normal(0,100);
  for (n in 1:N) {
  theta[n] = exp(lambda[n])*length[n];
  death[n] ~ poisson(theta[n]);
  }
}
"
write(accident,file = 'accident.stan')

data_for_Stan <- list(N = length(death),
                      length = length, 
                      death = death)

fit <- stan(file = "accident.stan", data = data_for_Stan)
summary(fit)$summary

can you run my codes? thank you in advance.

sonicking · November 25, 2024, 12:15pm

What do you mean by the code doesn’t work? Did you see another error message from compiling?

On a different note: I don’t know anything about your model or the reference you gave, but it is over-parameterized in my opinion. I don’t think there is enough data in this setup to learn about each lambda.

yanmin · November 25, 2024, 12:29pm

thank your reply.
The two-level hierarchical model for traffic accidents is given below,
death_i ~ poisson(length_i * theta_i)
log(theta_i) ~ N(alpha,tau^2)
i think two columns,death and length, is enough to estimate the model. Is right my understanding?

kaskogsholm · November 25, 2024, 4:59pm

I think the problem is with the product operator… * is the operator for a matrix multiplication, and .* is the operator for element-wise multiplication. See this section of the manual.

sonicking · November 25, 2024, 6:47pm

Technically no, because the input for poisson should be reals.

sonicking · November 25, 2024, 6:50pm

On the modeling front, the data you supplied are sufficient only for a single lambda to be estimated.

You need more data per each subject to allow differentiated lambdas. You are not providing data or different priors for the differentiation.

kaskogsholm · November 25, 2024, 8:28pm

reals is a pseudotype that includes a vector of reals or a scalar real, so I’m not sure that I understand your comment. See this section of the user manual.

On the modeling front, it was my understanding that there is no minimum sample size in Bayesian computation. There are perfectly valid Bayesian models where there are more parameters than observations.

The posterior distributions for each lambda may be different or very similar depending on the values of death and length, If every value of the length vector is identical, and the values of the death vector differ by 5 orders of magnitude, you will probably learn something interesting about the lambdas. Even if the lambda posteriors turned out to be identical, you’re still getting valid information. Since the title of the post mentions a hieracrchical model, I’m assuming that eventually, yanmin will make each lambda be a draw from some lambda distribution:

model {
   // insert priors for lambda_mean, sigma
   lambda ~ normal(lambda_mean, sigma);
   death ~ poisson(exp(lambda)*length);
}

In which case, if all the lambdas are very close together, that will show up as a small sigma parameter, and we might come to the conclusion that length of roads is the most important variable explaining the variance between province accident rates.

ssp3nc3r · November 26, 2024, 3:14pm

data {
  int<lower=0> N;
  vector[N] length;
  array[N] int death;
}

parameters {
  vector[N] lambda; 
}
model {
  lambda ~ normal(0,100);
   death ~ poisson(exp(lambda) .* length);
}

As others mentioned, you need element-wise multiplication .* between two vectors here, and the signature for a Poisson requires ints: Unbounded Discrete Distributions so: array[N] int death;

I don’t know what your intentions of the model are, but this will address your syntax errors.

yanmin · November 27, 2024, 7:30am

thank you very much. your codes run well.

yanmin · December 2, 2024, 12:57pm

I found other solution to my question. The code is as follows,

data {
  int<lower=0> N;
  vector[N] length;
  array[N] int death;
}

parameters {
  vector[N] lambda; 
}
model {
  lambda ~ normal(0,100);
  for (n in 1:N) {
   death[n] ~ poisson(exp(lambda)[n] * length[n]);
}
}

sonicking · December 2, 2024, 4:48pm

My mistake in my initial response. I was confused by the use of the word hierarchical. I am more familiar when the model being used here is called an over-dispersed Poisson, where the individual Poisson rate is distributed according to the log-normal distribution. In that case, yes, the model is well-defined and there is no issue with sample size whatsoever.

This is a blog post that demonstrates this model well: The right way to do predictive checks with observation-level random effects – Erik J. Ringen, PhD

Bob_Carpenter · December 3, 2024, 6:09am

I’d recommend this change to avoid the danger of overflow or underflow. The _log says that it wants its argument on the log scale.

death ~ poission_log(lambda + log(length));

You can actually take in log_length as data and avoid the run-time log, though it’s going to be a minor cost when the argument is just data rather than a parameter.

This just does the same thing explicitly. If you want to do it your way, it’ll be much more efficient to do exp(lambda[n]), because otherwise you’re running exp(lambda) on a whole vector for N iterations, each time only selecting the n-th element.

I think the original post here was asking for a hierarchical model, as the citaiton clearly says “hierarchial models”, but the original code wasn’t hierarchical. To make this model hierarchical, you can do this (the offset and multiplier give it a non-centered parameterization, which is what you want for the size of data you’re indicating):

parameters {
  real mu;
  real<lower=0> sigma;
  vector<offset=mu, multiplier=sigma>[N] lambda;
}
model {
  mu ~ normal(0, 10);
  sigma ~ exponential(0.1);
  lambda ~ normal(mu, sigma);
  death ~ poisson_log(lambda + log(length));
}

This is also assuming that lambda values are going to be roughly unit scale. Here, the lambda values will be roughly exp(lambda) * length =approx= death, so lambda will be around log(death / length) for this data, or about -3 or -4 or even lower—but they won’t be 100.

I’d strongly recommend putting your Stan code in its own files. Then you can print with quotes and you can also more easily diagnose problems with line numbers.

yanmin · December 3, 2024, 7:29am

i think hierarchical model is every group has their own parameter, and all parameters share a common distribution.

yanmin · December 3, 2024, 7:30am

good idea!

yanmin · December 4, 2024, 9:51am

i use your code and the message shows that death ~ poisson_log(lambda + log(length)) is wrong. The error is as follows,
Ill-typed arguments supplied to function ‘poisson_log’:
(vector, vector)
Available signatures:
(array int, row_vector) => real
The first argument must be array int but got vector
(array int, vector) => real
The first argument must be array int but got vector
(array int, array real) => real
The first argument must be array int but got vector
(array int, real) => real
The first argument must be array int but got vector
(int, row_vector) => real
The first argument must be int but got vector
(Additional signatures omitted)

Bob_Carpenter · December 10, 2024, 9:32pm

You can see from the above that you supplied two vector arguments, but that is not an available signature for poisson_log. That’s because the first argument needs to be an array of integers—Stan cleanly separates integers and real values.

yanmin · December 13, 2024, 1:57am

following your advice, i solved the issue. thank you!

Topic		Replies	Views
Hierarchical Bayesian Poisson regression model Modeling fitting-issues	7	2339	August 17, 2019
Correct vectorization in hierarchical model Modeling techniques , specification	9	547	April 14, 2020
Why different priors produce very large results Modeling specification	9	216	December 29, 2024
Issues with fitting this poisson-regression Modeling fitting-issues , poisson	5	504	December 18, 2020
Help with Poisson model Modeling loo	45	7586	March 3, 2018

Modeling a hierarchical poisson model

Related topics