I’d like to use STAN to fit a model similar to that in section 15.1 of Gelman and Hill’s “Data Analysis Using Regression and Multilevel / Hierarchical Models” (ARM):

\begin{align*}
y_i &\sim \text{Poisson}(u_i e^{X_i\beta + \epsilon_i})\\
\epsilon_i &\sim \text{Normal}(0, \sigma_\epsilon^2)
\end{align*}

where u_i is an observed offset, y_i is an observed outcome, X_i is an observed vector of covariates, and \epsilon_i is an unobserved error. I know how to fit this model without the \epsilon_i term, so my question is about how to implement the normal mixture.

I found this repo with STAN code for most of the examples from ARM, but it doesn’t contain code for any of the examples based on the police stops dataset. (Incidentally, a version of that dataset with noise added is available here and some discussion is available here).

Any suggestions you can offer on how to fit the model described above using STAN would be most appreciated. If doing so is hard / impossible / inadvisable I’d also be interested to know why. I’m aware that a negative binomial model provides another way to allow over-dispersion in count data, and I know how to fit such a model in STAN. The reason I’m interested in this normal mixture of Poissons is that I ultimately want to build a more complicated model that uses this one as a building block, including correlated normal error terms across multiple “equations.” (Basically, an instrumental-variables type model but for count data.)

Here we are taking the model without the \epsilon_i and adding an observation level random effect. This will likely be fit most effectively in the non-centered parameterization, which would look like this:

```
parameters{
real<lower = 0> sigma; // sd of the normal
vector<multiplier = sigma>[N] errors; // the errors, non-centered parameterization
...
}
model{
errors ~ normal(0, sigma);
...<compute the linear predictor on the log scale>...
<linear predictor> += errors;
...<the Poisson likelihood>...
}
```

This is really helpful: thanks! I think I was getting hung up on the issue of the mixture likelihood lacking a closed-form, whereas it’s more natural to just think of this as a hierarchical model as your example code suggests. I’m still a little unclear on how to correctly use the affinely-transformed parameter feature `<multiplier = sigma>`

so I ended up implementing the non-centered parameterization “the hard way.” I coded up a little example with a single regressor and it seems to work well in simulated data:

```
data {
int<lower=0> N;
vector[N] x;
array[N] int<lower = 0> y;
}
parameters {
real<lower = 0> sigma;
real alpha;
real beta;
vector[N] errors;
}
model {
sigma ~ exponential(1);
beta ~ normal(0, 0.2);
alpha ~ normal(0, 1.5);
errors ~ normal(0, 1);
y ~ poisson_log(alpha + beta * x + sigma * errors);
}
```

1 Like