As long as you can follow our style, we love examples in the User’s Guide. The way to do that is to create a pull request in the stan-dev/docs repo.
We’d just marginalize out the discrete parameters, which many people find difficult to code. There are examples in the user’s guide chapter on latent discrete parameters. It’s usually far more efficient to marginalize out discrete parameters than to sample them with Gibbs, unless you can be very clever with a blockwise Gibbs sampler (i.e., roll your own rather than using BUGS/JAGS).
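As a minimal sketch of what marginalization looks like (the data names here are hypothetical; the latent discrete parameters chapter has fuller examples), a two-component Poisson mixture marginalizes each observation’s latent component indicator with `log_mix` rather than sampling it:

```stan
data {
  int<lower=0> N;
  array[N] int<lower=0> y;
}
parameters {
  real<lower=0, upper=1> theta;  // mixing proportion
  positive_ordered[2] lambda;    // component rates, ordered for identifiability
}
model {
  lambda ~ gamma(2, 0.5);        // placeholder prior
  for (n in 1:N) {
    // marginalize the latent indicator z[n] in {1, 2}:
    // p(y[n]) = theta * Poisson(y[n] | lambda[1])
    //           + (1 - theta) * Poisson(y[n] | lambda[2])
    target += log_mix(theta,
                      poisson_lpmf(y[n] | lambda[1]),
                      poisson_lpmf(y[n] | lambda[2]));
  }
}
```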
These models are simple enough that you can probably just code them as written mathematically in Stan. If someone can be more specific about which model they want, I’d be happy to code it into a simple Stan program. For example, here’s a 2-way “independence” Poisson model as described in BDA3, section 16.7 and also in Alan Agresti’s An Introduction to Categorical Data Analysis in section 7.1 (log linear models for two-way and three-way tables):
data {
  int<lower=0> M;
  int<lower=0> N;
  array[M, N] int<lower=0> y;
}
parameters {
  real alpha;
  vector[M] beta;
  vector[N] gamma;
}
model {
  // put priors on alpha, beta, and gamma here
  for (m in 1:M) {
    for (n in 1:N) {
      y[m, n] ~ poisson_log(alpha + beta[m] + gamma[n]);
    }
  }
}
Here, `poisson_log(lambda) = poisson(exp(lambda))`, so it’s still the log linear model being suggested.
I don’t know much about these models, but the prior needs to be sensitive to the scale of the counts, or there needs to be something like a constant rate term (an offset) with the log link.
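For instance (a hedged sketch, where `log_E` is a hypothetical data matrix of log exposures), a constant rate term can enter as an offset on the log scale, so the coefficients model rates relative to the exposure:

```stan
data {
  int<lower=0> M;
  int<lower=0> N;
  array[M, N] int<lower=0> y;
  matrix[M, N] log_E;  // hypothetical log exposures (offsets)
}
parameters {
  real alpha;
  vector[M] beta;
  vector[N] gamma;
}
model {
  for (m in 1:M) {
    for (n in 1:N) {
      // the offset shifts the log rate by a known constant per cell
      y[m, n] ~ poisson_log(log_E[m, n] + alpha + beta[m] + gamma[n]);
    }
  }
}
```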
Then if you want to build the “saturated” model, you add another parameter,

matrix[M, N] delta;

give it a prior, and add it to the linear predictor:

y[m, n] ~ poisson_log(alpha + beta[m] + gamma[n] + delta[m, n]);
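Putting the pieces together, the saturated model looks like this (the normal priors are placeholders, not recommendations; scale them to the counts at hand):

```stan
data {
  int<lower=0> M;
  int<lower=0> N;
  array[M, N] int<lower=0> y;
}
parameters {
  real alpha;
  vector[M] beta;
  vector[N] gamma;
  matrix[M, N] delta;
}
model {
  alpha ~ normal(0, 5);             // placeholder priors
  beta ~ normal(0, 2);
  gamma ~ normal(0, 2);
  to_vector(delta) ~ normal(0, 1);
  for (m in 1:M) {
    for (n in 1:N) {
      y[m, n] ~ poisson_log(alpha + beta[m] + gamma[n] + delta[m, n]);
    }
  }
}
```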
These models can be hard to identify with weakly informative independent priors because of the additive non-identifiabilities. You can deal with them as @andrewgelman et al. suggest and pin some of the values in `beta`, `gamma`, and `delta` to zero, or set the coefficients up so that they sum to zero.
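One way to impose the sum-to-zero constraint (a sketch; recent Stan releases also provide a built-in `sum_to_zero_vector` type) is to declare one fewer free coefficient per margin and define the last element as the negative of the sum of the rest, leaving the model block unchanged:

```stan
parameters {
  real alpha;
  vector[M - 1] beta_free;
  vector[N - 1] gamma_free;
}
transformed parameters {
  // enforce sum(beta) == 0 and sum(gamma) == 0
  vector[M] beta = append_row(beta_free, -sum(beta_free));
  vector[N] gamma = append_row(gamma_free, -sum(gamma_free));
}
```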
P.S. I didn’t understand @andrewgelman et al.’s discussion in the Prior Distributions section, because it suggests putting a prior on the derived quantity `mu[m, n] = alpha[m] + beta[n]`. We can do that here directly because the Jacobian determinant is constant, but it’s going to be overparameterized in that we get `m * n` prior terms for `m + n` parameters.