I replicated the paper 'Bayesian hierarchical models for linear networks' (J Appl Stat. 2020 Dec 29;49(6):1421-1448).
The model is specified as:
y_i \sim \text{Poisson}(L_i \exp(\lambda_i))
\lambda_i \sim N(\mu, \sigma^2)
\mu \sim N(\alpha, \tau^2)
\sigma^2 \sim \text{Inverse-Gamma}(\alpha_0, \beta_0)
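For intuition, the generative process above can be simulated directly. Here is a minimal Python sketch (not from the paper or the thread): the number of segments and the segment lengths L_i are made up for illustration, and the hyperprior values are the ones used in the conjugate model below.

```python
import math
import random

random.seed(1)

def poisson(rate):
    # Knuth's algorithm; adequate for the small rates that arise here
    L = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Hyperprior values from the conjugate model below
alpha, tau = -6.65, 0.09       # mu ~ N(alpha, tau^2)
alpha0, beta0 = 18.36, 58.06   # sigma^2 ~ Inv-Gamma(alpha0, beta0)

# Inv-Gamma(a, b) draw via 1 / Gamma(shape = a, scale = 1/b)
mu = random.gauss(alpha, tau)
sigma = math.sqrt(1.0 / random.gammavariate(alpha0, 1.0 / beta0))

# Segment-level log rates and accident counts (made-up lengths L_i)
lengths = [120.0, 340.0, 55.0, 800.0, 210.0]
lam = [random.gauss(mu, sigma) for _ in lengths]
y = [poisson(Li * math.exp(li)) for Li, li in zip(lengths, lam)]
print(y)
```

This is the same simulation the prior predictive check later in the thread is trying to do inside a generated quantities block.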
conjugate <- "
data {
  int<lower=0> N;
  vector[N] length;
  array[N] int accidents;
}
transformed data {
  vector[N] log_len;
  log_len = log(length);
}
parameters {
  real mu;
  real<lower=0> sigma;
  vector<offset = mu, multiplier = sigma>[N] lambda;
}
transformed parameters {
  real<lower=0> sigma2 = sigma^2;
}
model {
  mu ~ normal(-6.65, 0.09);
  sigma2 ~ inv_gamma(18.36, 58.06);
  lambda ~ normal(mu, sigma);
  for (n in 1:N) {
    accidents[n] ~ poisson_log(lambda[n] + log_len[n]);
  }
}
"
The second model differs only in the hyperpriors on \mu and \sigma^2:
model {
  mu ~ normal(0, 10);
  sigma2 ~ inv_gamma(0.001, 0.001);
  lambda ~ normal(mu, sigma);
  for (n in 1:N) {
    accidents[n] ~ poisson_log(lambda[n] + log_len[n]);
  }
}
The first model gives mean(\mu) = -6.55 and mean(\sigma) = 5.07, but the second gives mean(\mu) = 1.057 and mean(\sigma) = 0.602.
Why do these different priors produce such different results? Are the results acceptable?
I only have a little experience coding directly in Stan, so others can correct me if I'm wrong, but I think the prior on \mu in the first model practically fixes it at -6.65, given that the SD is so small (0.09). The priors on \sigma are also different, but even if they weren't, fixing one parameter like this will affect the samples of the others.
Whether or not priors are reasonable generally needs some theory. But it looks like the data and the priors are giving conflicting information about what plausible values of \mu are.
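To make the difference in the \sigma priors concrete: the first model's Inverse-Gamma(18.36, 58.06) prior concentrates \sigma^2 tightly. Using the standard inverse-gamma moment formulas (a quick Python check, not part of the original thread):

```python
import math

a0, b0 = 18.36, 58.06  # first model's Inv-Gamma prior on sigma^2

# For X ~ Inv-Gamma(a, b): E[X] = b/(a-1) for a > 1,
# SD[X] = b/((a-1) * sqrt(a-2)) for a > 2.
mean_sigma2 = b0 / (a0 - 1)
sd_sigma2 = b0 / ((a0 - 1) * math.sqrt(a0 - 2))

print(round(mean_sigma2, 2), round(sd_sigma2, 2))  # prior mean and SD of sigma^2
```

By contrast, Inv-Gamma(0.001, 0.001) has no finite mean (the mean exists only for shape > 1) and spreads its mass over many orders of magnitude, so the two models really are encoding very different beliefs about \sigma.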
Otherwise, the first response is correct. The difference between:
mu ~ normal(-6.65, 0.09);
and
mu ~ normal(0, 10);
is that in the first case you're pretty much constraining mu to be -6.65, while in the second case you have a very diffuse prior on mu, centered on zero. Guidance on prior choice is available here: Prior Choice Recommendations · stan-dev/stan Wiki · GitHub
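The two priors on mu put their mass in very different places; a quick check with Python's statistics.NormalDist (my own illustration, not thread code):

```python
from statistics import NormalDist

tight = NormalDist(mu=-6.65, sigma=0.09)  # first model's prior on mu
diffuse = NormalDist(mu=0.0, sigma=10.0)  # second model's prior on mu

# Central 95% intervals for each prior
t_lo, t_hi = tight.inv_cdf(0.025), tight.inv_cdf(0.975)
d_lo, d_hi = diffuse.inv_cdf(0.025), diffuse.inv_cdf(0.975)

print(round(t_lo, 2), round(t_hi, 2))  # narrow band around -6.65
print(round(d_lo, 1), round(d_hi, 1))  # spans tens of units around 0
```

The tight prior concentrates essentially all of its mass in an interval of width about 0.35, so the data have almost no room to move mu; the diffuse prior covers roughly (-20, 20) and lets the data dominate.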
I would like to perform prior predictive checks for this model; my code is as follows:
data {
  int<lower=0> N;
  vector[N] length;
}
generated quantities {
  real mu_ppc = normal_rng(-6.65, 0.09);
  real sigma2_ppc = inv_gamma_rng(18.36, 58.06);
  vector[N] lambda_ppc;
  array[N] int y_sim;
  for (i in 1:N) lambda_ppc[i] = normal_rng(mu_ppc, sigma2_ppc);
  y_sim = poisson_log_rng(exp(lambda_ppc) .* length);
}
The error message is as follows:
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
[1] "Error : Must use algorithm="Fixed_param" for model that has no parameters."
[1] "error occurred during calling the sampler; sampling not done"
Since your Stan code doesn't have a parameters block (i.e., your model doesn't have any latent parameters), you need to use the algorithm = "Fixed_param" argument in rstan.
If you're using cmdstanr, you'd have to use the fixed_param = TRUE argument in the call to $sample().
Because the results under the conjugate prior and the non-informative prior are so different, I performed a prior predictive check for the above model.
The R code is as follows,
The mean of a vector of integers is not necessarily an integer:
> x = c(1, 2, 3, 4, 5, 6)
> mean(x)
[1] 3.5
It's possible that poisson_rng is blowing up. From the Stan functions reference:

R poisson_rng(reals lambda)
Generate a Poisson variate with rate lambda; may only be used in transformed data and generated quantities blocks. lambda must be less than 2^30.

Note that poisson_log_rng treats its argument as a log rate, so poisson_log_rng(exp(lambda_ppc) .* length) exponentiates exp(lambda_ppc) .* length a second time; the draw that matches your model would be poisson_log_rng(lambda_ppc + log(length)). (Also, normal_rng expects a standard deviation, so that call should use sqrt(sigma2_ppc) rather than sigma2_ppc.)

If this happens in a generated quantities block, the error is not fatal and the sampler will continue to the next iteration; the value of y_sim is never changed, and it looks like it was initialized to zero.
That’s because there’s no model block in this Stan program.
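On the blowing-up point: because poisson_log_rng exponentiates its argument, passing exp(lambda_ppc) .* length to it double-exponentiates the rate. A small Python illustration of why that overflows once a lambda draw is even modestly positive (easy to get when sigma2_ppc is used directly as the standard deviation; the values here are made up):

```python
import math

lam, L = 3.0, 1000.0  # a positive lambda draw and a made-up segment length

# Rate implied by the model, poisson_log_rng(lambda + log(L)): exp(lambda) * L
rate_correct = math.exp(lam + math.log(L))

# Rate implied by poisson_log_rng(exp(lambda) * L): exp(exp(lambda) * L)
try:
    rate_wrong = math.exp(math.exp(lam) * L)
    overflowed = False
except OverflowError:
    overflowed = True

print(rate_correct, overflowed)
```

The correct rate is on the order of 2e4, which a Poisson RNG handles fine; the double-exponentiated version asks for exp(~20000), which overflows a double long before reaching Stan's 2^30 rate limit.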