Why do different priors produce such different results?

I am replicating the paper 'Bayesian hierarchical models for linear networks' (J Appl Stat. 2020 Dec 29;49(6):1421-1448).
The model is specified as

\begin{aligned}
y_i &\sim \text{Poisson}(L_i \exp(\lambda_i)) \\
\lambda_i &\sim N(\mu, \sigma^2) \\
\mu &\sim N(\alpha, \tau^2) \\
\sigma^2 &\sim \text{Inverse-Gamma}(\alpha_0, \beta_0)
\end{aligned}

The R code is as follows:

library(dplyr); library(purrr); library(rstan)

# path = 'https://pmc.ncbi.nlm.nih.gov/articles/PMC9042008/pdf/CJAS_49_1864814.pdf'
path = 'D:\\bayes\\Bayesian hierarchical models for linear networks.pdf'
df = tabulapdf::extract_tables(path,pages = 4)

accident = df[[1]] %>% select(2:3) 
names(accident) = c('length','accidents')
data_for_Stan = list(N = length(accident$accidents),
                      length = accident$length, 
                      accidents = accident$accidents)
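
Since table extraction from a PDF can be fragile, a quick sanity check of the extracted data may be worthwhile (my addition, not part of the paper's code):

str(accident)       # should show two numeric columns: length and accidents
summary(accident)   # lengths should be positive and accident counts non-negative integers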

The first Stan model:

conjugate <- "
data {
  int<lower=0> N;
  vector[N] length;
  array[N] int accidents;
}

transformed data{
  vector[N] log_len;
  log_len = log(length);
}

parameters {
  real mu;
  real<lower=0> sigma;
  vector<offset = mu, multiplier = sigma>[N] lambda; 
}

transformed parameters {
  real<lower=0> sigma2 = sigma^2;
}

model {
  mu ~ normal(-6.65, 0.09);
  sigma2 ~ inv_gamma(18.36, 58.06);
  lambda ~ normal(mu, sigma);
  for (n in 1:N) {
    accidents[n] ~ poisson_log(lambda[n] + log_len[n]);
  }
}

The second model differs only in the hyperparameters for \mu and \sigma^2:

model {
  mu ~ normal(0, 10);
  sigma2 ~ inv_gamma(0.001, 0.001);
  lambda ~ normal(mu, sigma);
  for (n in 1:N) {
    accidents[n] ~ poisson_log(lambda[n] + log_len[n]);
  }
}

The first model gives mean(\mu) = -6.55 and mean(\sigma) = 5.07, but the second gives mean(\mu) = 1.057 and mean(\sigma) = 0.602.
Why do different priors produce such different results? Are the results acceptable?

I only have a little experience coding directly in Stan, so others can correct me if I'm wrong, but I think the prior on \mu in the first model practically fixes it at -6.65, given that the SD is so small (0.09). The priors for \sigma are also different, but even if they weren't, fixing the value of one parameter like this will affect the samples of the others.

Whether or not priors are reasonable generally needs some subject-matter knowledge. But it looks like the data and the priors are giving different information as to what plausible values of \mu are.
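
As a rough illustration (my own sketch, not from the paper), you can compare how much of the line each prior for \mu actually covers:

# 95% prior intervals for mu under the two specifications
qnorm(c(0.025, 0.975), mean = -6.65, sd = 0.09)  # about (-6.83, -6.47): essentially pinned
qnorm(c(0.025, 0.975), mean = 0, sd = 10)        # about (-19.6, 19.6): effectively flat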


The Stan User’s Guide has a section on how to code and run prior predictive checks: Posterior and Prior Predictive Checks

Otherwise, the first response is correct - the difference between:

mu ~ normal(-6.65,0.09);

and

mu ~ normal(0, 10);

is that in the first case, you're pretty much constraining mu to be -6.65, while in the 2nd case you have a very diffuse prior on mu, centered on zero. Guidance on prior choice is available here: Prior Choice Recommendations · stan-dev/stan Wiki · GitHub
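
To see what that means on the outcome scale (a sketch of mine, assuming the Poisson log link used above), the informative prior pins the per-unit-length accident rate near exp(-6.65), while the diffuse prior allows rates spanning many orders of magnitude:

exp(-6.65)                          # about 0.0013 accidents per unit length
exp(qnorm(c(0.025, 0.975), 0, 10))  # roughly 3e-09 to 3e+08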


I would like to perform prior predictive checks for this model. My code is as follows:

  data {
    int<lower=0> N;
    vector[N] length;
  }
  generated quantities {
    real mu_ppc = normal_rng(-6.65, 0.09);
    real sigma2_ppc = inv_gamma_rng(18.36, 58.06);

    vector[N] lambda_ppc;
    array[N] int y_sim;

    // draw lambda with standard deviation sqrt(sigma2), not the variance itself
    for (i in 1:N) lambda_ppc[i] = normal_rng(mu_ppc, sqrt(sigma2_ppc));
    // poisson_rng expects the rate, so pass exp(lambda) * length directly
    y_sim = poisson_rng(exp(lambda_ppc) .* length);
  }

The error message is as follows:
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
[1] "Error : Must use algorithm="Fixed_param" for model that has no parameters."
[1] "error occurred during calling the sampler; sampling not done"

How do I solve this issue? Any help is appreciated.

Since your Stan code doesn't have a parameters block (i.e., your model doesn't have any latent parameters), you need to use the algorithm = "Fixed_param" argument for rstan.

If you’re using cmdstanr, you’d have to use the fixed_param = TRUE argument in the call to sample.
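
For example (a minimal sketch; the file name and data list are just the ones used elsewhere in this thread):

# rstan: pass algorithm = "Fixed_param"
fit_ppc <- rstan::stan(file = "ppc.stan", data = data_for_ppc,
                       algorithm = "Fixed_param")

# cmdstanr: pass fixed_param = TRUE to $sample()
mod <- cmdstanr::cmdstan_model("ppc.stan")
fit_ppc <- mod$sample(data = data_for_ppc, fixed_param = TRUE)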


That extremely annoying fixed_param problem has finally been addressed in the latest release of Stan - Release v2.36.0 (10 December 2024) · stan-dev/cmdstan · GitHub


You are right. Thank you very much!

Because the results from the conjugate prior and the non-informative prior are so different, I performed a prior predictive check for the model above.
The R code is as follows:

library(dplyr)   # for %>% and select()
library(rstan)

# path <- 'https://pmc.ncbi.nlm.nih.gov/articles/PMC9042008/pdf/CJAS_49_1864814.pdf'
path <- 'D:\\bayes\\Bayesian hierarchical models for linear networks.pdf'
df <- tabulapdf::extract_tables(path, pages = 4)

accident <- df[[1]] %>% select(2:3)
names(accident) <- c('length', 'accidents')
data_for_Stan <- list(N = length(accident$accidents),
                      length = accident$length,
                      accidents = accident$accidents)
# prior predictive checks
ppc <- '
  data {
    int<lower=0> N;
    vector[N] length;
  }

  generated quantities {
    real alpha = normal_rng(-6.65,0.09);
    real<lower=0> tau = sqrt(inv_gamma_rng(18.36, 58.06));
    vector[N] lambda;
    array[N] int<lower=0> y_sim;
    for (i in 1:N) lambda[i] = normal_rng(alpha,tau);
    y_sim = poisson_rng(exp(lambda) .* length);
  }

'
write(ppc,file = 'ppc.stan')
data_for_ppc <- list(N = nrow(accident),
                     length = accident$length)
fit.ppc <- stan(file = "ppc.stan", data = data_for_ppc,
                algorithm = "Fixed_param")
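
I then looked at the output along these lines (a rough sketch of what I ran; the names are the ones defined above):

print(fit.ppc, pars = c('alpha', 'tau'))                      # summaries of the prior draws
y_sim_draws <- rstan::extract(fit.ppc, pars = 'y_sim')$y_sim  # draws x N matrix of simulated counts
quantile(as.vector(y_sim_draws), c(0.025, 0.5, 0.975))        # spread of the simulated accident counts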

The results are not satisfying. First, the mean of y_sim is not an integer. Second, many of the quantiles of y_sim are zero. Third, lp__ is zero or NA.

What's wrong with my code? Any help is appreciated.

The mean of a vector of integers is not necessarily an integer.

> x = c(1, 2, 3, 4, 5, 6)
> mean(x)
[1] 3.5

It's possible that poisson_rng is blowing up.

R poisson_rng(reals lambda)
Generate a Poisson variate with rate lambda; may only be used in transformed data and generated quantities blocks. lambda must be less than 2^30.

If this happens in a generated quantities block, the error is not fatal and the sampler will continue to the next iteration - the value of y_sim is never changed, and it looks like it was initialized to zero.
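
If you want to rule that out, a rough back-of-the-envelope check in R could look like this (my own sketch; the hyperparameters are the ones in your prior and accident$length is the data from your script):

# lambda is roughly alpha +/- 3*tau, where tau^2 has prior mean beta0 / (alpha0 - 1)
tau_prior_mean <- sqrt(58.06 / (18.36 - 1))
max_rate <- exp(-6.65 + 3 * tau_prior_mean) * max(accident$length)
max_rate < 2^30   # if TRUE, poisson_rng should not hit its upper limit for these draws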

The lp__ of zero is expected - that's because there's no model block in this Stan program.

Thank you for explaining!