Unsure what the prior is doing!

KleinGordon · July 31, 2024, 8:52pm

I am trying to implement a stan model presented in “Statistical Rethinking” in Section 14.2, pg 438, R code 14.13. I’ve pasted the example code below:

model_code <- '
data{
   int N;
   int nc_num_missing;
   vector[N] kcal;
   real neocortex[N];
   vector[N] logmass;
   int nc_missing[N];
}
parameters{
   real alpha;
   real<lower=0> sigma;
   real bN;
   real bM;
   vector[nc_num_missing] nc_impute;
   real mu_nc;
   real<lower=0> sigma_nc;
}
model{
   vector[N] mu;
   vector[N] nc_merged;
   alpha ~ normal(0,10); 
   bN ~ normal(0,10);
   bM ~ normal(0,10);
   mu_nc ~ normal(0.5,1);
   sigma ~ cauchy(0,1);
   sigma_nc ~ cauchy(0,1);
   // merge missing and observed
   for ( i in 1:N ) {
      nc_merged[i] <- neocortex[i];
      if ( nc_missing[i] > 0 ) nc_merged[i] <- nc_impute[nc_missing[i]];
   }
   // imputation
   nc_merged ~ normal( mu_nc , sigma_nc );
   // regression
   mu <- alpha + bN*nc_merged + bM*logmass;
   kcal ~ normal( mu , sigma );
}'

My question specifically pertains to the instantiation of nc_merged and how the prior is assigned. My understanding is that mc_merged is an object which stores data points from neocortex if they exist, and then assigns an entry from nc_inpute to those entries. Thus, my guess is that the next line assigns a prior over the terms in nc_merged which are empty, and does nothing to the entries whose fields are present.

If I were correct, I would assume the following model definition would give identical results:

model{
  vector[N] mu;
  vector[N] nc_merged;
  alpha ~ normal(0,10);
  bN ~ normal(0,10);
  bM ~ normal(0,10);
  mu_nc ~ normal(0.5,1);
  sigma ~ cauchy(0,1);
  sigma_nc ~ cauchy(0,1);
  
  // imputation
  nc_impute ~ normal( mu_nc , sigma_nc );
  
  // merge missing and observed
  for ( i in 1:N ) {
    nc_merged[i] <- neocortex[i];
    if ( nc_missing[i] > 0 ) nc_merged[i] <- nc_impute[nc_missing[i]];
  }
  
  // regression
  mu <- alpha + bN * nc_merged + bM * logmass;
  kcal ~ normal( mu , sigma );
}'

where I assign a prior to nc_impute before constructing the nc_merged vector. However, the two models give me different inference results, and so I presume I am misunderstanding what the example in Statistical Rethinking is doing.

Any help clarifying this discrepancy would be much appreciated!

andre.pfeuffer · August 1, 2024, 10:30am

The “entries whose fields are present”, eg. observed data together with their priors estimate mu_nc, and sigma_nc. In the second model, these parameters are determined exclusively by the priors.

KleinGordon · August 1, 2024, 12:33pm

Ah, so would it be correct to say that the posterior values of mu_nc and sigma_nc which are calculated on the observed values of neocortex are used to impute the missing values in nc_inpute?

So in a pseudocode workflow it’d be something like:

neocortex ~ normal( mu_nc, sigma_nc);
nc_inpute ~ normal( mu_nc, sigma_nc);

where the first line specifies the likelihood between neocortex and the params mu_nc, sigma_nc, and the second line conducts the imputation?

andre.pfeuffer · August 1, 2024, 12:46pm

Since nc_impute is a parameter, nc_inpute ~ normal( mu_nc, sigma_nc);
creates a sample using the specified piors. And therefore it can be used for
imputation.

Take a look at the User manual:
https://mc-stan.org/docs/stan-users-guide/missing-data.html#missing-data

Topic		Replies	Views
Merging Missing and Observed Data in Regression Model Modeling	15	916	January 22, 2020
Question about priors in stan Modeling astronomy , priors	5	786	October 1, 2021
Priors for binomial trials in generalized linear models Modeling	3	512	July 6, 2018
SUG 1.13 ("Multivariate priors for hierarchical models"): missing prior & serious identification issue? Modeling specification	14	1284	October 22, 2021
Specifying a conditional prior Modeling rstan , techniques , specification	6	720	March 23, 2023

Unsure what the prior is doing!

Related topics