Prior predictive checks: general implementation and truncated parameters

I have two hopefully simple questions related to implementing prior predictive checks (henceforth “Prior PC”) with rstan. The first is about implementation, the second is about simulating truncated parameters.

First, it seems that there are two ways to implement Prior PC.

  1. [Stan User’s Guide]((Stan User’s Guide) suggests to implement Prior PC using the data and generated quantities block, and to then sample the resulting stan model object.
  2. Many other examples of Prior PC, including brms’s option “sample_prior = ‘only’”, retain the data, parameter, model, and generated quantities. The model block only contains the prior sampling statements.

I expected these two implementations to provide the same results, but they do not on my machine. Below I given the R and STAN code for a regression with 10 data points.

library(rstan)
library(MASS)
library(tidyverse)

# STAN models 
suggested_ppc <- '
  
  data {
    int N;
    vector[N] x;
  }
  
  generated quantities {
    real alpha = normal_rng(0,1);
    real beta  = normal_rng(0,1);
    real y_sd  = gamma_rng(0.01,0.01); 
    
    real yhat[N];
    real y_sim[N];
    
    for(n in 1:N)
      yhat[n] = alpha + beta * x[n];
    
    for(n in 1:N)
      y_sim[n] = normal_rng(yhat[n], y_sd);
    
  }

'

brms_ppc      <- '
  
  data {
    int N;
    vector[N] x;
  }
  
  parameters {
    real alpha;
    real beta;
    real<lower=0> y_sd;
  }
  
  transformed parameters{
    vector[N] yhat;
    yhat = alpha + beta * x;
  }
  
  model {
    alpha ~ normal(0,1);
    beta  ~ normal(0,1);
    y_sd  ~ gamma(0.01,0.01); 
  }
  
  generated quantities {
    real y_sim[N];
    for(n in 1:N)
      y_sim[n] = normal_rng(yhat[n], y_sd);
  }

'

# Create stanmodel objects
suggested_ppc_mod  <- stan_model( model_code = suggested_ppc  )
brms_ppc_mod       <- stan_model( model_code = brms_ppc )

# create fictitious predictor data
set.seed( 2013 + 702)
dat <- list( x = rnorm( 10 ),
             N = 10 )

# fit stan models 
suggested_ppc_fit <- sampling( object = suggested_ppc_mod, 
                               data   = dat,
                               iter   = 4000,
                               warmup = 1000,
                               thin   = 2,
                               chains = 3,
                               algorithm = 'Fixed_param'
                              )

brms_ppc_fit      <- sampling( object = brms_ppc_mod, 
                               data   = dat,
                               iter   = 4000,
                               warmup = 1000,
                               thin   = 2,
                               chains = 3 
                            )

# The variance of output suggested_ppc always > brms_ppc
suggested_ppc_fit %>% 
  rstan::extract() %>% 
  .$y_sim %>% 
  as.numeric %>% 
  sd(na.rm=T) # Did not understand why there are NAs!

brms_ppc_fit %>% 
  rstan::extract() %>% 
  .$y_sim %>% 
  as.numeric %>% 
  sd()

In this example, the method #1 suggested in Stand User’s Guide always provides a larger variance in the response variable than the “brms” method.
Is this supposed to happen, and which method is to be preferred?

Second, I need to perform a Prior PC on a model that has two parameters that are truncated (e.g. lower=1, upper=6.3>).
Am I correct that defining a parameter with bounds such as <lower=1, upper=6.3>, means that the parameter is truncated effectively?
And is it correct that there are no truncated random number generators in STAN?
If the answer to both of these questions is yes, then am I better off doing a prior PC directly in R, Python or some other environment where I can generate truncated distributions?

Thank you in advance, and apologies if these questions are not clear enough!

1 Like

Sampling the prior in the generated quantities block uses independent random samples from the specified distribution (like using rnorm in R). brms is sampling the prior via MCMC. I would not expect the results to be identical, particularly if the prior is quite disperse.

For truncated prior predictive checks, whether or not one can sample form the truncated prior in the generated quantities block using the truncated inverse CDF method depends on wether the inverse CDF is known analytically. You could also use a while loop to sample from the ‘untruncated’ prior distribution until a sample falls in the feasible region, but this could take a long time.

2 Likes

When I run your code, I notice that the “brms method” gives a lot of divergent transitions. It is no surprise, then, that you are recovering biased samples from the posterior (which in this case is just the prior).

The reason that this prior model is hard for Stan to sample from is because the prior on y_sd has such a long tail.

2 Likes

This is also known as “rejection sampling” and as long as the mass within the truncation bounds is not quite small, it could actually be faster than the inverse CDF method. (for many distributions evaluating the RNG several times is still faster than evaluating the inverse CDF just once)

Thank you for the answers,
This was very helpful!

First, I did not mention the divergent transitions issue, so thanks for pointing out what it was driven by.
Second, mea culpa for overlooking truncated random number generation in STAN’s User Guide. I will need some time to understand the use of cdf, but fortunately in my simulations I am only using the truncated normal distribution, for which the user guide provides an example!

For the record, while waiting for responses I had also implemented a prior predictive check in R, using the truncdist package for truncated distributions. This prior predictive check seems to have worked well - I could substantially increase the predictive power of one of my models, whose priors on effect sizes were too uniformative and led the model to overfit the data!

2 Likes