# Prior predictive checks by removing the likelihood gives single values

I’m trying to perform a prior predictive check by simply taking my Stan model and commenting out the likelihood but somehow I only get one value for each parameter that remains the same throughout the iterations while I would like to explore the full specified prior distribution.

Here is a sample from the Stan code that allows to reproduce the errors :

``````
data {
int<lower=0> N;               //N observations total
int<lower=1> P;               //N population-level effects  (intercept + beta)
int<lower=0> J;               //N subj
int<lower=1> n_u;             //N subj ranefs (ex. x * x | part = 4) if random effect on all effects, same argument as P can be passed
int<lower=1,upper=J> subj[N]; //subj ID vector
matrix[N,P] X;                //fixef design matrix
matrix[N, n_u] Z_u;           //subj ranef design matrix, X if raneff = fixeff
vector[N] y;                  // DV
vector p_intercept;        // prior on intercept
vector p_sd;               // prior on residual std
vector[P-1] p_fmu;            // priors on fixef mu
vector[P-1] p_fsigma;         // priors on fixef sigma
vector p_r;                // priors on ranef
}

parameters {
real alpha;                   // intercept
real<lower=0> sigma;          //residual std
vector[P-1] beta;             //population-level effects coefs (w\o intercept)
cholesky_factor_corr[n_u] L_u;//cholesky factor of subj ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std INCLUDING INTERCEPT, hence beta ~ sigma
vector[n_u] z_u[J];           //spherical subj ranef
}

model {
//vector[N] mu;
//priors
L_u ~ lkj_corr_cholesky(2.0);
alpha ~ normal(p_intercept, p_intercept);
beta ~ normal(p_fmu, p_fsigma);
sigma ~ normal(p_sd, p_sd);
for (j in 1:J)
z_u[j] ~ normal(p_r,p_r);

//likelihood
//for (n in 1:N)
//  mu[n] = alpha + X_beta[n] * beta + Z_u[n] * u[subj[n]];
//y ~ normal(mu, sigma);
}
``````

I then sample from it by using the `pystan` module and passing the real data :

``````fixeff_form = "1+Factor1+Factor2+Factor1:Factor2"#Fixed effects formula
raneff_form = fixeff_form #Random effects formula
fixeff = np.asarray(patsy.dmatrix(fixeff_form, LMEdata)) #FE design matrix
raneff = np.asarray(patsy.dmatrix(raneff_form, LMEdata)) #RE design matrix
prior_intercept = np.asarray([6.15,.3])#prior for intercept: mu and sigma
priors_mu = np.repeat(0, 8) #Priors on mu for FE
priors_sigma =  np.repeat(.4, 8) # priors on sigma for FE
priors_raneff = [0, .4] #Priors on RE
prior_sd = [0, .4] #priors on residual sigma

LME_data = dict(
N = len(LMEdata),
P = fixeff.shape[-1], #number of pop level effects
J = len(LMEdata.participant.unique()),
n_u = raneff.shape[-1],
subj = LMEdata.participant,
X = fixeff,
Z_u = raneff,
y = LMEdata.obs.get_values(),
p_intercept = prior_intercept, p_sd = prior_sd, p_fmu = priors_mu, p_fsigma = priors_sigma, p_r = priors_raneff
)

fit = priors_LME.sampling(data=LME_data, iter=1000, chains=1,  warmup =0)
``````

Behavior is the same whether I use the `Fixed_param` algorithm or not, when I include some warm-up I get some estimation but the distribution of these estimations is far more narrower than the given priors.

P.-S I know that this method is sub-optimal but using the generated quantities makes it complicated as there are also transformed parameters and other generated quantities in the full Stan model

1 Like

Hi,
you definitely need to include warmup - to Stan your “prior-only” model is just another distribution to sample from and it needs warmup. Are all the diagnostitcs (n_eff, Rhat, divergences, treedepth) good for the “prior only” fit?

people often do this with a an `int<lower=0,upper=1> prior_only` in data and then having `if(prior_only)` around the likelihood part (you can also have optional params depend on `prior_only`, e.g. `mu`, see my blog post on the topic ). This way you don’t need to recompile for prior checks and just pass `prior_only = 1` to the model.

Hope that helps!

1 Like

Hi Martin,

Thanks for your answer, I just tried to include warmups (up to 100000 with 5000 additional iterations) but the results are still off, see for example the trace for one of the parameters (which does not behave like this when I include the likelihood).

``````RT_fit = priors_LME.sampling(data=RT_LME_data, iter=105000, chains=1,
warmup =100000)
``````

Obviously divergences (in black in the plot), n_eff and Rhat are off the charts

Edit : However the distribution/trace of the MCMC chain does explore my prior (normal, mu=6.15, sd=0.4) but in a really weird way…

This may or may not be related to the issue at hand.

1. In the code in the original post, did you delete the transformed parameters block when posting it here in the forum? It feels like some bits and pieces are missing from the code.
2. `sigma_u` has no explicit prior. This may cause trouble with sampling and affect the other parameters as well.
3 Likes

Indeed, I initially tried to ease the reading of the code, anyway, adding a prior on `sigma_u` did fix the behavior !

Thanks a lot :)

1 Like

Just wondering if the link above is broken and should be this instead?

1 Like