I’m trying to perform a prior predictive check by simply taking my Stan model and commenting out the likelihood but somehow I only get one value for each parameter that remains the same throughout the iterations while I would like to explore the full specified prior distribution.
Here is a sample from the Stan code that allows to reproduce the errors :
data {
int<lower=0> N; //N observations total
int<lower=1> P; //N population-level effects (intercept + beta)
int<lower=0> J; //N subj
int<lower=1> n_u; //N subj ranefs (ex. x * x | part = 4) if random effect on all effects, same argument as P can be passed
int<lower=1,upper=J> subj[N]; //subj ID vector
matrix[N,P] X; //fixef design matrix
matrix[N, n_u] Z_u; //subj ranef design matrix, X if raneff = fixeff
vector[N] y; // DV
vector[2] p_intercept; // prior on intercept
vector[2] p_sd; // prior on residual std
vector[P-1] p_fmu; // priors on fixef mu
vector[P-1] p_fsigma; // priors on fixef sigma
vector[2] p_r; // priors on ranef
}
parameters {
real alpha; // intercept
real<lower=0> sigma; //residual std
vector[P-1] beta; //population-level effects coefs (w\o intercept)
cholesky_factor_corr[n_u] L_u;//cholesky factor of subj ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std INCLUDING INTERCEPT, hence beta[1] ~ sigma[2]
vector[n_u] z_u[J]; //spherical subj ranef
}
model {
//vector[N] mu;
//priors
L_u ~ lkj_corr_cholesky(2.0);
alpha ~ normal(p_intercept[1], p_intercept[2]);
beta ~ normal(p_fmu, p_fsigma);
sigma ~ normal(p_sd[1], p_sd[2]);
for (j in 1:J)
z_u[j] ~ normal(p_r[1],p_r[2]);
//likelihood
//for (n in 1:N)
// mu[n] = alpha + X_beta[n] * beta + Z_u[n] * u[subj[n]];
//y ~ normal(mu, sigma);
}
I then sample from it by using the pystan
module and passing the real data :
fixeff_form = "1+Factor1+Factor2+Factor1:Factor2"#Fixed effects formula
raneff_form = fixeff_form #Random effects formula
fixeff = np.asarray(patsy.dmatrix(fixeff_form, LMEdata)) #FE design matrix
raneff = np.asarray(patsy.dmatrix(raneff_form, LMEdata)) #RE design matrix
prior_intercept = np.asarray([6.15,.3])#prior for intercept: mu and sigma
priors_mu = np.repeat(0, 8) #Priors on mu for FE
priors_sigma = np.repeat(.4, 8) # priors on sigma for FE
priors_raneff = [0, .4] #Priors on RE
prior_sd = [0, .4] #priors on residual sigma
LME_data = dict(
N = len(LMEdata),
P = fixeff.shape[-1], #number of pop level effects
J = len(LMEdata.participant.unique()),
n_u = raneff.shape[-1],
subj = LMEdata.participant,
X = fixeff,
Z_u = raneff,
y = LMEdata.obs.get_values(),
p_intercept = prior_intercept, p_sd = prior_sd, p_fmu = priors_mu, p_fsigma = priors_sigma, p_r = priors_raneff
)
fit = priors_LME.sampling(data=LME_data, iter=1000, chains=1, warmup =0)
Behavior is the same whether I use the Fixed_param
algorithm or not, when I include some warm-up I get some estimation but the distribution of these estimations is far more narrower than the given priors.
P.-S I know that this method is sub-optimal but using the generated quantities makes it complicated as there are also transformed parameters and other generated quantities in the full Stan model