I am running the model shown below to fit longitudinal data with multivariate skew-normal random effects and t-distributed errors. When I extract the randomly generated samples (y_tilde) to examine the posterior predictive distribution, I am finding some outlandish values, and I am curious what might be causing them. The model is converging, and there is no indication of divergent transitions or any other issues. However, for some data points I am getting random samples well beyond what would be expected. For example, for one observation, the posterior predictive distribution given the specific covariate values and estimated random effects is:

```
 mean se_mean   sd 2.5%   25%   50%   75% 97.5%    n_eff Rhat
22.90    0.09 8.91 8.10 18.49 22.86 27.27 38.38 10368.83 1.00
```

However, for one iteration the randomly sampled value is -398.6698. At this iteration, y_hat = 21.9732, nu = 3.328337, and sigma_e = 4.754296. Even using the rt function in R, the minimum of 100,000 random samples (i.e., min(y_hat + rt(1e5, nu)*sigma_e)) is only -121.6858.
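For reference, the sanity check I describe above looks roughly like this in R, plugging in the values reported for that iteration (the pt() call is just one way to gauge how far into the tail such a draw would be):

```
# Values reported for the iteration in question (taken from above)
y_hat   <- 21.9732
nu      <- 3.328337
sigma_e <- 4.754296

set.seed(1)

# Minimum of 100,000 draws from student_t(nu, y_hat, sigma_e)
min(y_hat + rt(1e5, df = nu) * sigma_e)

# Probability of a draw at or below -398.6698 under this t distribution
pt((-398.6698 - y_hat) / sigma_e, df = nu)
```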

Any ideas as to what the issue might be?

Thank you all immensely for your help!

```
data {
  int<lower = 1> N;                     // Number of patients
  int<lower = N> Total;                 // Total number of observations
  int<lower = 1, upper = N> ID[Total];  // Patient ID for each observation
  vector<lower = 0>[Total] y_obs;       // Observed response
  vector[Total] Time;                   // Time of longitudinal measurement
  vector[Total] Cov1;                   // Covariate 1
  vector[Total] Cov2;                   // Covariate 2
  vector[Total] Cov3;                   // Covariate 3
  vector[Total] Cov4;                   // Covariate 4
}
parameters {
  real<lower = 0> nu;                   // Degrees of freedom
  real beta0_raw;
  real beta1_raw;
  real beta2_raw;
  real beta3_raw;
  real beta4_raw;
  real beta5_raw;
  real beta6_raw;
  vector[2] alpha_raw;                  // Skewness parameters
  matrix[N, 2] b_i_raw;                 // Random effects (standardized)
  real<lower = 0> sigma_e;              // Std. dev. of residual
  vector<lower = 0>[2] sigma_b;         // Std. dev. of random effects
  cholesky_factor_corr[2] L_b;          // Cholesky factor of random-effects correlation
}
transformed parameters {
  vector[Total] y_hat;                  // Expected value
  real beta0;
  real beta1;
  real beta2;
  real beta3;
  real beta4;
  real beta5;
  real beta6;
  vector[2] alpha;
  vector[2] b_i[N];
  alpha = [10, 0]' + [10, 5]' .* alpha_raw;
  for (n in 1:N) {
    b_i[n] = sigma_b .* (L_b * b_i_raw[n]');
  }
  beta0 = 10 * beta0_raw;
  beta1 = 10 * beta1_raw;
  beta2 = 5 * beta2_raw;
  beta3 = 5 * beta3_raw;
  beta4 = 5 * beta4_raw;
  beta5 = 5 * beta5_raw;
  beta6 = 5 * beta6_raw;
  for (n in 1:Total) {
    y_hat[n] = b_i[ID[n], 1] + beta0 +
               (b_i[ID[n], 2] + beta1) * Time[n] +
               beta2 * Cov1[n] +
               beta3 * Cov2[n] +
               beta4 * Cov3[n] +
               beta5 * Cov4[n] +
               beta6 * (Cov3[n] * Time[n]);
  }
}
model {
  L_b ~ lkj_corr_cholesky(1);
  nu ~ lognormal(0, 1);
  beta0_raw ~ normal(0, 1);
  beta1_raw ~ normal(0, 1);
  beta2_raw ~ normal(0, 1);
  beta3_raw ~ normal(0, 1);
  beta4_raw ~ normal(0, 1);
  beta5_raw ~ normal(0, 1);
  beta6_raw ~ normal(0, 1);
  sigma_b ~ lognormal(0, 1);
  sigma_e ~ lognormal(0, 1);
  alpha_raw ~ normal(0, 1);
  to_vector(b_i_raw) ~ normal(0, 1);
  // Skewing term for the multivariate skew-normal random effects
  for (n in 1:N) {
    target += normal_lcdf(dot_product(alpha, b_i_raw[n]) | 0, 1);
  }
  for (n in 1:Total) {
    y_obs[n] ~ student_t(nu, y_hat[n], sigma_e);
  }
}
generated quantities {
  vector[Total] y_tilde;                // Posterior predictive draws
  for (i in 1:Total) {
    y_tilde[i] = student_t_rng(nu, y_hat[i], sigma_e);
  }
}
```
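In case it is relevant, the draws and the summary above were extracted roughly like this (a sketch assuming an rstan workflow; `fit` is just a placeholder name for the stanfit object):

```
library(rstan)

# 'fit' is the stanfit object (placeholder name)
y_tilde <- rstan::extract(fit, pars = "y_tilde")$y_tilde  # iterations x Total matrix

# Posterior predictive summary for a single observation
summary(fit, pars = "y_tilde[1]")$summary

# Locate the most extreme posterior predictive draw
min(y_tilde)
which(y_tilde == min(y_tilde), arr.ind = TRUE)
```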