Hi,
I am trying to fit a hierarchical model in stan with a multivariate outcome.
The data has a panel structure. There are N individuals, denoted i. For each i, I observe T periods, denoted t. So, observations are indexed it. Each period t, I observe three univariate outcomes about i that sum to 1: \{y_{it1}, y_{it2}, y_{it3}\}.
The data generating process is:
y_{it1} = \mathbf{x_{it}'\beta_i} + \epsilon_{it1}
y_{it2} = \mathbf{x_{it}'\beta_i} + \epsilon_{it2}
y_{it3} = \mathbf{x_{it}'\beta_i} + \epsilon_{it3}
\epsilon_{itj} \sim_{iid} N(0, \sigma^2)
y_{it1} + y_{it2} + y_{it3} = 1
x_{it} (observed data) is 1 \times K and \beta_i (parameter to be estimated) is K \times 1. I assume \beta_i \sim MVN(\beta, \Sigma), with hyperpriors on \beta and \Sigma.
I have pasted below my current stan code, which uses a non-centered parameterization. My question is: how can I modify this to include the y_{it1} + y_{it2} + y_{it3} = 1 constraint?
data {
int N_obs; // total number of observations
int N_pts; // total number of i's
int K; // number of predictors
int pid[N_obs]; // vector of i id's
matrix[N_obs, K] x; // matrix of predictors
real y[N_obs]; // outcome vector
}
parameters {
matrix[K, N_pts] z_p;
vector<lower=0>[K] sigma_p;
vector[K] beta;
cholesky_factor_corr[K] L_p;
real<lower=0> sigma;
}
transformed parameters {
matrix[K, N_pts] z;
z = diag_pre_multiply(sigma_p, L_p) * z_p;
}
model {
vector[N_obs] mu;
// prior
beta ~ normal(1, 1);
sigma ~ normal(0, 5);
sigma_p ~ normal(0, 5);
L_p ~ lkj_corr_cholesky(2);
to_vector(z_p) ~ normal(0, 1);
// likelihood
for(i in 1:N_obs) {
mu[i] = x[i,] * (beta + z[,pid[i]]);
}
y ~ normal(mu, sigma);
}
Thanks,
Kim