Hi,
I am trying to model a seemingly unrelated regression (SUR) model with censored dependent variables. I also need to include data augmentation in the model. I have started with a model, which is a combination of the example SUR model in Stan manual (chapter 9.15.) and this censored probit example I found in SO:
The answer to that post advises not to use data augmentation for a binary probit as it is unnecessary. I wonder whether the same applies to continuous censored model as well? Here is the data augmentation mapping I need:
if (ystar[K, N] > 0) {
y[K, N] = ystar[K, N]/sum(ystar[N])
}
if (ystar[K, N] <= 0)
y[K, N] = 0
}
Would that be equivalent to that probit solution with Phi() function or do I need to model that explicitly? Thus far I have tried my model with that Phi() version only. However, I am not sure whether I manage to get that right. Here is my Stan model code thus far:
data {
int<lower=1> K;
int<lower=1> J;
int<lower=0> N;
vector[J] x[N];
vector<lower=0, upper=1>[K] y[N];
}
parameters {
matrix[K, J] beta;
cholesky_factor_corr[K] L_Omega;
vector<lower=0>[K] L_sigma;
}
model {
vector[K] mu[N];
matrix[K, K] L_Sigma;
for (n in 1:N) {
mu[n] = beta*x[n];
/* mu[n] = Phi(mu[n]); */ // 1. attempt
/* for (k in 1:K) */ // 2. attempt
/* mu[k, n] = Phi(mu[k, n]); */
}
L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
to_vector(beta) ~ normal(0,5);
L_Omega ~ lkj_corr_cholesky(4);
L_sigma ~ cauchy(0,.25);
y ~ multi_normal_cholesky(mu, L_Sigma);
}
I have had two attempts thus far (see the comments in the code). The first takes way too much time even with a small subset of data, and the second, which my intuition would favor, gives an error of nans found. Do you have any suggestions of how to proceed? I could provide some code to construct fake data if needed.