Hello,
I would like to fit a “simple” linear model with quadratic terms, where some of the predictor values are missing.
To set up the model in Stan, I’m using the following simulated data, kept as simple as possible:
N <- 30
MIS <- 10
X <- rnorm(N, 2, 1)
y <- rnorm(N, X^2, 0.5)
df <- data.frame(
  X = X,  # rows corresponding to I_obs = 0 included for simplicity, but not used in Stan
  I_obs = c(rep(0, MIS), rep(1, N - MIS)),
  y = y
)
It is clear that for each y value there are two compatible X values, and I think the correct posterior distribution for the missing values is in many cases bimodal (each peak centered around \pm\sqrt{y[i]}, with more weight on the positive one, since the observed Xs are centered around 2).
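To make that explicit, for the model below the full conditional of a single missing value is, up to a constant (a sketch of the density, writing \mu_X, \sigma_X, \sigma_Y, b_1, b_2 for muX, sigmaX, sigmaY, b[1], b[2]):

$$
p(X_{mis}[n] \mid y[n], \cdot) \;\propto\; \mathcal{N}(X_{mis}[n] \mid \mu_X, \sigma_X)\;\mathcal{N}\big(y[n] \mid b_1 X_{mis}[n] + b_2 X_{mis}[n]^2,\ \sigma_Y\big),
$$

so with b_1 \approx 0 and b_2 \approx 1 the likelihood factor peaks near \pm\sqrt{y[n]}, while the \mathcal{N}(\mu_X, \sigma_X) factor (with \mu_X pulled towards 2 by the observed Xs) puts more weight on the positive mode.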
In fact, running the following model, for several of the X_mis variables some chains converge around the positive value and some around the negative one, but I don’t think this is the right approach, since I get all kinds of \hat{R} warnings etc.
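For example, this is roughly how I look at the per-chain estimates of the missing values (just a sketch, assuming the rstan fit object is called fit):

draws <- rstan::extract(fit, pars = "X_mis", permuted = FALSE)  # iterations x chains x parameters
apply(draws, c(2, 3), mean)  # per-chain posterior mean of each X_mis element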
Any suggestions on how to tackle this problem?
thanks
Matteo
functions {
  // Combine observed and missing predictor values into one vector:
  // entries with I_obs[n] == 0 are filled from X_mis, in order.
  vector merge(array[] int I_obs, vector X, vector X_mis) {
    int N = dims(X)[1];
    vector[N] X_merge = X;
    int k = 0;
    for (n in 1:N) {
      if (I_obs[n] == 0) {
        k += 1;
        X_merge[n] = X_mis[k];
      }
    }
    return X_merge;
  }

  // Build the [X, X^2] design matrix for the quadratic regression.
  matrix expand(vector X) {
    int N = dims(X)[1];
    matrix[N, 2] X_full;
    X_full[, 1] = X;
    X_full[, 2] = X .* X;
    return X_full;
  }
}
data {
  int<lower=0> N;                          // number of observations
  vector[N] y;
  vector[N] X;                             // predictor; entries with I_obs[n] == 0 are placeholders
  array[N] int<lower=0, upper=1> I_obs;    // 1 = X observed, 0 = X missing
  int<lower=0, upper=1> regress;           // 0 -> no regression, 1 -> regression
}
transformed data {
  int totObs = 0;  // number of observed predictor values
  for (n in 1:N) {
    totObs += I_obs[n];
  }
}
parameters {
  vector[2] b;                   // coefficients for X and X^2
  real muX;                      // mean of the predictor distribution
  real<lower=1e-10> sigmaX;      // sd of the predictor distribution
  vector[N - totObs] X_mis;      // missing predictor values, sampled as parameters
  real<lower=1e-10> sigmaY;      // residual sd
}
model {
  // Priors
  muX ~ normal(0, 1);
  sigmaX ~ normal(0, 1);
  b ~ normal(0, 1);
  sigmaY ~ normal(0, 1);

  // Fill in the missing predictor values and model the full predictor vector.
  vector[N] X_merge = merge(I_obs, X, X_mis);
  X_merge ~ normal(muX, sigmaX);

  if (regress) {
    y ~ normal(expand(X_merge) * b, sigmaY);
  }
}