I am new to Stan, so I apologize in advance for the simplicity of my question. I am trying to understand how Stan handles missing-data imputation, specifically how it merges imputed values into the observed data. I have built a regression model with missing data on the dependent variable (DV); all predictors are fully observed. Conceptually, I understand that I am using parameters in the model to estimate the missing values and then merging these into the observed DV.

```
data {
  int<lower=0> N;
  int anxiety_num_missing;
  real Anxiety[N];
  vector[N] Tx;
  vector[N] Time;
  vector[N] TxT;
  int anxiety_missing[N];
}
parameters {
  real alpha;
  real<lower=0> sigma;
  real bN;
  real bM;
  real bI;
  vector[anxiety_num_missing] anxiety_impute;
  real mu_anxiety;
  real<lower=0> sigma_anxiety;
}
model {
  vector[N] mu;
  vector[N] anxiety_merged;
  alpha ~ normal(0, 10);
  bN ~ normal(0, 10);
  bM ~ normal(0, 10);
  bI ~ normal(0, 10);
  mu_anxiety ~ normal(3.5, 1);
  sigma ~ cauchy(0, 1);
  sigma_anxiety ~ cauchy(0, 1);
  // imputation
  anxiety_merged ~ normal(mu_anxiety, sigma_anxiety);
  // regression
  mu = alpha + bN * Tx + bM * Time + bI * TxT;
  Anxiety ~ normal(mu, sigma);
  // merge missing and observed
  for (i in 1:N) {
    anxiety_merged[i] = Anxiety[i];
    if (anxiety_missing[i] > 0) anxiety_merged = anxiety_impute[anxiety_missing[i]];
  }
}
```

This code throws the following error:

Dimension mismatch in assignment; variable name = anxiety_merged, type = vector; right-hand side type = real.

Illegal statement beginning with non-void expression parsed as

anxiety_merged

Not a legal assignment, sampling, or function statement. Note that

- Assignment statements only allow variables (with optional indexes) on the left;
- Sampling statements allow arbitrary value-denoting expressions on the left.
- Functions used as statements must be declared to have void returns

I understand this to mean that **anxiety_merged** is a vector and cannot be assigned a single real value in the merging step, but I am not sure how to code around it. Any help would be appreciated.
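
For reference, the dimension mismatch appears to come from the line inside the loop that assigns a single real to the whole vector `anxiety_merged` rather than to element `i`. Below is a minimal sketch of how the merge step might be written. It rests on several assumptions that are not confirmed by the original model: that `anxiety_missing[i]` is 0 for observed rows and otherwise an index into `anxiety_impute`, that missing entries of `Anxiety` are passed in as an arbitrary placeholder (Stan data cannot contain NA), and that the regression itself is meant to serve as the imputation model for the DV, so the separate `mu_anxiety` and `sigma_anxiety` parameters are dropped. The merge is also moved ahead of the sampling statement so that `anxiety_merged` is filled in before it is used. The array declarations follow the older syntax used above; recent Stan releases write them as `array[N] real Anxiety;` and `array[N] int anxiety_missing;`.

```
data {
  int<lower=0> N;
  int<lower=0> anxiety_num_missing;
  real Anxiety[N];                 // missing entries hold a placeholder value (assumption)
  vector[N] Tx;
  vector[N] Time;
  vector[N] TxT;
  int<lower=0> anxiety_missing[N]; // 0 if observed, else index into anxiety_impute (assumption)
}
parameters {
  real alpha;
  real bN;
  real bM;
  real bI;
  real<lower=0> sigma;
  vector[anxiety_num_missing] anxiety_impute; // one parameter per missing DV value
}
model {
  vector[N] mu;
  vector[N] anxiety_merged;

  // merge missing and observed values element by element,
  // before anxiety_merged appears in any sampling statement
  for (i in 1:N) {
    if (anxiety_missing[i] > 0) {
      anxiety_merged[i] = anxiety_impute[anxiety_missing[i]];
    } else {
      anxiety_merged[i] = Anxiety[i];
    }
  }

  // priors, unchanged from the model above
  alpha ~ normal(0, 10);
  bN ~ normal(0, 10);
  bM ~ normal(0, 10);
  bI ~ normal(0, 10);
  sigma ~ cauchy(0, 1);

  // regression on the merged outcome; this also acts as the
  // imputation model for the missing DV values (assumption)
  mu = alpha + bN * Tx + bM * Time + bI * TxT;
  anxiety_merged ~ normal(mu, sigma);
}
```

The key change is indexing the left-hand side, `anxiety_merged[i]`, so that a real is assigned to a single element of the vector. Whether the separate `normal(mu_anxiety, sigma_anxiety)` statement should be kept depends on the intended imputation model; keeping both sampling statements on `anxiety_merged` would count each value twice in the likelihood.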