# Missing parameters and priors

Hi all! I am trying to estimate different amounts of missing parameters for different subjects. Since there is not an easy way to work with ragged arrays in Stan, I tried to be creative and came up with the following solution:

``````data {
int T;                                        // Number of time points
int N;                                       // Number of subjects
int<lower=1> T_obs[N];          // Number of time points WITH data
int<lower=0> T_mis[N];          // Number of time points WITHOUT data
int<lower=0> idx_obs[N,T];    // Indices of time points WITH data
int<lower=0> idx_mis[N,T];    // Indices of time points WITHOUT data
}
parameters {
// missed + observed
vector[T] theta_obs[N];
vector[T] theta_mis[N];
}
transformed parameters {
vector[T] theta_pr[N];
vector[T] theta[N];

// Assign missing and observed data
for (n in 1:N) {
theta_pr[n, idx_obs[n,1:T_obs[n]]] = theta_obs[n,1:T_obs[n]];
if (T_mis[n] != 0) {
theta_pr[n, idx_mis[n,1:T_mis[n]]] = theta_mis[n,1:T_mis[n]];
}
}

for (i in 1:N) {
theta[i] = exp(theta_pr[i]);
}

model {
for (n in 1:N) {
for (t in 1:t) {
theta_pr[n,t] ~ normal(mu, sigma);
y[n,t] ~ bernoulli(theta);
}
``````

I don’t put any priors on theta_obs and theta_mis, and I sample only theta_pr and theta. Is it kosher to use theta_mis and theta_obs only for assignment??

Sure you can do that.
Besides that, parameters are sampled.

Ok, thanks for validation!

When the model compiles, I get the following warning:

``````Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
theta_pr[[n, t]] ~ normal(...)
``````

Can I really ignore it if there is no non-linear transform?

Finally, while I get reasonable outputs for theta_pr and theta, theta_mis and theta_obs give me very large Rhat… Any ides why/how to solve this?

What it’s the use for `theta`? It’s just there, but never used.
If you want to put priors, then on `parameters` not `transformed parameters`.

Think about transformed parameters as helping temporary calculations done with parameters and data
and it’s values are reported.

I estimate `theta` with a bernoulli function from behavioral task data and I build a linear Gaussian model over the estimated `theta` parameters - because of some missing data, the missing `theta` values must be estimated with the linear Gaussian model
I re-parametrize `theta` to be the exp of `theta_pr` - `theta = exp(theta_pr)` in the `transformed parameters` block.
Is it completely wrong to put priors on `transformed parameters`? When I try to declare `theta_pr` in the `parameters` block and then assign missing values in the `transformed parameters` block, I get an error

``````Cannot assign to variable outside of declaration block; left-hand-side variable origin=parameter
Illegal statement beginning with non-void expression parsed as
theta_pr[n, idx_obs[n, 1:T_obs[n]]]
Not a legal assignment, sampling, or function statement.  Note that
* Assignment statements only allow variables (with optional indexes) on the left;
* Sampling statements allow arbitrary value-denoting expressions on the left.
* Functions used as statements must be declared to have void returns
``````

I get the same error if I assign missing data in the `parameters` block :(

This implies that you believe that all values from `-Inf` to `Inf` are equally credible for at least the missing values (`theta_obs` might be constained by other parts of the model), which doesn’t seem realistic.

Can you post the full model? I’m having trouble following what you’re trying to achieve here.

I updated my first post to include the model block.

Can you update the `data` block too? For example, `y` is missing.

It’s not wrong. But if you sample from a transformed parameter and the transformation is a non-linear transformation, then you need to add a jacobian adjustment.

You cannot assign something in `parameters` block. This block is only for declaring parameters.

`]]] = ` are closed, but only two open ones `[`. Should be two?

There are many things missing :) the full code is REALLY long, so I tried to keep only the essentials.

But that’s the point - i don’t even estimate full `theta_obs` because of missing data and I use `theta_obs` (and `theta_mis`) only to create equal size arrays that I can assign from to `theta_pr` (obs+mis).
So is it right to put priors on parameters that I am not even estimating?

Right, and I appreciate that this is often best practice for helping volunteers get to the core of the question you’re asking. In this case, I suspect you’re not understanding how to do missing data imputation correctly and want to check more thoroughly so I can possibly prove myself wrong or give you better advice if right.

Am I missing something…? It looks correct to my eyes. Thanks for the other answers!

1 Like

Here: (and thanks!)

``````data {
int T;                              // Number of time points
int N;                              // Number of subjects
int Xdim
int<lower=1> T_obs[N];              // Number of time points WITH data
int<lower=0> T_mis[N];              // Number of time points WITHOUT data
int<lower=0> T_obs[N,T];            // Indices of time points WITH data
int<lower=0> T_mis[N,T];            // Indices of time points WITHOUT data
int<lower=1> P;                  // Number of parameters
int<lower=0> T_max;                 // Max number of trials across subjects across time points
int<lower=0> Tr[N, T];              // Number of trials for each subj for each time points
int y[N, T, T_max];
}

transformed data {
vector[Xdim] Q;
Q = rep_vector(1.0, Xdim);
}

parameters {
vector[Xdim] mu_prior_x;
vector<lower=0>[Xdim] sigma_x;
real<lower=0> sigma_r;

matrix[Xdim, Xdim] A;
matrix[P, Xdim] C;

vector[Xdim] X[N,T];

// missed + observed
vector[T] theta_obs[N];
vector[T] theta_mis[N];
}

transformed parameters {
vector[T] theta_pr[N];
vector[T] theta[N];

for (n in 1:N) {
theta_pr[n, idx_obs[n,1:T_obs[n]]] = theta_obs[n,1:T_obs[n]];
if (T_mis[n] != 0) {
theta_pr[n, idx_mis[n,1:T_mis[n]]] = theta_mis[n,1:T_mis[n]];
}
}

for (i in 1:N) {
theta[i] = exp(theta_pr[i]);
}

}

model {
sigma_v ~ cauchy(0,5);
sigma_x ~ cauchy(0,5);
mu_prior_x ~ normal(0,sigma_x);
sigma_r ~ cauchy(0,5);

for (p1 in 1:Xdim) {
A[p1,] ~ cauchy(0,5);
}
for (p1 in 1:P) {
C[p1,] ~ cauchy(0,5);
}

for (n in 1:N) {
for (t in 1:T) {

if (t == 1) {
X[n,t] ~ normal(mu_prior_x,sigma_v);
}
else {
X[n, t] ~  normal((A * X[n,t-1]), Q);
}

theta_pr[n,t] ~ normal(C*X[n,t],sigma_r);

if (idx_obs[n,t] != 0) {

for (t1 in 1:Tr_nc[n,t]) {
y[n,t,t1] ~ bernoulli(theta[n,t]);
}
}

}
}
}

"""``````

Thanks! I hate to be a pain, but one more thing will help me understand your aims and decide whether I’m on the right track in understanding how to advise: can you post a model that represents what you’d do if you didn’t have any missing data to deal with?

This would be it:

``````data {
int T;                              // Number of time points
int N;                              // Number of subjects
int Xdim;
int<lower=1> P;                  // Number of parameters
int<lower=0> T_max;                 // Max number of trials across subjects across time points
int<lower=0> Tr[N, T];              // Number of trials for each subj for each time points
int y[N, T, T_max];
}

transformed data {
vector[Xdim] Q;
Q = rep_vector(1.0, Xdim);
}

parameters {
vector[Xdim] mu_prior_x;
vector<lower=0>[Xdim] sigma_x;
real<lower=0> sigma_r;
vector[T] theta_pr[N];

matrix[Xdim, Xdim] A;
matrix[P, Xdim] C;

vector[Xdim] X[N,T];
}

transformed parameters {
vector[T] theta[N];

for (i in 1:N) {
theta[i] = exp(theta_pr[i]);
}
}

model {
sigma_v ~ cauchy(0,5);
sigma_x ~ cauchy(0,5);
mu_prior_x ~ normal(0,sigma_x);
sigma_r ~ cauchy(0,5);

for (p1 in 1:Xdim) {
A[p1,] ~ cauchy(0,5);
}
for (p1 in 1:P) {
C[p1,] ~ cauchy(0,5);
}

for (n in 1:N) {
for (t in 1:T) {

if (t == 1) {
X[n,t] ~ normal(mu_prior_x,sigma_v);
}
else {
X[n, t] ~  normal((A * X[n,t-1]), Q);
}

theta_pr[n,t] ~ normal(C*X[n,t],sigma_r);

for (t1 in 1:Tr_nc[n,t]) {
y[n,t,t1] ~ bernoulli(theta[n,t]);
}
}
}
}

"""``````

Awesome, that helps. So, I could be still misunderstanding your model, but I actually don’t think you have any missingness to necessitate inference on at all. Data-wise, you seem to have multiple subjects each of which measured possibly multiple times at each of multiple time points, but with some subjects missing data for some time points. Data are 1s/0s so modelled as bernoulli outcomes with some auto-regressive stuff for the influence of time nested within each participant. Since the data are ragged (i.e. there’s NAs in some entries in `y` thanks to the some participants not having fully `T_max` trials per timepoint), you have what looks to be the beginnings of handling that data structure, but you haven’t actually completed that part yet (or perhaps you have, you just have `Tr_nc` where you should have `Tr` in the model block I think). Have you literally run the no-missingness-imputation model and encountered an issue? Because as it is written, even if a participant is missing data for a given timepoint entirely, the model still expresses a theta_pr for that participant-timepoint combination, and non-missing data plus your priors and the structure of your model will constrain that value.

You understood almost everything correctly! The no-missingness model runs fine; so does the missingness model. In the model with missing data I was not sure if to put priors on `theta_obs` and `theta_mis`, because I only use these parameters only for assignment to `theta_pr` (mis+obs).

I’m saying you don’t need a model with `theta_obs` or `theta_mis` in it at all. Indeed, adding those parameters unnecessarily in the way you have can only introduce error.

Now I am confused. How would I assign missing parameters? Imagine one subject has times points [1 2 3 5] with 4 missing. I must impute the 4th time point.