# Covariance between item means for unbalanced repeated measurements

Hi all,

First off, I’m fairly new to Stan, so the answer to my question may be fairly obvious… to others. I’ve been working on this for a few days now, so I figured it was worth a shot to ask around for some help.

I’ve created a model for estimating two different item scores that each contain repeated measurements. The design is unbalanced, such that a given individual has a different number of measurements for the first and second items. I work with psychophysiological data, so unbalanced measurements across items (trial types) are commonplace after rejecting “bad” measurements with too much noise/artifact.

Right now, this model estimates the within-person standard deviations (sig_err) for each item as well as the between-person standard deviations (sig_id) for each item.

I’ve been trying to build off this model to compute the between-person covariance for the mean item scores (u_1, u_2). I have been pretty unsuccessful. Does anyone have any recommendations about how I can go about computing covariance between u_1 and u_2 in this model?

Any help would be greatly appreciated!

Thanks,
Peter

```
stan_model <- "
data {
  int<lower=1> NOB1; // total observations of first measurement
  int<lower=1> NOB2; // total observations of second measurement
  int<lower=1> NSUB; // total number of subjects

  int<lower=1, upper=NSUB> id1[NOB1]; // grouping id for first meas
  int<lower=1, upper=NSUB> id2[NOB2]; // grouping id for second meas

  vector[NOB1] meas1; // scores for first measurement
  vector[NOB2] meas2; // scores for second measurement
}

parameters {
  vector[2] mu; // intercepts for meas1 and meas2

  vector<lower=0>[2] sig_id;  // between-person std dev for given meas
  vector<lower=0>[2] sig_err; // within-person std dev for given meas

  vector[NSUB] id_raw1; // standardized subject effects for meas1
  vector[NSUB] id_raw2; // standardized subject effects for meas2
}

transformed parameters {
  vector[NSUB] u_1; // predicted meas1 score
  vector[NSUB] u_2; // predicted meas2 score

  u_1 = mu[1] + id_raw1 * sig_id[1];
  u_2 = mu[2] + id_raw2 * sig_id[2];
}

model {
  id_raw1 ~ normal(0, 1);
  id_raw2 ~ normal(0, 1);

  mu ~ normal(0, 10);

  sig_id ~ cauchy(0, 10);
  sig_err ~ cauchy(0, 10);

  meas1 ~ normal(u_1[id1], sig_err[1]);
  meas2 ~ normal(u_2[id2], sig_err[2]);
}
"
```

What do you mean by covariance between `u_1` and `u_2`? Do you want the posterior covariance or do you want to impose some kind of covariance prior?

It’s not quite right to say that `sig_err` is the within-person standard deviation—it’s a scale parameter being estimated by the model. The posterior mean for `sig_err` isn’t guaranteed to be the sample standard deviation, if that’s what you were thinking.

Hi Bob,

Thanks for the response.

I conceptualize u_1 and u_2 as the estimated subject-specific means for each item. Am I not thinking about this correctly? I’m interested in the posterior covariance between u_1 and u_2. I don’t need to impose a covariance prior.

I understand that the posterior mean for sig_err isn’t guaranteed to be the sample standard deviation. Also, I plan on adding more terms to the model that will change the meaning of sig_err. I’m just trying to start with as simple a model as possible.

Thanks again for the help.

There’s no way to directly compute posterior covariance within Stan—the modeling language purely defines random variables. So you need to literally just take the posterior draws and compute the covariance for the variables of interest. Also, there’s MCMC uncertainty in that calculation, which depends on the effective sample size.

We tend to do more scatterplotting and visual inspection of the posterior, as covariance is just a linear summary of the true relationship.
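For example, a minimal post-processing sketch in Python (names and the simulated draws here are purely illustrative—in practice you'd use the arrays of posterior draws extracted from your fit, e.g. via `extract()` in rstan):

```python
import numpy as np

# Stand-in for posterior draws of u_1[i] and u_2[i] for one subject i.
# In real use these would come from the fitted model; here we simulate
# correlated draws just to have something concrete to compute with.
rng = np.random.default_rng(0)
n_draws = 4000
z = rng.standard_normal((n_draws, 2))
u1_draws = 1.0 + z[:, 0]
u2_draws = 2.0 + 0.5 * z[:, 0] + 0.3 * z[:, 1]  # correlated with u1 by construction

# Posterior covariance between u_1 and u_2, computed from the draws:
post_cov = np.cov(u1_draws, u2_draws)[0, 1]

# The same quantity "by hand" via cov[X, Y] = E[X*Y] - E[X]*E[Y]
# (differs from np.cov only by the 1/n vs 1/(n-1) denominator):
manual_cov = np.mean(u1_draws * u2_draws) - np.mean(u1_draws) * np.mean(u2_draws)
print(post_cov, manual_cov)
```

With a reasonable effective sample size the two estimates agree closely; scatterplotting `u1_draws` against `u2_draws` shows the full relationship that this single number summarizes.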

Hm, well I guess that’s why I’ve been so unsuccessful when I’ve tried to compute the posterior covariance. Thank you very much for the clarification, Bob. I appreciate it.

Also, I should’ve mentioned that, given that

```
cov[X, Y] = E[X * Y] - E[X] * E[Y]
```

you can store the value of `X * Y` as a generated quantity and compute everything afterwards using the posterior means (the `E` terms).
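Concretely, one way to sketch that (assuming `u_1` and `u_2` are defined as in the model above—the name `u_prod` is just illustrative) would be to add a generated quantities block:

```stan
generated quantities {
  // Elementwise product of the subject means; its posterior mean
  // estimates E[u_1[i] * u_2[i]] for each subject i.
  vector[NSUB] u_prod = u_1 .* u_2;
}
```

Afterwards, for each subject, the posterior mean of `u_prod` minus the product of the posterior means of `u_1` and `u_2` gives the covariance estimate from the identity above.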

Oh, perfect. Thanks for the recommendation. I haven’t played around with the generated quantities section, so this will give me an excuse to learn a new part of Stan. I’ll go ahead and give that a shot.

Thanks again.