Calculate pointwise log-likelihood in multilevel models with measurement error and identifiability hacks

I am not sure how to calculate the pointwise log-likelihood – required to use loo – for a multilevel model with measurement error and identifiability hacks.

Let’s consider the following varying intercept model, with measurement error on the outcome y, and “postsweeping of random effects” to ensure identifiability of the intercepts (see solution 4 in Ogle & Barber, 2020):

\begin{aligned}
y_{\text{obs}[i]} &\sim \text{Normal}(y_{\text{true}[i]}, y_{\text{err}}) && \text{for $i$ in 1:$N$ (number of observations)}\\
y_{\text{true}[i]} &\sim \text{Normal}(\mu_i, \sigma)\\
\mu_i &= \alpha_0 + \alpha_{\text{group}[i]} + \beta x_i\\
\alpha_{\text{group}} &\sim \text{Normal}(0, \sigma_{\alpha}) && \text{non-identifiable group intercepts} \\
\alpha_0 &\sim \text{Normal}(0, 1) && \text{non-identifiable global intercept}\\
\alpha_{\text{group}}^\star &= \alpha_{\text{group}} - \overline{\alpha}, \text{ where } \overline{\alpha} = \frac{1}{G} \sum_{g = 1}^{G} \alpha_g && \text{identifiable group intercepts}\\
\alpha_0^\star &= \alpha_0 + \overline{\alpha} && \text{identifiable global intercept}\\
\beta &\sim \text{Normal}(0, 1)\\
\sigma_{\alpha}, \sigma &\sim \text{Exponential}(1)\\
\end{aligned}

I have boldly tried the following Stan code, but I don’t really know what I am doing with the _lpdf so I thought I would double-check here…

data {
  int<lower=0> N;                       // number of observations
  int<lower=0> G;                       // number of groups
  array[N] int<lower=1, upper=G> group; // group membership
  vector[N] y_obs;                      // observed outcome
  vector[N] x;                          // predictor
  real<lower=0> y_err;                  // measurement error (sd) on y_obs
}

parameters {
  real alpha0;               // global intercept
  vector[G] alpha;           // group intercepts
  real beta;                 // slope
  real<lower=0> sigma_alpha; // sd of group intercepts
  real<lower=0> sigma;       // sd of true outcome
  vector[N] y_true;          // true outcome
}

model {
  // model for y_true
  alpha ~ normal(0, sigma_alpha); // non-identifiable group intercepts
  // hyper-prior
  sigma_alpha ~ exponential(1);
  // other priors
  alpha0 ~ std_normal();          // non-identifiable global intercept
  beta ~ std_normal();
  sigma ~ exponential(1);
  // likelihood
  y_true ~ normal(alpha0 + alpha[group] + beta * x, sigma);

  // model for y_obs
  y_obs ~ normal(y_true, y_err);
}

generated quantities {
  // identifiable parameters ("postsweeping of random effects", i.e., solution 4 from Ogle & Barber, 2020)
  real alpha_bar = mean(alpha);
  vector[G] alpha_star = alpha - alpha_bar; // identifiable group intercepts
  real alpha0_star = alpha0 + alpha_bar;    // identifiable global intercept

  // posterior predictive distribution for replications y_obs_rep of the original data set y_obs given model parameters
  array[N] real y_true_rep_star = normal_rng(alpha0_star + alpha_star[group] + beta * x, sigma);
  array[N] real y_obs_rep_star = normal_rng(y_true_rep_star, y_err);

  // pointwise log-likelihood
  vector[N] yt;
  vector[N] log_lik;
  for (i in 1:N) {
    yt[i] = normal_lpdf(y_true[i] | alpha0_star + alpha_star[group[i]] + beta * x[i], sigma);
    log_lik[i] = normal_lpdf(y_obs[i] | yt[i], y_err);
  }
}


I think in this case you might initially think to give the pointwise log-likelihood as

normal_lpdf(y_true[i] | alpha0 + alpha[group[i]] + beta * x[i], sigma) +
  normal_lpdf(y_obs[i] | y_true[i], y_err);



But that won’t work well with loo, because this pointwise likelihood depends on the observation-specific parameter y_true: leave-one-out importance sampling tends to fail (high Pareto-k diagnostics) when each observation carries its own latent parameter. Fortunately, we can get rid of the y_true here with

normal_lpdf(y_true[i] | alpha0 + alpha[group[i]] + beta * x[i], sqrt(sigma^2, y_err^2));


I’m pretty sure that’s the pointwise log-likelihood that you want, but maybe @martinmodrak might be kind enough to double-check with his perspective. I don’t think that the post-sweeping matters at all to the likelihood.
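For reference, the marginalization is just the standard normal–normal convolution (integrating the latent y_true out of the joint density):

\begin{aligned}
p(y_{\text{obs}[i]} \mid \mu_i, \sigma, y_{\text{err}}) &= \int \text{Normal}(y_{\text{obs}[i]} \mid y_{\text{true}[i]}, y_{\text{err}}) \, \text{Normal}(y_{\text{true}[i]} \mid \mu_i, \sigma) \, dy_{\text{true}[i]}\\
&= \text{Normal}\!\left(y_{\text{obs}[i]} \,\middle|\, \mu_i, \sqrt{\sigma^2 + y_{\text{err}}^2}\right)
\end{aligned}

where \mu_i = \alpha_0 + \alpha_{\text{group}[i]} + \beta x_i.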


Thanks @jsocolar! Two questions:

1. for your second suggestion, you talk about getting rid of y_true, did you mean normal_lpdf(y_obs[i] | alpha0 ...)?
2. for the standard deviation, did you mean sqrt(sigma^2 + y_err^2)?

Yes to both! Sorry for rushing.
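In case anyone wants to sanity-check the corrected formula numerically, here is a small Monte Carlo sketch in plain Python (not Stan; all numbers are made up for illustration). Averaging the measurement-error density over draws of y_true should match the marginal normal with the variances added:

```python
import math
import random

def normal_pdf(x, mu, sd):
    """Density of Normal(mu, sd) evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

random.seed(1)
mu, sigma, y_err = 0.7, 0.9, 0.4  # arbitrary illustrative values
y_obs = 1.3

# Monte Carlo estimate of p(y_obs | mu) = E_{y_true ~ Normal(mu, sigma)}[ Normal(y_obs | y_true, y_err) ]
draws = 200_000
mc = sum(normal_pdf(y_obs, random.gauss(mu, sigma), y_err) for _ in range(draws)) / draws

# Analytic marginal: normal with the variances added
analytic = normal_pdf(y_obs, mu, math.sqrt(sigma ** 2 + y_err ** 2))

# The two estimates should agree to about three decimal places
print(mc, analytic)
```

This is the density scale; in the Stan generated quantities you would compute the log of this quantity, i.e. normal_lpdf(y_obs[i] | mu_i, sqrt(sigma^2 + y_err^2)).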
