Calculate pointwise log-likelihood in multilevel models with measurement error and identifiability hacks

I am not sure how to calculate the pointwise log-likelihood – required to use loo – for a multilevel model with measurement error and identifiability hacks.

Let’s consider the following varying intercept model, with measurement error on the outcome y, and “postsweeping of random effects” to ensure identifiability of the intercepts (see solution 4 in Ogle & Barber, 2020):

\begin{aligned} y_{\text{obs}[i]} &\sim \text{Normal}(y_{\text{true}[i]}, y_{\text{err}}) && \text{for $i$ in 1:$N$ (number of observations)}\\ y_{\text{true}[i]} &\sim \text{Normal}(\mu, \sigma)\\ \mu_i &= \alpha_0 + \alpha_{\text{group[i]}} + \beta x_i\\ \alpha_\text{group} &\sim \text{Normal}(0, \sigma_{\alpha}) && \text{non-identifiable group intercepts} \\ \alpha_0 &\sim \text{Normal}(0, 1) && \text{non-identifiable global intercept}\\ \alpha_{group}^\star &= \alpha_\text{group} - \overline{\alpha}, \text{where } \overline{\alpha} = \frac{1}{G} \sum_{g = 1}^{G} \alpha_g && \text{identifiable group intercepts}\\ \alpha_0^\star &= \alpha_0 + \overline{\alpha} && \text{identifiable global intercept}\\ \beta &\sim \text{Normal}(0, 1)\\ \sigma_{\alpha}, \sigma &\sim \text{Exponential}(1)\\ \end{aligned}

I have boldly tried the following Stan code, but I don’t really know what I am doing with the _lpdf so I thought I would double-check here…

data {
	int<lower=0> N;              // number of observations
	int<lower=0> G;              // number of groups
	array[N] int<lower=1> group; // vector of groups
	vector[N] y_obs;             // observed outcome
	vector[N] x;                 // predictor
	real y_err;                  // measurement error on y_obs
}

parameters {
	real alpha0;               // global intercept
	vector[G] alpha;           // group intercepts
	real beta;                 // slope
	real<lower=0> sigma_alpha; // sd of group intercepts
	real<lower=0> sigma;       // sd of true outcome
	vector[N] y_true;          // true outcome
}

model {
	
	// model for y_true
	// adaptive prior
	alpha ~ normal(0, sigma_alpha); // non-identifiable group intercepts
	// hyper-prior
	sigma_alpha ~ exponential(1);
	// other priors
	alpha0 ~ std_normal();            // non-identifiable global intercept
	beta ~ std_normal();
	sigma ~ exponential(1);
	// likelihood
	y_true ~ normal(alpha0 + alpha[group] + beta * x, sigma);
	
	// model for y_obs
	y_obs ~ normal(y_true, y_err);

}

generated quantities {

	// identifiable parameters ("postsweeping of random effects", i.e., solution 4 from Ogle & Barber, 2020)
	real alpha_bar = mean(alpha);
	vector[G] alpha_star = alpha - alpha_bar; // identifiable group intercept
	real alpha0_star = alpha0 + alpha_bar;    // identifiable global intercept

	// posterior predictive distribution for replications y_obs_rep of the original data set y_obs given model parameters
	array[N] real y_true_rep_star = normal_rng(alpha0_star + alpha_star[group] + beta * x, sigma);
	array[N] real y_obs_rep_star = normal_rng(y_true_rep_star, y_err);

	// pointwise log-likelihood
    vector[N] yt;	
    vector[N] log_lik;
	for (i in 1:N) {
		yt[i] = normal_lpdf(y_true[i] | alpha0_star + alpha_star[group[i]] + beta * x[i], sigma);
		log_lik[i] = normal_lpdf(y_obs[i] | yt[i], y_err);
	}

}

I think in this case you might initially think to to give the pointwise log likelihood as

normal_lpdf(y_true[i] | alpha0 + alpha[group[i]] + beta * x[i,], sigma) + 
   normal_lpdf(y_obs[i] | y_true[i], y_err);

But that won’t work well with loo because it depends on the observation-specific parameter y_true. Fortunately we can get rid of the y_true here with

normal_lpdf(y_true[i] | alpha0 + alpha[group[i]] + beta * x[i,], sqrt(sigma^2, y_err^2));

I’m pretty sure that’s the pointwise log-likelihood that you want, but maybe @martinmodrak might be kind enough to double-check with his perspective. I don’t think that the post-sweeping matters at all to the likelihood.

1 Like

Thanks @jsocolar! Two questions:

  1. for your second suggestion, you talk about getting rid of y_true, did you mean normal_lpdf(y_obs[i] | alpha0 ...)?
  2. for the standard deviation, did you mean sqrt(sigma^2 + y_err^2)?
1 Like

Yes to both! Sorry for rushing.

1 Like