Hi,

I’m trying to figure out the appropriate log likelihood formulation for a model in which the response is uncertain. Further, I’m using a non-centered parameterization to estimate the “true” value of the response as my parameter estimates using the centered formulation are expressing a form of Neal’s funnel.

The reason I ask is because my impression is that if I have a normal log likelihood, I should calculate that likelihood using the observed response (`y_meas`

), the expected value (`mu`

) and a scaling parameter (`sigma`

), whatever the expected “true” value (`y_true`

) is determined from the model, accounting for measurement uncertainty in the response (`y_noise`

). However, when I look at the distribution of log likelihoods for the results (or try to run them in the `{loo}`

package), I do not get values that make any sense. The only way I get likelihoods that seem reasonable is to compare `y_meas`

to `y_true`

and `y_noise`

.

Below is a toy linear model highlighting what I’m talking about. I have three scenarios. A) how I thought I was supposed to calculate the log likelihood in Stan, B) an alternative that includes the scaling factor on `sigma`

(helps prevent the “funnel”) – may be identical to A, but more explicit, and C) the parameterization that seems to produce the most reasonable results, but I don’t know why.

This is just a toy example (e.g., these data don’t necessitate a non-centered parameterization), but here is some code to produce some data, focusing a linear model for the data generation process:

```
library(rstan)
library(tidybayes)
library(tidyverse)
set.seed(20821)
#Number of samples
n <- 30
#Some explanatory
x <- rnorm(n, mean = 0, sd = 5)
#Intercept
alpha <- 0.5
#Slope
beta <- 2.2
#Relationship uncertainty
epsilon <- rnorm(n = n, mean = 0, sd = 1)
#Perfect fit between response and explanatory
y_perfect <- alpha + beta * x
#Fit after including relationship uncertainty
y_true <- alpha + beta * x + epsilon
#Measurement noise (equivalent to standard error)
y_noise <- rgamma(n, shape = 10)
#Measurement error (and fit uncertainty)
y_meas <- rnorm(n, mean = y_true, sd = y_noise)
#Model data
stan_data_uncertain <- list(N = n,
y_meas = y_meas,
y_noise = y_noise,
x = x)
```

And the Stan code (with the generated quantities block highlighting the three scenarios):

```
data {
int<lower=0> N; //Number of observations
vector[N] y_meas; //Observed response
vector<lower=0>[N] y_noise; //Standard error of response
vector[N] x; //Explanatory matrix
}
parameters {
real alpha; //Intercept
real beta; //Slope
real<lower=0> z_sigma; //Scales non-centered standard deviation
real<lower=0> sigma; //Standard deviation
}
transformed parameters {
vector[N] mu; //Expected response from model
vector[N] y_true; //The "true" response value
mu = alpha + beta * x; //Model
y_true = mu + z_sigma * sigma; //Non-centered estimate of y_true
}
model {
target += std_normal_lpdf(alpha); //Standard normal intercept prior
target += std_normal_lpdf(beta); //Standard normal slope prior
target += std_normal_lpdf(z_sigma); //Non-centered scaling factor for sigma
target += exponential_lpdf(sigma | 1); //Exponential prior for sigma
target += normal_lpdf(y_meas | y_true, y_noise); //Likelihood using latent
//variable and measurement uncertainty
}
generated quantities {
vector[N] pred_A; //Predictions from scenario A
vector[N] pred_B; //Predictions from scenario B
vector[N] pred_C; //Predictions from scenario C
vector[N] log_lik_A; //Log likelihood for scenario A
vector[N] log_lik_B; //Log likelihood for scenario B
vector[N] log_lik_C; //Log likelihood for scenario C
for(n in 1:N) {
pred_A[n] = normal_rng(mu[n], sigma);
pred_B[n] = normal_rng(mu[n], z_sigma * sigma);
pred_C[n] = normal_rng(y_true[n], y_noise[n]);
log_lik_A[n] = normal_lpdf(y_meas[n] | mu[n], sigma);
log_lik_B[n] = normal_lpdf(y_meas[n] | mu[n], z_sigma * sigma);
log_lik_C[n] = normal_lpdf(y_meas[n] | y_true[n], y_noise[n]);
}
}
```

Here is a visualization (and code; Stan model results are stored in `model_samples_uncertain`

) of the log likelihoods (note the different axis scales):

```
##Log likelihoods
gather_draws(model_samples_uncertain, log_lik_A[n], log_lik_B[n], log_lik_C[n]) %>%
ggplot(aes(.value)) +
geom_density() +
labs(x = NULL) +
facet_wrap(~ .variable, scales = "free", strip.position = "bottom") +
theme_bw() +
theme(strip.background = element_blank(),
strip.placement = "outside",
panel.grid.minor = element_blank()) +
scale_fill_brewer(palette = "Set1")
```

The first two scenarios seem pretty extreme, while the scenario that estimates the “true” response value and predetermined uncertainty look reasonable. Just to help confirm these whacky values, below are the predictions across scenarios, compared to the measured response values.

```
##Response predictions
gather_draws(model_samples_uncertain, pred_A[n], pred_B[n], pred_C[n]) %>%
group_by(n, .variable) %>%
summarise(y_pred = quantile(.value, probs = 0.5),
y_lo = quantile(.value, probs = 0.05),
y_hi = quantile(.value, probs = 0.95)) %>%
full_join(tibble(y_meas) %>%
mutate(n = 1:n())) %>%
ggplot(aes(y_pred, y_meas)) +
geom_abline(intercept = 0, slope = 1) +
geom_errorbar(aes(ymin = y_lo, ymax = y_hi)) +
geom_point(size = 3, shape = 21, fill = "gray", color = "black") +
labs(y = "Data generating response",
x = "Predicted response") +
facet_wrap(~ .variable) +
theme_bw() +
theme(strip.background = element_blank())
```

To be clear, the x-axis represents the predicted values from the three different scenarios, the y-axis is the data used to fit the model, the vertical lines are the 90th percent credible intervals, and the lines bisecting the plots are the 1:1 line. I’m not actually sure why the model is so confident about the results in the first two scenarios, but if we compare them to the “prefect” response values (no relationship or measurement uncertainties), the results actually make a lot of sense:

```
gather_draws(model_samples_uncertain, pred_A[n], pred_B[n], pred_C[n]) %>%
group_by(n, .variable) %>%
summarise(y_pred = quantile(.value, probs = 0.5),
y_lo = quantile(.value, probs = 0.05),
y_hi = quantile(.value, probs = 0.95)) %>%
full_join(tibble(y_perfect) %>%
mutate(n = 1:n())) %>%
ggplot(aes(y_pred, y_perfect)) +
geom_abline(intercept = 0, slope = 1) +
geom_errorbar(aes(ymin = y_lo, ymax = y_hi)) +
geom_point(size = 3, shape = 21, fill = "gray", color = "black") +
labs(y = "Data generating response",
x = "Predicted response") +
facet_wrap(~ .variable) +
theme_bw() +
theme(strip.background = element_blank())
```

So, I’m basically confused as to what I’m supposed to be comparing to what now.

Does anyone know how one is actually supposed to calculate the log likelihoods for a model like this?