# Calculate log-predictive density

Hi.

I am trying to calculate the log-predictive density for my predictive distributions. Before I have calculated the RMSE and the percentiles to evaluate my model, but I would also like to calculate the log-predictive density. But I do not know how I should specify it and where to put it. I have the following code for my stan model and my for loop

``````write("// Stan model for simple linear regression

data {
int < lower = 1 > N; // Sample size
// real<lower=0> nu;
vector[N] x1; // Predictor
vector[N] x2; // Predictor
vector[N] y; // Outcome

real x1_new;
real x2_new;
}

parameters {
real beta1; // Slope (regression coefficients)
real beta2; // Slope (regression coefficients)
real <lower = 0> sigma; // Error SD
real <lower = 0> nu_raw;
}

transformed parameters {
real < lower = 2 > nu;
nu = nu_raw + 2;
}

model {
nu_raw ~ gamma(2, 0.1);
y ~ student_t(nu, x1 + x2 * beta2, sigma);
}
generated quantities {
real y_new;
y_new = student_t_rng(nu, x1_new + x2_new * beta2, sigma);
}
// The posterior predictive distribution",

"stan_model1.stan")

stanc("stan_model1.stan")

stan_model1 <- "stan_model1.stan"

fit <- stan(file = stan_model1, data = stan_data, warmup = 2000, iter = 4000, chains = 4, cores = 2, thin = 1)

fit

kvaobs <- numeric(length(y))
kva_val95 <- numeric(length(y))
kva_val5 <- numeric(length(y))
rmse_val <- numeric(length(y))
for(i in 1:length(x1)){
x1_new <- x1[i]
x1_loo <- x1[-i]
x2_new <- x2[i]
x2_loo <- x2[-i]
y_loo <- y[-i]
stan_data <- list(N = length(y_loo), x1 = x1_loo, x1_new = x1_new, x2 = x2_loo, x2_new = x2_new, y = y_loo, nu = nu)

# Modell
mod <- stan(file = "stan_model1.stan", data = stan_data)
# Get predictive distribution
y_pred <- extract(object = mod, pars = "y_new")\$y_new

E_predicted <- mean(y_pred)

# RMSE
rmse_val[i] <- (E_predicted - y[i])^2

#Percentiler
kva_val5[i] <- quantile(y_pred, probs = 0.05)
kva_val95[i] <- quantile(y_pred, probs = 0.95)
kvaobs[i] <- ecdf(y_pred)(y[i])

}

``````

Do I need to specify a new stan model and log the model? I have tried with a couple of things without success and therefore I would appreciate if someone have any advice how to solve this.

2 Likes

Hi,
the log-predictive density can be calculated by computing the density of the likelihood at your observed data given the parameters in the individual draws. You can compute it within your stan program as discussed e.g. at Writing Stan programs for use with the loo package • loo but you can also compute it in R - conceptually the R code would be something like:

``````for all samples s {
for all data n {
mu = x1[n] + x2[n] * beta2[s]
log_lik[s, y] = dt(x = (y[n] - mu) / sigma[s], df = nu[s],  log = TRUE)
}
}
``````

(where I am scaling and shifting the posterior manuallz by `sigma` because R doesn’t have the scaled/shifted t distribution). Please check it is correct, I do make mistakes :-)

Best of luck with your model!

Great! Thanks for the input. I solved it a couple of days ago with the following code. My stan model:

``````write("// Stan model for simple linear regression

data {
int < lower = 1 > N; // Sample size
// real<lower=0> nu;
vector[N] x1; // Predictor
vector[N] x2; // Predictor
vector[N] y; // Outcome

real x1_new;
real x2_new;

real y_new;
}

parameters {
real beta1; // Slope (regression coefficients)
real <lower = 0> sigma; // Error SD
real <lower = 0> nu_raw;

}

transformed parameters {
real < lower = 2 > nu;
nu = nu_raw + 2;
}

model {
nu_raw ~ gamma(2, 0.1);
y ~ student_t(nu, x1 + x2 * beta1, sigma);
}
generated quantities {
real log_lik;
real y_pred;
y_pred = student_t_rng(nu, x1_new + x2_new * beta1, sigma);
log_lik = student_t_lpdf(y_new |nu, x1_new + x2_new * beta1, sigma);
}
// The posterior predictive distribution",

"stan_model1.stan")
``````

And my loop for lpd, I have excluded the code for the other calculations.

``````for(i in 1:length(x1)){
x1_new <- x1[i]
x1_loo <- x1[-i]
x2_new <- x2[i]
x2_loo <- x2[-i]
y_new <- y[i]
y_loo <- y[-i]
stan_data <- list(N = length(y_loo), x1 = x1_loo, x1_new = x1_new, x2 = x2_loo, x2_new = x2_new,
y = y_loo, y_new = y_new, nu = nu)

# Modell
mod <- stan(file = "stan_model1.stan", data = stan_data)
#lpd
ll <- extract(object = mod, pars = "log_lik")\$log_lik
E_ll <- mean(ll)
lpd[i] <- E_ll

}

``````
1 Like