# Aic & bic

Can i theoretically calculate the aic & bic from log_likelihood in the generated quantities? (apart from loo &waic)

# load libraries
library(rstan)
library(loo)
# Liverpool's Data Premier League scores 2018-2019
liverpool = c(4,2,1,2,2,3,1,0,1,4,1,2,3,1,3,4,3,2,4,5,1,1,4,1,1,3,0,5,0,4,2,2,3,2,2,5,3,2)
length(liverpool)
mean(liverpool)
var(liverpool)

# estimating Gamma parameters given the mean and the var of the data
estGammaParams <- function(mu, var) {
alpha <- mu^2 / var
beta <-  mu  /  var
return(params = list(alpha = alpha, beta = beta))
}
# gamma parameters to pass in gamma prior
estGammaParams(2.25,1.125)

# 1st model of Poisson-Gamma Model
liv ="data {
int N;
int y[N];
}parameters {
real<lower=0> lambda;
}model {
lambda ~ gamma(4.5,2);
y ~ poisson(lambda);
}
generated quantities {
vector[N] log_lik;
for (n in 1:N) {
log_lik[n] = poisson_lpmf(y[N] | lambda);
}} "

# data as a list ALWAYS
stan.dat <- list(N = length(liverpool),y = liverpool)

# fit the sampling model to HMC
fit <- stan(model_code  = liv,
data       = stan.dat,
iter       = 2000,
chains     = 4,
seed       = 12345,

# print posterior mean of lambda parameter
print(fit,pars = "lambda")

# plot parameter
plot(fit,pars = "lambda")

# extract log_lik from posterior samples
log_lik_1 <- extract_log_lik(fit,
parameter_name = "log_lik",
merge_chains = FALSE)

N = length(liverpool)
k = 1
deviance = -2*sum(log_lik_1);deviance
Aic = deviance + 2 * k ;Aic
Bic = deviance + log(N) * k;Bic

r_eff1 <- relative_eff(exp(log_lik_1))
loo_1 <- loo(log_lik_1, r_eff = r_eff1, cores = 4)

print(loo_1)
plot(loo_1)


Hi Nikos!

AIC (and BIC) are not Bayesian quantities. Taken from wikipedia:

Suppose that we have a statistical model of some data. Let k be the number of estimated parameters in the model. Let \hat{L} be the maximum value of the likelihood function for the model. Then the AIC value of the model is the following.[1][2]

\text{AIC} = 2k-2\log(\hat{L})

…as you see, AIC relies on the maximum likelihood estimator. From a Bayesian view the maximum of the likelihood function is not that interesting, as we work with the whole posterior distribution. For model assessment you can use WAIC, or better yet LOO – which you already use! :)

What do you want to use AIC or BIC for? Comparisons with MLE methods?

3 Likes

I agree that BIC is not properly Bayesian. AIC can be viewed as an approximation to a Bayesian calculation; see here:
http://www.stat.columbia.edu/~gelman/research/published/waic_understand3.pdf
and our followup paper here:
http://www.stat.columbia.edu/~gelman/research/published/loo_stan.pdf

2 Likes

thank you professor for your reply…so it make sense to evaluate AIC and DIC from the posterior samples?
i am aware of waic and loo criteria, my concern is about the other common criteria of penalised likelihood.

You can do them all and compare. I just follow Aki Vehtari’s advice and think of all these methods (except for BIC) as estimates of out of sample prediction error. I don’t really like BIC for any reason (see various discussions online and in my books for more on this).

once again i am grateful.thank you professor

You need to know the number of parameters k and the MLE estimate for your model to be able to calculate AIC. MLE is just \arg\max_\theta \{P(X|\theta)\}. It’s not obvious to me how you may do that from generated quantities, but you could use the output of your Stan model (presumably sampled using MCMC), to set the starting values and run an optimizing on the same model. That will get you MAP which is just \arg\max_\theta \sum_i{log (P(x_i\mid \theta))} + log(P(\theta)) (given a constant prior) which is \arg\max_\theta \sum_i{log (P(x_i\mid \theta))} + const which is MLE + const.

So then you can either use MAP instead of MLE or correct remove the const to get the real MLE as you wish. Either way I think it should get you what you want.