# WAIC for any stanfit object

For a stanfit object `fit`, I use the following code to calculate WAIC:

```r
log_lik <- extract(fit)$lp__
# lppd <- sum(log(colMeans(exp(log_lik))))
lppd <- sum(log(mean(exp(log_lik))))
p_waic <- sum(stats::var(log_lik))
waic <- -2*lppd + 2*p_waic
```

where `fit` is created with the `target +=` formulation in a Stan file. Is this correct? Or is there an existing function to calculate WAIC for any stanfit object?

Once you have your `log_lik` matrix, you can call `loo::waic(log_lik)`, and the loo package will do the work for you.

Thank you, I can do that. (Something may be wrong, though, since `NA` appears.)

```r
> class(fitt)
[1] "stanfit"
attr(,"package")
[1] "rstan"
> log_lik1 <- extract_log_lik(fitt, parameter_name = "A")
> waic1 <- loo::waic(log_lik1)
> waic1

Computed from 889 by 1 log-likelihood matrix

          Estimate SE
elpd_waic      0.5 NA
p_waic         0.0 NA
waic          -1.1 NA
```

Is the log-likelihood computed in the Stan model? If so, then the corresponding parameter name should be passed to `extract_log_lik` (a name such as `A` is not particularly informative). If not, you need to compute it in the generated quantities block, with something like this if it's a linear regression model (adapt it for different models and for the variable names you use):

```stan
generated quantities {
  vector[N] log_lik;
  for (n in 1:N) {
    log_lik[n] = normal_lpdf(yt[n] | X[n] * beta, sigma);
  }
}
```

Thank you for pointing out my misunderstanding. I was wrong. I understand that I should pass the log-likelihood as the variable `parameter_name` in the function `loo::extract_log_lik()`, namely \{\log (f(y|\theta_i)\pi(\theta_i)) + \text{const};\ i=1,2,...,n\}, where \{\theta_1,\theta_2,...,\theta_n\} is a collection of posterior MCMC samples, f is the likelihood, \pi is the prior, and y is a data set.
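As an aside, the by-hand WAIC at the top of the thread treats `lp__` as if it were a draws-by-observations matrix. With a proper S-by-N `log_lik` matrix (S posterior draws, N observations), a sketch of the by-hand computation would be:

```r
# Sketch only: assumes log_lik is an S-by-N matrix of pointwise
# log-likelihood values (S posterior draws, N observations).
lppd   <- sum(log(colMeans(exp(log_lik))))  # log pointwise predictive density
p_waic <- sum(apply(log_lik, 2, var))       # sum of pointwise posterior variances
waic   <- -2 * lppd + 2 * p_waic
```

This should agree with `loo::waic` up to numerical details (loo uses a numerically stabler log-sum-exp internally).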
In my model, the dataset is a collection of non-negative integers y=\{(H_c,F_c) \in \mathbb{N} \times \mathbb{N};\ c=1,2,...,5\}, distributed as H_c \sim \text{Binomial}(N, p_c(\theta))\\ F_c \sim \text{Poisson}(q_c(\theta)) Thus, in my case, the log-likelihood will be \sum_c \text{binomial\_lpmf}(H_c\,|\,N, p_c(\theta)) + \sum_c \text{poisson\_lpmf}(F_c\,|\,q_c(\theta)) (+ const?). But I think the Stan function `sampling()` already calculates such a log likelihood, and we can extract it by `extract(fit)$lp__`.

Do I need to write code in a generated quantities block to calculate the log likelihood for each posterior MCMC sample?

Is it wrong to use `extract(fit)$lp__` as the variable `parameter_name` in the function `extract_log_lik`?

`lp__` is the log-posterior density up to a constant, and by extracting it you get one value per MCMC sample. But what you want is the log-likelihood of each observation at each MCMC sample; that's why you need to compute it by hand.

I think your log-likelihood formulation is correct; now you need to express the maths in the Stan model (all constants will be computed correctly when using the `normal_lpdf` function and friends).
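For the binomial/Poisson model described earlier in the thread, the generated quantities block might be sketched as follows (the data names `H`, `F`, the trial count `N`, and the transformed quantities `p`, `q` come from the poster's description; the exact layout is an assumption):

```stan
generated quantities {
  // One pointwise log-likelihood entry per observed value:
  // 5 binomial terms for H, then 5 Poisson terms for F.
  vector[10] log_lik;
  for (c in 1:5) {
    log_lik[c]     = binomial_lpmf(H[c] | N, p[c]);
    log_lik[5 + c] = poisson_lpmf(F[c] | q[c]);
  }
}
```

Then `extract_log_lik(fit, parameter_name = "log_lik")` yields the draws-by-10 matrix that `loo::waic` expects.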


I understand that `lp__` retains the summed log likelihood values \{\sum_i L(y_i|\theta_k)\}_{k=1,2,...}, but code in the generated quantities block should be written to provide \{L(y_i|\theta_j)\}_{i,\,j=1,2,...} for each data point y_i and each MCMC sample \theta_j.

And "computed correctly" means using the `target +=` form, as follows:

```stan
target += binomial_lpmf(h[n] |   ,   );
target += poisson_lpmf(f[n] |   );
```

rather than the sampling-statement form, which drops the normalizing constants:

```stan
h[n] ~ binomial(   );
```
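Putting the thread together, the full workflow might look like this sketch (`model` and `stan_data` are hypothetical placeholders for your compiled stanmodel and data list; the Stan program is assumed to define `log_lik` in its generated quantities block):

```r
library(rstan)
library(loo)

# Hypothetical objects: 'model' is a compiled stanmodel, 'stan_data' its data list.
fit <- sampling(model, data = stan_data)

# Extract the pointwise log-likelihood computed in generated quantities,
# then let loo do the WAIC bookkeeping.
log_lik <- extract_log_lik(fit, parameter_name = "log_lik")
waic1 <- waic(log_lik)
print(waic1)
```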