How can I acquire the log likelihood from "loo"?

hawe66 · June 16, 2021, 2:29pm

I computed log_lik in the generated quantities block,
and then ran loo and waic on it with

LL_2 <- extract_log_lik(output, parameter_name = "log_lik", merge_chains = TRUE)
loo_2 <- loo(LL_2, save_psis = TRUE, cores = 4)
waic_2 <- waic(LL_2)  # waic() has no save_psis or cores arguments

Anyway, I’m curious whether elpd_loo and elpd_waic are the same as the conventional log-likelihood…
They produced the same value, but not the value I expected (it is so close to zero).
Are they the real log-likelihood, or do I have to rescale them somehow?

jonah · June 16, 2021, 10:03pm

Can you clarify a bit more? The real log-likelihood values are the ones you computed in your Stan program and extracted, but I’m guessing I’m not understanding what you’re actually asking.

jsocolar · June 17, 2021, 3:55am

As datasets get large, likelihoods get really small. For a really simple example, consider the Bernoulli likelihood for p = 0.5. With one data point, that likelihood will be 0.5. With 2 data points, the likelihood will be .25. With N data points, the likelihood will be 2^{-N}. By the time you get to even modest-size datasets, the likelihood gets vanishingly small. In fact, this is part of the motivation for working with log-likelihoods instead of likelihoods; the likelihood itself tends to underflow to zero. You can expect likelihoods close to zero for models with more than a handful of data points.
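
A quick way to see the underflow described above is to evaluate the Bernoulli example numerically. This is only an illustrative sketch in base R; the sample sizes are arbitrary choices and are not taken from the thread.

```r
# Bernoulli likelihood with p = 0.5: each data point contributes a factor of 0.5.
n <- c(1, 2, 10, 100, 1000, 2000)   # arbitrary sample sizes for illustration

lik     <- 0.5^n          # raw likelihood, 2^(-N)
log_lik <- n * log(0.5)   # log-likelihood, N * log(0.5)

# The raw likelihood underflows to exactly 0 in double precision for large N,
# while the log-likelihood remains an ordinary finite number.
print(data.frame(n = n, likelihood = lik, log_likelihood = log_lik))
```

The likelihood column shrinks toward zero as N grows, whereas the log-likelihood just becomes a larger negative number.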

hawe66 · June 17, 2021, 2:10pm

Sorry for the unclear explanation.
I declared vector[25] log_lik and got a log_lik array of size [1:6000,1:25] after MCMC sampling.
Then I extracted the log-likelihood with the loo package using this code.

output = stan("steak.stan", data = dataList, 
              control = list(adapt_delta = 0.99, max_treedepth = 15),
              pars = c("w", "pi", "lambda", "alpha", "log_lik"), init_r=5,
              iter = 6000, warmup=1000, thin=3, chains=4, cores=4)

LL_2 <- extract_log_lik(output, parameter_name = "log_lik", merge_chains = TRUE)

loo_2 <- loo(LL_2, save_psis = TRUE, cores = 4)

print(loo_2)

But my question is about the output of print(loo_2):

Computed from 6668 by 25 log-likelihood matrix

         Estimate  SE
elpd_loo    -10.5 1.4
p_loo         0.2 0.0
looic        21.0 2.9
------
Monte Carlo SE of elpd_loo is 0.0.

Based on prior research, the expected log likelihood is above 1000.
But the estimated elpd_loo was -10.5. Is this an appropriate value, or did I do something wrong?

jsocolar · June 17, 2021, 2:18pm

elpd_loo is not the same thing as the fitted log likelihood. See the LOO package glossary (loo-glossary) in the loo documentation.
But also, it’s very difficult for me to imagine a dataset that yields a likelihood of e^{1000}. I’m not sure what you mean by “the prior research”, but are you sure this is a reliable number?

Edit: I guess this is possible if the likelihood functions are really strongly peaked.

hawe66 · June 17, 2021, 2:29pm

Thanks for the kind explanation!

Actually I was trying to analyze an open dataset, and the BIC score from the original article was around 6400. This is the sum of the BIC scores for 25 participants’ data, and each participant did 180 trials. (So it is a sum over 25 * 180 = 4500 log-likelihoods.)

Anyway, if elpd_loo doesn’t actually mean the fitted log-likelihood, then how can I compute the fitted log-likelihood with loo package?

jsocolar · June 17, 2021, 2:31pm

The BIC uses the negative log likelihood.

hawe66 · June 17, 2021, 2:33pm

Oh I just didn’t mention the sign. Their BIC score was -6149.

jsocolar · June 17, 2021, 2:43pm

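Take the row sums of the matrix returned by loo::extract_log_lik(..., merge_chains=TRUE), e.g. with rowSums(). Each row is one posterior draw, so this gives the total log likelihood for each draw. This assumes of course that the log likelihood is correctly computed in your Stan model itself.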
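
The row-sum step can be sketched as follows. To keep the snippet runnable without a fitted model, LL here is a simulated stand-in for the draws-by-observations matrix that extract_log_lik() returns; the dimensions match the thread, but the values (and the names LL and total_ll) are made up for illustration.

```r
set.seed(1)
n_draws <- 6668   # posterior draws (rows), as in the printed loo output
n_obs   <- 25     # entries of log_lik (columns)

# Simulated stand-in for extract_log_lik(output, merge_chains = TRUE).
# With a real fit, use that matrix instead of these random values.
LL <- matrix(rnorm(n_draws * n_obs, mean = -0.4, sd = 0.05),
             nrow = n_draws, ncol = n_obs)

# Total log-likelihood of the data for each posterior draw
total_ll <- rowSums(LL)

length(total_ll)                           # one value per draw
mean(total_ll)                             # a "typical" fitted log-likelihood
quantile(total_ll, c(0.025, 0.5, 0.975))   # spread across draws
```

The result is a distribution of log-likelihood values, one per draw, rather than a single number.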

jsocolar · June 17, 2021, 2:51pm

Note also that the BIC uses the maximum value of the likelihood function, whereas when fitting the model you’ll get the “typical” value of the likelihood function. These can be rather different.
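
To illustrate the gap between the maximized and the "typical" log likelihood, here is a small sketch on simulated Bernoulli data. Everything in it is an assumption made for illustration: the data, the flat Beta(1, 1) prior, and the single-parameter model have nothing to do with the model in this thread. The BIC line uses the standard definition k * log(n) - 2 * (maximized log likelihood).

```r
set.seed(42)
y <- rbinom(180, size = 1, prob = 0.7)   # simulated data: 180 binary trials
n <- length(y)
s <- sum(y)

# Log-likelihood of the data as a function of the success probability p
loglik <- function(p) sum(dbinom(y, size = 1, prob = p, log = TRUE))

# Maximum-likelihood side: this is what the BIC is built on
p_hat  <- s / n
ll_max <- loglik(p_hat)
bic    <- 1 * log(n) - 2 * ll_max        # k = 1 free parameter

# Bayesian side: exact posterior draws under a flat Beta(1, 1) prior
p_draws  <- rbeta(4000, 1 + s, 1 + n - s)
ll_draws <- vapply(p_draws, loglik, numeric(1))

c(max_loglik = ll_max, mean_posterior_loglik = mean(ll_draws), BIC = bic)
```

No posterior draw can exceed the maximized value, so the average over draws sits below it. In models with many parameters the gap is generally larger.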

hawe66 · June 17, 2021, 3:00pm

Thanks a lot! I learned a lot from your replies!!! Have a really really nice day :)
