Log likelihood > 0 for stanreg objects in rstanarm

I have several models and applied waic/loo for model comparison.
I noticed that sometimes elpd_waic > 0 and therefore waic < 0.
If my understanding is correct, shouldn't waic and looic be > 0?
So I traced back and found that the log-likelihood matrix looks kind of weird.
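For reference, this is roughly how I pull out the log-likelihood matrix and compute waic (a sketch; fit1 stands for one of my fitted models):

library(rstanarm)
library(loo)

ll <- log_lik(fit1)   # pointwise log-likelihood matrix: draws x observations (1000 x 77 here)
waic(ll)              # waic computed from that matrix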
Here's some info on the model:

Model Info:

 function:     stan_mvmer
 formula (y1): Y ~ month + I(month^2) + (month | id)
 family  (y1): gaussian [identity]
 algorithm:    sampling
 priors:       see help('prior_summary')
 sample:       1000 (posterior sample size)
 num obs:      77 (y1)
 groups:       id (25)
 runtime:      <0.1 mins

Estimates:
                                          mean    sd      2.5%    25%     50%     75%     97.5%
y1|(Intercept)                            0.067   0.030   0.009   0.049   0.067   0.086   0.125
y1|month                                  0.035   0.052  -0.070   0.001   0.036   0.071   0.139
y1|I(month^2)                             0.022   0.017  -0.012   0.011   0.022   0.034   0.056
y1|sigma                                  0.145   0.014   0.121   0.136   0.145   0.153   0.176
y1|mean_PPD                               0.180   0.024   0.128   0.165   0.179   0.196   0.224
Sigma[id:y1|(Intercept),y1|(Intercept)]   0.003   0.004   0.000   0.000   0.002   0.005   0.015
Sigma[id:y1|month,y1|(Intercept)]         0.001   0.002  -0.004   0.000   0.000   0.002   0.005
Sigma[id:y1|month,y1|month]               0.009   0.005   0.003   0.006   0.008   0.011   0.021
log-posterior                           -50.256   7.200 -65.446 -54.853 -50.005 -45.344 -37.229

Diagnostics:
                                        mcse  Rhat  n_eff
y1|(Intercept)                          0.001 1.000 1079 
y1|month                                0.002 1.001  639 
y1|I(month^2)                           0.001 1.001  782 
y1|sigma                                0.000 0.999  838 
y1|mean_PPD                             0.001 1.000  889 
Sigma[id:y1|(Intercept),y1|(Intercept)] 0.000 1.001  454 
Sigma[id:y1|month,y1|(Intercept)]       0.000 1.007  114 
Sigma[id:y1|month,y1|month]             0.000 1.007  311 
log-posterior                           0.533 1.004  183 

I ran 2 chains with 1,000 iterations for each chain.
Here's the histogram of the log likelihood of the fitted model:
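(Roughly how the histogram is produced, reusing fit1 from above:)

ll <- log_lik(fit1)   # 1000 draws x 77 observations
hist(as.vector(ll), main = "Pointwise log likelihood", xlab = "log likelihood")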

I know it is possible for the log likelihood to be > 0, but this looks weird…
I don't know the exact reason, but I tried multiplying Y by 10.
Before, the range of Y was [0.05, 1.14]; now it is [0.5, 11.4].
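Roughly what I did (a sketch; dat is my data frame, and the other settings match the first fit):

dat$Y <- dat$Y * 10   # rescale the outcome
fit1_rescaled <- stan_mvmer(Y ~ month + I(month^2) + (month | id),
                            data = dat, chains = 2, iter = 1000)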
The model info for the rescaled fit:

Model Info:

 function:     stan_mvmer
 formula (y1): Y ~ month + I(month^2) + (month | id)
 family  (y1): gaussian [identity]
 algorithm:    sampling
 priors:       see help('prior_summary')
 sample:       1000 (posterior sample size)
 num obs:      77 (y1)
 groups:       id (25)
 runtime:      <0.1 mins

Estimates:
                                          mean     sd       2.5%     25%      50%      75%      97.5% 
y1|(Intercept)                             0.663    0.304    0.071    0.471    0.661    0.862    1.271
y1|month                                   0.369    0.541   -0.670    0.014    0.381    0.750    1.394
y1|I(month^2)                              0.218    0.190   -0.149    0.091    0.212    0.347    0.597
y1|sigma                                   1.456    0.145    1.219    1.346    1.444    1.551    1.775
y1|mean_PPD                                1.794    0.231    1.368    1.641    1.805    1.936    2.273
Sigma[id:y1|(Intercept),y1|(Intercept)]    0.303    0.380    0.000    0.036    0.152    0.424    1.331
Sigma[id:y1|month,y1|(Intercept)]          0.072    0.240   -0.477   -0.033    0.067    0.199    0.551
Sigma[id:y1|month,y1|month]                0.895    0.463    0.259    0.576    0.810    1.106    2.086
log-posterior                           -230.414    7.409 -245.770 -235.059 -230.053 -225.311 -216.799

Diagnostics:
                                        mcse  Rhat  n_eff
y1|(Intercept)                          0.010 1.001  900 
y1|month                                0.022 1.002  594 
y1|I(month^2)                           0.007 1.001  691 
y1|sigma                                0.006 1.007  499 
y1|mean_PPD                             0.007 0.999 1180 
Sigma[id:y1|(Intercept),y1|(Intercept)] 0.019 1.009  384 
Sigma[id:y1|month,y1|(Intercept)]       0.023 1.019  107 
Sigma[id:y1|month,y1|month]             0.029 1.010  250 
log-posterior                           0.651 1.024  130 

And here is the log-likelihood histogram:

Now the values look more typical, I guess…
Is this multiplication, Y * 10, the right way to fix this problem?
And are the weird values of elpd_waic and waic caused by this problem?
Any help would be appreciated!

BTW, 0.05 is a kind of threshold value.
Of the 77 observations, 49 are exactly 0.05.
The Y of some subjects increases over time, but for other subjects Y just stays at 0.05.

Good question. If the likelihood is between 0 and 1 then the log-likelihood will be negative, but the likelihood is not constrained to be less than 1 so it’s possible for the log-likelihood to be positive. This means that the ELPD computed by loo can also be positive or negative.

There's a bit more info in the first entry here (it refers to elpd_loo but the same is true for elpd_waic):

elpd_loo is the Bayesian LOO estimate of the expected log pointwise predictive density (Eq 4 in VGG2017) and is a sum of N individual pointwise log predictive densities. Probability densities can be smaller or larger than 1, and thus log predictive densities can be negative or positive.
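As a concrete illustration (using the residual SDs from your two fits, roughly 0.145 and 1.45): a Gaussian density with a small sd exceeds 1 near its mean, so the pointwise log density is positive there:

dnorm(0, mean = 0, sd = 0.145)              # ~2.75, a density greater than 1
dnorm(0, mean = 0, sd = 0.145, log = TRUE)  # ~1.01, so the log density is positive
dnorm(0, mean = 0, sd = 1.45,  log = TRUE)  # ~-1.29, negative after rescaling Y by 10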


Thanks, I see. I have one more question regarding the Pareto k diagnostics.
This dataset contains 25 subjects and 77 observations. I selected 24 subjects (73 observations) as the training dataset and left one subject (4 observations) out as holdout data.
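Roughly how I set this up (a sketch; dat is the full data frame and holdout_id is the left-out subject's id, names I made up here):

train_data   <- subset(dat, id != holdout_id)   # 24 subjects, 73 observations
holdout_data <- subset(dat, id == holdout_id)   # 1 subject, 4 observations

fit_train <- stan_mvmer(Y ~ month + I(month^2) + (month | id),
                        data = train_data, chains = 2, iter = 1000)
loo(fit_train)   # prints the Pareto k diagnostics below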
For the model fitted on the training subjects:

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     45    61.6%   210       
 (0.5, 0.7]   (ok)       22    30.1%   171       
   (0.7, 1]   (bad)       5     6.8%   12        
   (1, Inf)   (very bad)  1     1.4%   6         

I assume it is generally acceptable, which means the model is not badly misspecified.
But when I specify the holdout data as newdata in the log_lik function, the per-observation results look much worse.
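A minimal sketch of the call (fit_train and holdout_data as above), followed by its output:

ll_holdout <- log_lik(fit_train, newdata = holdout_data)   # 1000 draws x 4 holdout observations
loo(ll_holdout)$pointwise                                   # per-observation estimates shown below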

     elpd_loo mcse_elpd_loo      p_loo      looic influence_pareto_k
25  -1.483282    0.04369236  0.1106539   2.966563          0.9634965
26  -2.883195    0.42590082  1.2587956   5.766391          1.1475562
27 -23.120600    0.82810545 20.8118162  46.241199          4.2885984
28 -73.703965    1.15532140 69.5372476 147.407930          9.5307370

This looks terrible.
I also tried using other subjects as holdout data, and the results were not satisfying either.
In this scenario, should I consider the model misspecified?
(Or is it that loo/waic may not give stable estimates with such a small sample size?)

Actually, I'm not sure I compared the models in the right way…
Say I have 3 models, and I trained them on 24 subjects (73 observations). The goal is to obtain good predictions for the one remaining holdout subject.
I'm thinking of comparing the models like this:

models <- list(fit1, fit2, fit3)
# evaluate each model's pointwise log-likelihood on the holdout subject only
log_lik_list <- lapply(models, log_lik, newdata = holdout_data)
# waic for each model, computed from its holdout log-likelihood matrix
waic_list <- lapply(log_lik_list, waic)
loo_compare(waic_list)

Then I would choose the best model to predict the holdout data.
I'm wondering if I should instead use

log_lik_list <- lapply(models, log_lik)   # log-likelihoods evaluated on the training data instead

Which is the better (or the right) way to compare models when the goal is to predict the holdout data?