Model comparison - lp__

Jennifer · September 11, 2017, 2:43pm

Hi!

I have various runs of the same model (and sometimes with one slightly changed parameter) and I want to compare models to get the model that fits my data the best. I was told this can be seen by the lp__ value, which is printed out in the model summary. Now my (stupid) question is, which value is better?
Lets say I got those values (I know, they are really close, but nevertheless), which model would be better?
-17169.14
-17122.16
I think I do not fully understand the meaning of log posterior…

Help would be very appreciated!

bgoodri · September 11, 2017, 3:15pm

You are going to have to understand that to make good use of Stan, regardless of its (in)applicability to model comparison. In short, it is the kernel of the posterior density in log-units, possibly ignoring some or all constants. It is not useful for model comparison. Most people around here prefer to do comparisons involving (functions of) the log-likelihood, which excludes the contributions from the priors and the Jacobians for the transformations from the constrained space to the unconstrained space. See this paper:

Other people prefer to calculate the probability that each of several models is correct, conditional on at least one of them being (very close to) correct. See this paper

for some cautions about that approach, but if you are going to use it then you need the posterior density including the constants. If you have that, then you can use

See especially its tutorial.

alfred · September 12, 2017, 4:45pm

The original question asked about comparing different runs on the same model, as well as different models.
If you are just comparing different runs, then surely the simple log probability is fine.

Jennifer · September 15, 2017, 8:06am

Thank you. And do you know if a smaller value is better or a larger value?

Bob_Carpenter · September 17, 2017, 9:39pm

Logs are monotonic, so higher density is also higher log density. But what you’re looking for in sampling is to sample around the typical set, not around the mode.

Also, you probably don’t want to use lp__ as your yardstick as it drops constants and includes all the Jacobian adjustments to allow sampling to happen on the unconstrained scale.

avehtari · September 18, 2017, 6:21am

Why would you compare different runs on the same model and the same data?
The original question explicitly mentions model comparison. You can’t do that with just lp__ values.

Bob_Carpenter · September 19, 2017, 11:18pm

By “same model”, we mean the same log density. So there may be some misunderstanding in terminology here.

Jennifer · September 20, 2017, 11:14am

By “same model”, we mean the same log density. So there may be some misunderstanding in terminology here.

Oh that might be the case…
Actually, I am quiet confused right now :D

Lets just say, I have the same model code and I am starting some runs with this same model code but with random generated starting parameters, so there is always a slightly different outcome of the different runs.
Now I just want to know, which run was the best, and I read this was possible to answer by looking at the lp__ value, but I am not sure which lp__ value is the better one, a more negative one?

Bob_Carpenter · September 20, 2017, 5:32pm

You should just do that with multiple chains in a single run. Or if you do it in CmdStan, then you should be combining the chains running the same model for posterior analysis.

You should not be looking for equal runs of the same model and finding the best. If they’re not all the same, you have convergence problems.

Murali_037 · March 5, 2020, 4:23pm

@Bob_Carpenter Let’s say I wanna do multiple runs with different prior values. For example: (gamma(2,17) , gamma(17,2), gamma(3,17), … etc etc) to see the difference in the results.

How can I do that with multiple chains in a single run? Can you please illustrate an example if possible ? I am trying to learn how to do this

Thanks in advance,

Murali_037 · March 5, 2020, 5:48pm

@bgoodri, when you say the kernel of the posterior density, you are referring to the posterior distribution in log units?

Murali_037 · March 5, 2020, 6:21pm

got it , kernel of the posterior basically omits the normalization constant that does not depend on the parameter. I just did not know the term for this is “kernel”

avdminne · April 22, 2021, 1:45pm

Hi everyone,

sorry to revive this topic, but I sort of have the same issue.
I am running a Bayesian ANN, and I am having the issue that I get multiple answers while running the same model on the same data. In other words, I get a very low MC error per chain, but the chains do not converge (i.e. high Rhats). I actually recently made a post about it;

In that post I was forwarded to the following link;

https://www.martinmodrak.cz/2018/05/14/identifying-non-identifiability/

In that link you can see under “Neural network: When ordering is not enough”, it says;

This means that to identify the model we need have to somehow choose one of those modes, but there is clearly not a “best” (much higher lp__) mode.

However, in my case, I actually can see that there is a difference in lp between the “solutions”. (Sorry if I mix up terminology here…)

If I run 8 chains, is there a way to “pick” the best chains automatically based on the lp_, or are there better ways to do it?

(PS: my goal is to mix machine learning with structural time series, hence I use Stan for ANN.)

avehtari · April 23, 2021, 12:36pm

See Stacking for non-mixing Bayesian computations: The curse and blessing of multimodal posteriors.

avdminne · April 23, 2021, 4:03pm

That’s brilliant! Model stacking, who would have thought.
I went through the paper and implemented the code successfully to my own by now.

Thank you for this. Love the research you guys are doing, it’s very applied and immediately usable.

Topic		Replies	Views
Positive log probability Modeling fitting-issues , cognitive-science	10	3067	July 13, 2018
Yet another lp__ question General	7	817	April 11, 2018
Two different models have the same log likllihood Modeling	2	175	June 18, 2024
Trying to understand lp__ as it is not producing the results that I expect when modelling count data using poisson distribution Modeling	12	1287	September 6, 2020
Basic question: what is lp_ in posterior samples of a brms regression? brms	3	2520	May 19, 2024

Model comparison - lp__

Related topics