# How exactly is `lp__` being calculated?

Sorry if this is not the right category.
I want to understand how `lp__` is calculated for a very simple model. Say I have a model like this:

```stan
data {
  int<lower = 1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower = 0> sigma;
}
model {
  target += normal_lpdf(mu | 0, 20);
  target += lognormal_lpdf(sigma | 3, 1);
  target += normal_lpdf(y | mu, sigma);
}
```

with data:

```r
Y <- rnorm(n = 100, mean = 3, sd = 10)
```

I would assume I can get the same value of `lp__` for a given draw by doing the following in R:

```r
mu <- extract(fit_score)$mu
sigma <- extract(fit_score)$sigma
addmu <- dnorm(x = mu[1], mean = 0, sd = 20, log = TRUE)
addsigma <- dlnorm(x = sigma[1], meanlog = 3, sdlog = 1, log = TRUE)
addY <- sum(dnorm(Y, mu[1], sigma[1], log = TRUE))
addmu + addsigma + addY
```

But I don't: I get similar but different values, -371 by hand and -369 with `extract(fit_score)$lp__[1]`. Is this because of differences between dnorm/dlnorm and the Stan `_lpdf` functions? Or is there something else that is added or not added in Stan in comparison with my R code?

3 Likes

(I don't know; Jacobian?)

If you change lognormal to normal, can you then reproduce the numbers?

4 Likes

Yes, the answer is always the Jacobian :)
The difference is not because of the lognormal, but because of the `<lower = 0>`. If I remove it, I get the same results.
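
Concretely, here is a minimal sketch of the by-hand calculation with the Jacobian term added back in (assuming the `fit_score` fit and the `Y` data from my original post). Stan represents `sigma` internally as `sigma = exp(sigma_unc)`, so the log Jacobian term is `log(sigma)`:

```r
# Hand calculation for draw 1, now including the Jacobian adjustment
# Stan adds for the <lower = 0> constraint on sigma.
draws  <- rstan::extract(fit_score)
mu1    <- draws$mu[1]
sigma1 <- draws$sigma[1]

lp_hand <- dnorm(mu1, mean = 0, sd = 20, log = TRUE) +        # prior on mu
  dlnorm(sigma1, meanlog = 3, sdlog = 1, log = TRUE) +        # prior on sigma
  sum(dnorm(Y, mean = mu1, sd = sigma1, log = TRUE)) +        # likelihood
  log(sigma1)             # log |d sigma / d sigma_unc| for sigma = exp(sigma_unc)

lp_hand          # should now agree with
draws$lp__[1]    # the value Stan reports
```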

5 Likes

The variable `lp__` starts at 0.

The parameters block increments it with the log absolute Jacobian determinant of the transform for any constrained variables.

The model block increments `lp__` with the value of the right-hand side of any `target +=` statement.

The model block also increments `lp__` with each sampling (`~`) statement, adding the log density but dropping constants that do not depend on parameters. If you'd rather keep the proportionality constants, use `target +=`.

That's it. None of the other blocks touch it.

Its final value is the log density on the unconstrained parameter scale, up to a constant that doesn't depend on the parameters (and stays fixed from iteration to iteration).

The value of `lp__` at any given point is what you get from `target()` if you want to track it by printing. For example, you can print at the top of the model block to see that the starting value is the change-of-variables adjustment. Or compare the effect of `~` vs. `target +=`.
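
One way to do that comparison from R rather than with `print` statements is to check how the hand-computed log density differs from `lp__` across draws (a sketch, again assuming the `fit_score` fit and `Y` data from the original post):

```r
# With target += statements the difference should be ~0 for every draw;
# with ~ sampling statements it should be a single shared constant
# (the dropped normalizing terms).
draws <- rstan::extract(fit_score)
lp_hand <- function(i) {
  dnorm(draws$mu[i], mean = 0, sd = 20, log = TRUE) +
    dlnorm(draws$sigma[i], meanlog = 3, sdlog = 1, log = TRUE) +
    sum(dnorm(Y, draws$mu[i], draws$sigma[i], log = TRUE)) +
    log(draws$sigma[i])                  # Jacobian for <lower = 0>
}
sapply(1:20, lp_hand) - draws$lp__[1:20]
```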

4 Likes

@Bob_Carpenter great explanation. I think this information is spread over various documents, so there's no single place where it's stated this concisely. I'm not surprised that there are questions about it. I like @bnicenboim's post title “How exactly is lp__ being calculated?” for an entry in a hypothetical FAQ or FAQ-like resource, so @martinmodrak may be interested.

3 Likes

Thanks, @Bob_Carpenter, that was really clear. Parameters with `<lower=X>` are back-transformed to the constrained space with `exp(auxiliary parameter) + X`, right? How are the other parameters back-transformed? Is it written somewhere?
And another small thing: maybe it would be a good idea to return `target` rather than `lp__` in the summary?

1 Like

Chapter 10 has all the transformations + Jacobian determinants: 10 Constraint Transforms | Stan Reference Manual
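
For the two most common cases, the transforms described there look roughly like this. This is a sketch in R of what the manual documents, not Stan's internal code, and the helper names are just for illustration:

```r
# Constrained <- unconstrained transforms and their log-Jacobian terms.

# lower bound L:  theta = exp(theta_unc) + L
lb_constrain     <- function(u, L) exp(u) + L
lb_log_jacobian  <- function(u, L) u           # log |d theta / d u| = u

# lower L, upper U:  theta = L + (U - L) * inv_logit(theta_unc)
inv_logit        <- function(u) 1 / (1 + exp(-u))
lub_constrain    <- function(u, L, U) L + (U - L) * inv_logit(u)
lub_log_jacobian <- function(u, L, U)
  log(U - L) + log(inv_logit(u)) + log1p(-inv_logit(u))
```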

2 Likes

Thanks! This is new (and great!)

I agree that it's weird to use `target` in the Stan program but then see `lp__` in the output. Before `target +=` we had `increment_log_prob()`, which has a more transparent connection to the name `lp__`. Not replacing `lp__` with `target` in the output may have just been to avoid breaking backwards compatibility; I forget. We deprecated `increment_log_prob()` but it still works. It's harder to deprecate the name of a variable in the output.

Once upon a time, I did a few talks around this. Slides and examples here: Slides from “How Stan computes the posterior distribution” · syclik

(unfortunately, my slides aren't really super useful without context… I could probably record a video.)

I've brought it up before, but I'd really love to see our current `lp__` broken into two parts:

- the Jacobian term
- the model's contribution: essentially the part that uses `target +=`

I think it would help with understanding the effect of the transformation (whether we're ending up in bad numeric areas from the constraining operation) and with tracking down problems in the model. Right now, if there's a bug in the model, I have to trap it by placing `print` statements before and after the offending lines. If it were split out, I could run the likelihood in a separate language, like R, and compare the results.

3 Likes

It's actually been in the manual since version 1. I think the reference manual is probably getting lost in the shuffle these days after being split out into a separate volume.

And back in the days when we used `<-` for assignment, we had

```stan
lp__ <- lp__ + alpha;
```

Removing `lp__` as a user-modifiable variable was the only backwards-compatibility-breaking change we've made to the language.

I got tired of suggesting this because I was the only one speaking up for it. Specifically, I wanted to be able to have

```stan
transformed parameters {
  real alpha = exp(alpha_unc);

  // log |d alpha / d alpha_unc| = alpha_unc (equivalently, log(alpha))
  jacobian += alpha_unc;
}
```

so that we could define Jacobians for transformed parameters right in the transformed parameters block. Then `lp__` would be a combination of the constrained model log density plus the log Jacobian determinant of the constraining transform.

Without this, we can't actually implement Stan's constraining transforms in Stan itself. When Stan is used for optimization, we turn the Jacobian off, and there's no way to write a Jacobian that gets turned off for optimization in the Stan language as it stands.

But, when I implemented the first version of the Stan language, I forgot to allow `lp__`-modifying statements in transformed parameters. Then everyone got fixated on all `lp__` modification happening in the model block. That's also why we don't allow sampling statements in the parameters block for priors.

But I still like it, and if I were king of Stan, we'd have the above syntax.

3 Likes

So, to make sure: if I have declared a parameter `real<lower=0> theta` and want its square to have a positive-truncated standard normal prior, is it correct to have

`target += normal_lpdf(square(theta) | 0, 1) + log(fabs(2 * theta));`

in the model block? So I don't have to account for the `<lower=0>` truncation myself, only the squaring? And if I instead had a transformed parameter `real theta2 = square(theta)`, then I would not need `+ log(fabs(2 * theta))`?

Either I missed that or, more likely, I didn't understand the impact until I made those slides. So… now I'm speaking up with you!

I remember the idea being floated a while ago, but I don't remember what reasons were put forward against it. To me, splitting it up like this seems like a nice idea.

Maybe it's worth bringing up again. There could be more support for it now.

1 Like

Right. We apply the Jacobian for the lower bound, so it's (improper) uniform on positive values. Same for lower and upper bounds, where it's uniform in that range.
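
A quick numeric check of the squaring adjustment (a sketch in R, not Stan code): if `theta^2` has a positive-truncated standard normal distribution, the implied density of `theta` is that density evaluated at `theta^2` times the Jacobian `2 * theta`, which is what the `log(fabs(2 * theta))` term accounts for.

```r
# Monte Carlo check of the change of variables for theta2 = theta^2.
u     <- abs(rnorm(1e6))   # draws of theta^2 ~ positive-truncated std normal
theta <- sqrt(u)           # implied draws of theta

# Claimed density of theta: p(theta^2) * |d(theta^2)/d theta|
#   = 2 * dnorm(theta^2) * 2 * theta  (leading 2 is the truncation constant)
dens <- function(t) 2 * dnorm(t^2) * 2 * t

hist(theta, breaks = 100, freq = FALSE, main = "theta = sqrt(theta^2)")
curve(dens(x), from = 0, to = 2.5, add = TRUE, lwd = 2)
```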