PyStan log-probability confusing!

How do you call log_prob?

I have given a working example. Yes, I use a flat/uniform prior on the parameters.

I could not understand that. Anyway, I am getting the PyStan log prob from the fit, as the last column.

See how lp__ is calculated here: How exactly is lp__ being calculated

The important piece you are missing here is the Jacobian adjustment that Stan adds to lp__ for constrained parameters.

So if you did not have constraints, lp__ would probably match exactly. But that would obviously not help you fit the model.

Yes, when I remove the prior they match! But I don’t understand the outcome. When I use a uniform prior, all points within the prior ranges have equal weight. So why does the probability not trace the \chi^2 (no correlation assumed)? Worse, the maximum of lp__ is not at the minimum of \chi^2 among the points it sampled. How do I address that?

Edit: I am also very much interested in finding the maximum likelihood (together with the posterior).

I suggest reading the entire thread here: How exactly is lp__ being calculated, and the variable-transforms chapter of the reference manual: https://mc-stan.org/docs/2_24/reference-manual/variable-transforms-chapter.html


Just confirm one thing for me. Assume I have uniform priors. After the fit, do the samples trace Stan’s lp__ or the log prob calculated as above?

The lp__ samples that you get when fitting in any Stan interface are the lp__ that Stan computes, that is, including the log absolute Jacobian determinant of the transform for constrained variables.

Still, where is the problem? I have a parameter A with a lower cutoff. Stan transforms it to another variable, say B, using log. Now the Jacobian, if done properly, should take care of this transformation and the inverse transformation; why does it differ slightly from the analytical value (if the Jacobian is calculated correctly)? This is like saying your mass changes if I measure it in pounds instead of grams!

I am not sure I can explain it better than others have in the previous thread and the chapters, sorry.

Maybe a code example will help show what happens to lp__ if you have a constraint.

real<lower=0.0, upper=20> A;

is compiled to C++ as

if (jacobian__) {
     current_statement__ = 1;
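     // the extra lp__ argument makes lub_constrain also add the log
     // absolute Jacobian determinant of the transform to lp__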
     A = stan::math::lub_constrain(A, 0.0, 20, lp__);
} else {
     current_statement__ = 1;
     A = stan::math::lub_constrain(A, 0.0, 20);
}

If jacobian__ is true, which in this case it is, that will call this function here:

See what increments lp__ in that case.
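For reference, this is my summary of the variable-transforms chapter linked above (not part of the function’s source): for a parameter with lower bound L and upper bound U, Stan samples an unconstrained Y and computes

X = L + (U - L)\,\mathrm{logit}^{-1}(Y), \qquad \log|J| = \log(U - L) + \log \mathrm{logit}^{-1}(Y) + \log\left(1 - \mathrm{logit}^{-1}(Y)\right),

and it is this \log|J| that gets added to lp__.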

You can try calling log_prob in PyStan directly (see https://pystan.readthedocs.io/en/latest/api.html#pystan.StanFit4model.log_prob) with adjust_transform set to True and False and observe what is going on.
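For example, something along these lines (a minimal sketch against the PyStan 2 API; the toy model and the value 5.0 are made up for illustration):

import pystan

model_code = """
parameters {
    real<lower=0.0, upper=20> A;
}
model {
    A ~ normal(10, 2);
}
"""
fit = pystan.StanModel(model_code=model_code).sampling(iter=1000, chains=1)

# log_prob works on the unconstrained scale, so transform the value first
upar = fit.unconstrain_pars({"A": 5.0})

lp_adj = fit.log_prob(upar, adjust_transform=True)    # with the log|J| term (what lp__ includes)
lp_raw = fit.log_prob(upar, adjust_transform=False)   # density of A itself
print(lp_adj - lp_raw)                                # the Jacobian term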


BTW: if I change to
y ~ normal(F, yerror);
the lp__ values are completely different. Why? Isn’t it the same thing as target += normal_lpdf(...)?

I would also add that densities are weird and change under parameter transformations (see the case studies at https://betanalpha.github.io/writing/ for a good intro to the relevant theory). The lp__ reported by Stan for HMC/NUTS is with respect to the unconstrained parameters, because that’s what is needed to make the algorithm work (and hence what you want to look at for diagnostics etc.); it unfortunately cannot at the same time be identical to what you compute by hand on the constrained parameters. I understand this may be confusing, but I was assured the math checks out :-)


No, the sampling statement (~) only computes the log density up to an additive constant (i.e. something proportional to the exact density).

See Log Probability Increment vs. Sampling Statement in https://mc-stan.org/docs/2_18/reference-manual/sampling-statements-section.html

Mainly this paragraph:

The sampling statement drops all the terms in the log probability function that are constant, whereas the explicit call to normal_lpdf adds all of the terms in the definition of the log normal probability function, including all of the constant normalizing terms. Therefore, the explicit increment form can be used to recreate the exact log probability values for the model. Otherwise, the sampling statement form will be faster if any of the input expressions, y, mu, or sigma, involve only constants, data variables, and transformed data variables.
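To see the constant offset concretely, here is a small sketch of my own (not from the manual): the same model written both ways differs in log prob exactly by the dropped normalizing constants.

import numpy as np
import pystan

data = {"N": 5, "y": [0.1, -0.3, 0.2, 0.5, -0.1]}

code_tilde = """
data { int<lower=1> N; vector[N] y; }
parameters { real mu; }
model { y ~ normal(mu, 1); }                  // drops constant terms
"""
code_target = """
data { int<lower=1> N; vector[N] y; }
parameters { real mu; }
model { target += normal_lpdf(y | mu, 1); }   // keeps all terms
"""

fit1 = pystan.StanModel(model_code=code_tilde).sampling(data=data, iter=200, chains=1)
fit2 = pystan.StanModel(model_code=code_target).sampling(data=data, iter=200, chains=1)

upar = [0.0]  # mu is unconstrained, so no Jacobian term is involved
print(fit2.log_prob(upar) - fit1.log_prob(upar))  # equals N * -0.5 * log(2 * pi)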


Yes, I understand that the probability density P(A) is not the same as P(B) after we transform A -> B. But what I do not understand is why Stan gives me P(B) (if that is what it gives in the first place) and not P(A) [of the original constrained parameter], if it has the Jacobian calculated correctly. This is nontrivial! Not doing this inevitably creates confusion.

P.S. I want to understand one single thing in the Stan/PyStan developments before I die.

Okay, thanks. But it also has the same mismatch if we have constrained parameters.

If it is true that Stan reports the probability density with respect to the unconstrained parameters, is there a way to also store the probability density w.r.t. the original constrained parameters at each point while sampling? That would be easier for me.

Because B is what Stan samples internally, and log_prob uses Stan’s internal representation. Note that log_prob takes unconstrained parameters, so if you have constrained parameter values you have to fit.unconstrain_pars(...) them before calling fit.log_prob(...).

You can get \log P(A) with fit.log_prob(upars, adjust_transform=False)


The documentation is confusing and clashes with older GitHub content. Can you please elaborate on how to call fit.unconstrain_pars(...) correctly? What would pars be?

For example, if the model is

parameters {
    real<lower=0.0, upper=20> A;
    real<lower=0, upper=10> B;
}

you would call

upar = fit.unconstrain_pars({"A": a, "B": b})
lp = fit.log_prob(upar, adjust_transform=False)
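Here a and b are constrained values of A and B (for example, from a single draw of fit.extract()), and upar is the corresponding point on the unconstrained scale that log_prob expects.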

The only problem now is that it calculates lp at a single point at a time, i.e. a or b can only be one number, not a list! Is there a way of giving all the sampled points in pars? If not, it’s better to calculate lp by hand.

The error when passing fit.extract() as pars to fit.unconstrain_pars(pars):

RuntimeError: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=A; dims declared=(); dims found=(2136)
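As far as I can tell there is no vectorized version, so you have to loop over the draws one at a time; a minimal sketch of my own, assuming the two-parameter model above:

import numpy as np

samples = fit.extract(permuted=True)  # dict of arrays, one value per draw
n_draws = len(samples["A"])

lp = np.empty(n_draws)
for i in range(n_draws):
    # unconstrain_pars takes a single draw, so build one dict per iteration
    upar = fit.unconstrain_pars({"A": samples["A"][i], "B": samples["B"][i]})
    lp[i] = fit.log_prob(upar, adjust_transform=False)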