We are trying to add an option for users to manually specify a distribution that will only be calculated up to a proportionality constant (i.e. it will not be normalized). I believe the PR is mostly done, and Niko has been great with his feedback and ideas, but I want some more eyes on it to double-check whether we missed anything and whether anyone has more thoughts on this.
Stanc3 currently does support the so-called _propto_ infix, but this was never documented and thus not released. We want to release this for 2.24 if possible, but we are not going to push it if we feel it should be done some other way.
There was already a discussion on this previously (and also a poll), which I will try to summarize quickly. For those who wish to read through the previous discussions, here are the main links:
- Statisticians: _propto vs _unnormalized (vs ?)
- https://github.com/stan-dev/stanc3/issues/63
- https://github.com/stan-dev/stanc3/issues/541
As @Bob_Carpenter mentioned in the previous Discourse thread
target += foo_lpdf(y | theta);
and
y ~ foo(theta);
behave differently. The functional form, and hence target +=, includes normalizing constants (like log √(2π) in normal_lpdf). The sampling statement form (with ~) drops normalizing constants in the samplers and optimizers.
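To make the difference concrete, here is the standard log density of the normal distribution written out term by term. The symbols μ and σ (location and scale) are my notation and are not used in the thread above.

$$
\log \mathrm{Normal}(y \mid \mu, \sigma) = -\log\sqrt{2\pi} \;-\; \log\sigma \;-\; \frac{(y - \mu)^2}{2\sigma^2}
$$

The functional form keeps all three terms. The ~ form always drops the first term, since it never depends on the parameters. As far as I understand it, it can also drop the log σ term when σ is data rather than a parameter, because that term is then a constant as well.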
Based on the poll done in the linked Discourse thread, the version that "won" was the _lupdf suffix for the unnormalized form.
Meaning that
target += foo_lupdf(y | theta);
would be equal to
y ~ foo(theta);
in that both would drop the normalizing constants, as the tilde statement already does. And
target += foo_lpdf(y | theta);
would stay the normalized form as it is currently.
The unnormalized form would be allowed in the model block, the _lpdf / _lpmf function body, and the _lp function body. It would not be allowed in any other block. For the reasons why, see Niko's comment here.
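A minimal sketch of these block restrictions, assuming the _lupdf suffix ends up behaving as proposed here. The names N, y, mu, sigma and log_lik are made up for illustration and are not taken from the PR.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Allowed under the proposal: unnormalized form in the model block.
  // Intended to match `y ~ normal(mu, sigma);`.
  target += normal_lupdf(y | mu, sigma);
}
generated quantities {
  // Only the normalized form is available here; under the proposal,
  // writing normal_lupdf in this block would be rejected by stanc3.
  real log_lik = normal_lpdf(y | mu, sigma);
}
```

The idea is that dropped constants are only harmless where the result feeds the target, which is why the normalized form stays the only option when the actual log density value is needed.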
The suffix would work for all lpdfs/lpmfs defined in the Stan Math library, as well as for user-defined lpdfs/lpmfs. How it would work for user-defined functions is shown in the example below. See the comments next to the calls.
functions {
  real foo_lpdf(real y, real x) {
    return normal_lupdf(y | x, 1);
  }
  real goo_lpdf(real y, real x) {
    return normal_lpdf(y | x, 1);
  }
}
data {
  real x; // assumed to be data here so that the calls below compile
}
parameters {
  real y;
}
model {
  target += foo_lpdf(y | x);  // normal_lupdf in foo would be normalized
  target += foo_lupdf(y | x); // normal_lupdf in foo would be unnormalized
  target += goo_lpdf(y | x);  // normal_lpdf in goo would be normalized
  target += goo_lupdf(y | x); // normal_lpdf in goo would still be normalized
}
A user-defined function (UDF) declared with a _lupdf suffix would not be allowed:
functions {
  real foo_lupdf(real y, real x) {
    //...
  }
}
There is also the option of adding a stanc flag (for example "--unnormalized") that would use the unnormalized form in all applicable cases. This is not implemented right now, but it could be if others find it useful.
If anyone has any thoughts on this, please comment. I don't have any strong feelings on _lupdf vs _unnormalized_lpdf etc. We merely implemented the option that "won" the poll. But there is still time to rethink this.
Please tag anyone you feel might have thoughts on this.