target += normal_lpdf(eta | 0, 1); vs. eta ~ normal(0, 1)

I am pretty much a Stan newbie, but I have found differences in output between versions of the schools model, depending on whether one specifies the model using (e.g.)

target += normal_lpdf(eta | 0, 1);
vs.
eta ~ normal(0,1)

The full code from the rstan::rstan vignette is:

data {
  int<lower=0> J;          // number of schools
  real y[J];               // estimated treatment effects
  real<lower=0> sigma[J];  // s.e. of effect estimates
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] eta;
}
transformed parameters {
  vector[J] theta;
  theta = mu + tau * eta;
}
model {
  target += normal_lpdf(eta | 0, 1);
  target += normal_lpdf(y | theta, sigma);
}

Earlier versions were the same except for the final model block:

model {
  eta ~ normal(0, 1);
  y ~ normal(theta, sigma);
}

I invoked the two versions using the following stan() call, including a seed value to keep the starting points consistent:

fit <- stan("schools.stan", data = schools_data, seed = 1234)

The summary values for mu from the version using target += normal_lpdf() are:

   mean se_mean   sd  2.5%  25%  50%   75% 97.5% n_eff Rhat
mu 7.90    0.12 5.25 -2.35 4.53 7.94 11.23 18.85  1855    1

whereas using the "eta ~ normal(0, 1)" version, they are:

   mean se_mean   sd  2.5%  25%  50%   75% 97.5% n_eff Rhat
mu 7.79    0.08 4.82 -1.44 4.63 7.73 10.87 17.33  3342    1

These are obviously close, and essentially equivalent given the standard errors (se_mean). I understand from the discussion that the difference is due to the inclusion or omission of a normalizing constant. The discussion I have read suggested, however, that one form might be better for model comparison, and that there was a minor difference in efficiency between the two versions.
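
If I follow the algebra, the two forms differ only by additive terms that do not involve the parameters. For the standard normal prior on each element of eta:

\log p(\eta_j) = -\tfrac{1}{2}\eta_j^2 - \tfrac{1}{2}\log(2\pi)

target += normal_lpdf(eta | 0, 1) adds the whole expression, while eta ~ normal(0, 1) drops the (1/2) log(2*pi) term (and y ~ normal(theta, sigma) likewise drops the log(sigma[j]) and (1/2) log(2*pi) terms, since sigma is data). Because the dropped terms are constant in the parameters, both versions define the same posterior.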

Since I am just (re)learning Stan, I would prefer to learn one model form and use it consistently. I anticipate that I will be comparing models, and accuracy in those comparisons matters more to me than speed. But in general I'd like to understand better the differences between the two kinds of model statement and when it is better to use one or the other.

Thanks in advance to anyone who can clarify this issue for me.
Larry Hunsicker

See this section of the manual for more info: 7.4 Sampling statements | Stan Reference Manual



Thanks, andrjohns. I assume, then, that to compare models one would want the exact log probability values, so explicit incrementing (target += normal_lpdf(eta | 0, 1);) would be needed to get accurate model comparisons. The sampling statement (eta ~ normal(0, 1)) might be slightly faster, and is certainly more familiar to many statisticians, when model comparison is not anticipated.

Do I now understand things correctly?

It depends on what you mean by 'compare models'. If you're using things like LOO or Bayes factors then you need the additional constants, but if you're just interpreting the returned parameter values without any statistical comparison then the sampling statement is fine.

Yes. I was thinking of loo. So I'll use explicit incrementing whenever I intend to do model comparison. I'm not sure that this issue is prominently mentioned to folks who are used to using the sampling statement.

Along the way, it sounded to me as though the sampling statement was equivalent to

target += normal_lupdf(eta | 0, 1);

Looking further in the Stan Functions Reference, I confirmed this.

16.1.5 Sampling statement

y ~ std_normal()

Increment target log probability density with std_normal_lupdf(y).
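
So, for the schools model, the correspondence between the three forms should be (my reading of the reference, sketched for the eta prior):

eta ~ normal(0, 1);                  // drops constant terms
target += normal_lupdf(eta | 0, 1);  // equivalent: unnormalized density
target += normal_lpdf(eta | 0, 1);   // keeps the full normalized density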

But when I tried to run the model with normal_lupdf() in place of normal_lpdf(), the model compiler balked with the error:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in 'model435452bf44f0_schools3' at line 19, column 24

17: }
18: model{
19:   target += normal_lupdf(eta | 0, 1);
                           ^
20:   target += normal_lupdf(y | theta, sigma);

PARSER EXPECTED: "("
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'schools3' due to the above error.

What am I missing here?

Another related question. While the parameter estimates for the two models are quite similar, the effective sample sizes are quite different (explicit incrementing: 1855; sampling statement: 3342). I guess that the reduced effective sample size for explicit incrementing relates to uncertainties in the normalizing constant. A difference of this magnitude in effective sample size would be quite important for powering a new trial. Or does it only relate to how many draws one needs to get "convergence"? How should I understand this difference?

That error is because the _lupdf extensions aren’t yet available in rstan (it’s a few versions behind).

The effective sample size is driven more by your model parameterisation than by your actual sample size. ESS essentially tells you how many independent draws would give you the same amount of posterior accuracy as the current draws. So if you have highly auto-correlated samples from the posterior, then they won’t be providing very much ‘new’ information compared to independent samples. Consequently, you’ll have a low effective sample size.

This means that the lower effective sample size for the ‘explicit incrementing’ indicates a higher level of autocorrelation between iterations, rather than anything to do with the (observed) sample size itself.
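
If you want to check this directly, the per-parameter ESS and Rhat can be read off the fit summary. A small R sketch using rstan's summary() method (fit here stands for whichever stanfit object you're inspecting):

fit_summary <- summary(fit)$summary    # matrix of per-parameter summaries
fit_summary["mu", c("n_eff", "Rhat")]  # effective sample size and Rhat for mu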

@avehtari does that all sound right? Is the difference in ESS between the models with and without normalising constants expected?


Thanks, andrjohns.
I'm puzzled by the difference in effective sample size, given that the only difference between the two models is the use of explicit incrementing vs. sampling statements. The parameterization is otherwise identical. Does adding the normalizing constant affect the autocorrelation of the draws?

I'll see whether Aki appends a comment.

You don't need to use normalized density statements in the model block for loo; you just need to ensure that you include the normalizing constants when you compute the pointwise log likelihood (usually in generated quantities). See Aki's discussion of writing Stan programs for use with the loo package for more.
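
For the schools model above, a minimal sketch of that pattern (log_lik is the variable name the loo package looks for by default):

generated quantities {
  vector[J] log_lik;
  for (j in 1:J)
    log_lik[j] = normal_lpdf(y[j] | theta[j], sigma[j]);  // full normalized density
}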

You do need the normalized statements if you want to work with Bayes factors, however.
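
That's because Bayes factor machinery consumes the normalized posterior density; e.g., the bridgesampling package estimates the marginal likelihood from the full target. A sketch, with fit_m1 and fit_m2 as hypothetical stanfit objects for two candidate models:

library(bridgesampling)
# both models must keep their normalizing constants (target += ..._lpdf)
bs1 <- bridge_sampler(fit_m1)
bs2 <- bridge_sampler(fit_m2)
bf(bs1, bs2)  # Bayes factor of model 1 over model 2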

This is very surprising to me. Can you provide a reproducible example? Can you confirm that the difference is consistent across runs, and that the runs use the same number of chains and the same number of iterations?
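
For instance, something along these lines, with matched settings (the two filenames are placeholders for the target += and sampling-statement versions):

fit_target   <- stan("schools_target.stan", data = schools_data,
                     chains = 4, iter = 2000, seed = 1234)
fit_sampling <- stan("schools_sampling.stan", data = schools_data,
                     chains = 4, iter = 2000, seed = 1234)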


BTW, you can install rstan version 2.26.2 (which uses Stan version 2.26.1 and so includes, e.g., the _lupdf functions) from https://mc-stan.org/r-packages/ with the command

install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

The update fixed my problems. Now the effective sample sizes from the sampling statements and from explicit increments with normal_lupdf are identical, and both are quite close to the one from explicit increments with normal_lpdf.

Thanks, Aki and Andrew, and thanks also to Jacob for the distinction between using loo and working with Bayes factors, and for the pointer to Aki's discussion of writing Stan programs for use with the loo package.

Larry Hunsicker
