I am pretty much a Stan newbie, but I have found differences in output between versions of the schools model depending on whether one specifies the model using (e.g.)
target += normal_lpdf(eta | 0, 1);
vs.
eta ~ normal(0, 1);
The full code from the rstan::rstan vignette is:
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
vector[J] eta;
}
transformed parameters {
vector[J] theta;
theta = mu + tau * eta;
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
Prior versions are the same except for the final model section:
model{
eta ~ normal(0, 1);
y ~ normal(theta, sigma);
}
I invoked the two versions using the following stan function call, including a seed value to keep the starting points consistent:
fit <- stan("schools.stan", data = schools_data, seed = 1234)
The values for mu from the version using "target += normal_lpdf()" are (columns: mean, se_mean, sd, 2.5%, 25%, 50%, 75%, 97.5%, n_eff, Rhat):
mu 7.90 0.12 5.25 -2.35 4.53 7.94 11.23 18.85 1855 1
whereas using the "eta ~ normal(0, 1)" version, they are:
mu 7.79 0.08 4.82 -1.44 4.63 7.73 10.87 17.33 3342 1
These are obviously close, essentially equivalent given the SEMs. I understand from the discussion that the difference is due to inclusion or not of a normalizing constant. The discussion that I have read suggested, however, that one form might be better for comparison of models, and that there was a minor difference in efficiencies of the two versions.
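A minimal sketch of the normalizing-constant point, in plain Python rather than Stan. The function names are made up for illustration. It assumes that the only difference between the two forms for the eta term is that the "~" form drops the additive constant -0.5 * log(2 * pi) per element, so the two log densities differ by an amount that does not depend on eta:

```python
import math

def std_normal_lpdf(eta):
    # Full log density of independent standard normals, constants included
    # (what I take "target += normal_lpdf(eta | 0, 1)" to add).
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * e * e for e in eta)

def std_normal_kernel(eta):
    # Same quantity with the constant term dropped
    # (what I take "eta ~ normal(0, 1)" to add).
    return sum(-0.5 * e * e for e in eta)

def gap(eta):
    # Difference between the two versions for a given eta vector.
    return std_normal_lpdf(eta) - std_normal_kernel(eta)

# Two arbitrary, made-up eta vectors of the same length.
eta_a = [0.3, -1.2, 0.8]
eta_b = [2.0, 0.0, -0.5]

# The gap is the same for both vectors: -(J / 2) * log(2 * pi) with J = 3.
print(gap(eta_a), gap(eta_b))
```

If that assumption is right, the two model blocks define the same posterior for mu, tau, and eta, and only the reported log density (lp__) is shifted by a constant.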
Since I am just (re)learning Stan, I would prefer to learn to use one model form consistently. I anticipate that I will be comparing models, and accuracy in those comparisons is more important to me than speed. But in general I'd like to understand better the differences between the two types of model declaration and when it would be better to use one or the other.
Thanks in advance to anyone that can clarify this issue for me.
Larry Hunsicker