Both give the same density as far as sampling is concerned. The difference is that normal_lpdf returns the correctly normalized density, so it is a tiny bit slower because the normalizing constants are calculated each time. The ~ normal notation drops the normalizing constants.
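To make the point about constants concrete, here is a small numeric sketch in plain Python (not Stan). The function names and the data are made up for illustration, and it assumes the only term dropped is the parameter-free -0.5*log(2*pi). It shows that the two log densities differ by the same offset no matter what mu and sigma are, which is why the sampler should not care.

```python
import math

def normal_lpdf_full(y, mu, sigma):
    """Fully normalized log density of Normal(mu, sigma) at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def normal_lpdf_dropped(y, mu, sigma):
    """Same log density with the parameter-free constant -0.5*log(2*pi) dropped."""
    return -math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

ys = [0.3, -1.2, 2.5, 0.9]  # made-up data, for illustration only

def gap(mu, sigma):
    """Total full log density minus total dropped-constant log density."""
    full = sum(normal_lpdf_full(y, mu, sigma) for y in ys)
    dropped = sum(normal_lpdf_dropped(y, mu, sigma) for y in ys)
    return full - dropped

# The gap is the same constant for any parameter values.
print(gap(0.0, 1.0), gap(3.0, 0.5))
```

Since the offset does not depend on the parameters, it cancels in every accept/reject ratio and has no effect on the gradient, so both forms should target the same posterior.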

So within MC error your results should match (and it looks like they do).

From the example above, if the MC standard error is what is reported under se_mean, the alphas look quite significantly different: a difference of 7 against an MC error of 0.1?
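As a rough sanity check on that, one can express the difference in units of the combined MC standard error. This is only a sketch: it assumes the two runs are independent and that both have an se_mean of about 0.1, and the helper name is hypothetical.

```python
import math

def mc_z_score(mean_a, mean_b, se_a, se_b):
    """Difference between two posterior-mean estimates, in units of the
    combined MC standard error (assumes the two runs are independent)."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# A difference of 7 with an MC error of 0.1 on each run (assumed values).
z = mc_z_score(7.0, 0.0, 0.1, 0.1)
print(round(z, 1))  # 49.5
```

Anything beyond a handful of standard errors is hard to attribute to Monte Carlo noise alone.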

OK - I thought I was misunderstanding the differences between these two, but given your response this might be a bug.

If you look closely, the two samples do not match. The mean for beta[4] in the first one is outside of the interval for the second. The mean for beta[1] in the first is very close to the edge of the interval for the second. With an effective sample size of ~800, there should not be this large a difference.

For reference, the ~normal() notation gives the same results as rstanarm::stan_glm().

My setup:

Linux 4.15.7
gcc 7.3.0
cmdstan 2.17.1
R 3.4.3
rstan 2.17.3

Sort of doesn't matter because it'll end up getting moved to either stan-dev/stan or stan-dev/math

Somebody just needs to run multiple chains, confirm that data is passed in correctly, confirm that the two are different, confirm that it's not a mixing problem from the model, check what C++ is generated, etc.
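For the "multiple chains / mixing" part, a basic between-chain versus within-chain comparison is one way to check. Below is a minimal sketch of the classic (non-split) Gelman-Rubin statistic in plain Python; it is not the exact split R-hat that Stan reports, and the simulated chains are made up just to show how it behaves.

```python
import random
import statistics

def rhat(chains):
    """Basic (non-split) Gelman-Rubin statistic for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # Average within-chain variance.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # Between-chain variance, scaled by chain length.
    b = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5

rng = random.Random(1)
# Four simulated chains that agree with each other.
good = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# The same setup, but the last chain is stuck around a different value.
bad = [list(c) for c in good[:3]] + [[x + 3.0 for x in good[3]]]

print(rhat(good), rhat(bad))
```

Values close to 1 suggest the chains agree; a value well above 1 for either version of the model would point at a mixing problem rather than at the density function.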

If there are actually problems with the normal log density we should hand out prizes for finding them :)

Oh… you are right. I did not look that closely. Have you

Run the above with different seeds?

Rerun the thing with at least weakly informative priors?

Can you check the sampler diagnostics (stepsize, masses)? If these terms don't mean much to you, never mind.

I would suspect that something odd is going on with the warmup, and having no priors is not a good thing, but you are right that the results should not differ that much.

I have tried with and without priors of varying strength, different starting seeds, and different numbers of iterations. I pulled the priors off the example model to make it the bare minimum model necessary to see the issue.

On my machine, it is pretty consistent. That is why I thought I was not understanding the difference between these two fully.

I can put together a more comprehensive test case if that is helpful.