I’m receiving slightly different estimates from the pooled chain output of the summary() function, summary(fit)$summary, compared to using monitor(as.array(fit), digits_summary = 7).
I might be mistaken in expecting these two outputs to be the same, but I can’t think of any obvious reason why they wouldn’t be (the monitor() method returns calculated Rhat values, so it must also be pooling all chains).
Differences are minor but enough to make me wonder what’s going on. As an example, below is the output from the 8schools model for mu.
Figured it out — right line of thinking, Stephen, but a slightly different issue.
as.array() takes the number of warmup iterations from the stanfit object and drops that many draws from the start of each chain (as.array() has no arguments to change this behaviour). monitor() by default drops the first half of whatever chains are fed to it. So with default arguments (and a warmup of n_samples/2), half the samples are (correctly) dropped by summary(), but 75% of samples are (incorrectly) dropped by monitor(as.array()).
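A quick back-of-the-envelope check of the double drop, assuming the rstan defaults of 2000 iterations with warmup = iter/2 (the numbers here are illustrative, not from my actual fit):

```r
iter   <- 2000
warmup <- 1000                       # rstan default: warmup = iter / 2
post_warmup <- iter - warmup         # as.array() has already dropped warmup: 1000 draws left
monitor_kept <- post_warmup - post_warmup / 2  # monitor() then drops half of those again
monitor_kept / iter                  # only 25% of the original samples survive
```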
Easy fix is to tell monitor() to remove 0 warmup samples, as below. It turns out the as.array() call is entirely superfluous here: using monitor() directly on the stanfit object will remove the correct number of samples (shown at the bottom).
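For concreteness, both workarounds look like this (fit being the stanfit object from the 8schools model; digits_summary is just carried over from my original call):

```r
library(rstan)

# Option 1: keep as.array(), but tell monitor() that warmup
# draws have already been removed:
monitor(as.array(fit), warmup = 0, digits_summary = 7)

# Option 2: pass the stanfit object directly; monitor() reads the
# warmup length from the object and drops the right number of draws:
monitor(fit, digits_summary = 7)
```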
This gets results very close, but with some remaining funny behaviour. As seen above, all estimates are equal between monitor() and summary() except se_mean and n_eff, which still differ slightly (this is the case for all model parameters).
The se_mean difference is at the third decimal place, and the n_eff difference is similarly small, but it would be nice to know where it’s coming from. Relatedly, is there a preferred function for extracting means, quantiles, etc. from stanfit objects? I’ll default to summary() for the time being as it’s the more widely used.
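In case it’s useful to anyone, this is the summary()-based extraction pattern I’ve settled on for now (indexing by parameter name; the "mu" row is from the 8schools model above):

```r
s <- summary(fit)$summary

# Posterior mean of mu:
s["mu", "mean"]

# Selected posterior quantiles of mu:
s["mu", c("2.5%", "50%", "97.5%")]
```

rstan also provides get_posterior_mean(fit), which returns per-chain and pooled means, though that only covers means and not quantiles.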
Looks like this issue has come up before here (compare the output from print() with the m2 output), but without resolution.