Posterior_predict and draws inconsequential

Operating System: windows 7
Interface Version:rstanarm (Version 2.15.3, packaged: 2017-04-29 06:18:44 UTC)

Compiler/Toolkit:

I had a model from which I meant to draw 500 samples

Speed_rep_D <- posterior_predict(gamma_all,newdataD,draws=500)
length(Speed_rep_D)
[1] 482000

but happened to get much more - couldn’t find that explained in the help fpr posterior_predict.

By the way, here I had one factor with three levels and two random components and really were interested in comparing the posterior_predicts corresponding to the factor levels - In WinBUGS I might use “step” to see the difference directly, here I get predicted draws corresponding to the factor levels and compare outside of RStan - that cannot be the right thing to do?

Best wishes
Troels

The length function returns the number of elements in a matrix, so in this case, you get 500 draws (the rows of Speed_rep_D) from the posterior predictive distribution for each of the 964 observations in newdataD (the columns of Spped_rep_D).

I don’t understand your second question. Are you saying that you did a model like gamma_all <- stan_glmer(y ~ a + (1 | b) + (1 | c)) where a is a factor with three levels, while b and c are different factors? If so, you can construct a data.frame that has any combination of the levels of a, b, and c that you want and pass that to the newdata argument of posterior_predict.

Thanks a lot - I see how the length is explained, sorry about that. You
understood the second question right, so was my model, but with three
levels of factor a how would I specify a contrast say between level 2
and 1 to pass to the posterior_predict? I took my whole data and made
individual predictions for each level of a across everything else, and
so got 3 set of predicted values and commpared them. Just thought that
was ugly.

Best wishes

Troels

If a, b, and c were the only variables to worry about, you could easily do

df <- expand.grid(a, b, c)

to get a data.frame with every unique combination of the levels and pass that to posterior_predict. If there are additional covariates, then you have to choose what values of those to predict with. You will see people often set continuous variables equal to their sample means, but I tend to do conditional means. You can get those with

with(dataset, by(x, list(a = a, b = b, c = c), FUN = mean))