Keeping the draw order consistent with `posterior_linpred` and `posterior_epred`

Ryan_Giordano · March 7, 2024, 5:21am

It appears that every invocation of posterior_linpred and posterior_epred gives a different permutation of the draws. This prevents users from doing posterior calculations (e.g. covariances) using the draws.

Relatedly, I was unable to find an example of using the XZ argument to manually compute the same thing using the raw draws, and my naive attempts do not seem to match the output of posterior_epred.

# Simulate some logistic regression data
set.seed(42)

n_obs <- 1000

num_cat <- 3
rel_min <- 0.1
cat_probs <- runif(num_cat) 
cat_probs[cat_probs < rel_min] <- rel_min
cat_probs <- cat_probs / sum(cat_probs)
x <- sample(num_cat, size=n_obs, replace=TRUE, prob=cat_probs)
y_prob <- runif(num_cat)

df <-
  data.frame(x=x, p=y_prob[x]) %>%
  mutate(y=as.numeric(runif(n()) < p))


num_draws <- 5000

# Run rstanarm

  fit <- stan_glmer(y ~ (1 | x),
                    family = binomial(link = "logit"),
                    data = df,
                    prior = normal(0, 1, autoscale = TRUE),
                    prior_covariance = decov(scale = 0.50),
                    adapt_delta = 0.99,
                    refresh = 0,
                    seed = 1010,
                    iter = num_draws)

# Then every one of these repeated commands gives a different answer each time:
  posterior_linpred(fit, newdata=df, draws=num_draws, seed=42, permuted=FALSE)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws, seed=42, permuted=FALSE)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws, seed=42)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws, seed=42)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws - 10)[1,1]  
  posterior_linpred(fit, newdata=df, draws=num_draws - 10)[1,1]

If possible, add also code to simulate data or attach a (subset of) the dataset you work with.

Operating System: Ubuntu 22.04.4 LTS
rstanarm Version: 2.21.4

Don’t forget to add relevant tags to your topic (top right of this form) especially for application area. Delete this text before posting your topic. Thx!

Looking forward to your topic!

Ryan_Giordano · March 8, 2024, 5:37pm

With apologies, I think I should probably clarify exactly what I want to do. How can I get draws from posterior_linpred and / or poseterior_epred for multiple different design matrices, but with the posterior draws in the same order, so I can calculate covariances between them?

avehtari · March 8, 2024, 6:57pm

Would using brms instead of rstanarm be OK?

Ryan_Giordano · March 8, 2024, 7:36pm

I think so!

For some context, I am able to do this all manually myself with bespoke models and samplers, but I’d like to build some software that’s widely useable. So all I’m really after is the ability to get these draws from a sampling package that, say, your average political scientist would feel comfortable using. Probably you know better than I do whether brms fits the bill as well as rstanarm…

avehtari · March 9, 2024, 11:08am

brms is more flexible and extensible than rstanarm, and people are building packages on top of brms, see Flocker: an R package for occupancy modeling with `brms` - #8 by Ven_Popov

Ven_Popov · March 9, 2024, 11:56am

I cannot speak how it compares with rstanarm as I haven’t used that package, but what @paul.buerkner has built with brms is just amazing and amazingly flexible. I’ve used brms for years for out of the box analysis, but I only really started to appreciate all the work that has gone into it when we started building a package with custom models that uses brms as the backend to generate the stancode and the framework for all postprocessing methods. The goal of our package is similar to your goal - to make complex models accessible to a broad base of researchers who might not have the computational background to implement them themselves. So far it seems that any model you could code in Stan you can build together with brms. Others have done similar things for other domains (see the list @avehtari linked to). So I definitely suggest trying it out. And in addition, Paul has been super responsive to making changes in brms that makes such work easier

Ryan_Giordano · April 1, 2024, 9:36pm

I can confirm that the draw ordering works with brms, so I’ll just go with that. It does seem to me that rstanarm might want to get that fixed, but thanks to brms it won’t hold me up. Thanks!

wpetry · April 2, 2024, 4:11am

It’s worth opening an issue for rstanarm to report the issue and get this behavior fixed: Issues · stan-dev/rstanarm · GitHub

Topic		Replies	Views
Posterior_predict and draws inconsequential rstanarm	3	930	August 27, 2017
Confusion on difference between posterior_epred() and posterior_predict() in a mixed effects modelling context Modeling mixed-model	6	3408	July 27, 2023
Posterior::as_draws_df() fails with stanreg object from stan_glm() rstanarm rstanarm	4	419	July 26, 2023
Use Posterior_predict in rstanarm to generate probabilites for each observation in a logistic regression model Modeling	6	1341	February 22, 2019
Constrained Optimization in Rstan RStan rstan	1	592	February 7, 2021

Keeping the draw order consistent with `posterior_linpred` and `posterior_epred`

Related topics