Brms sampling/speed errors in multi-level model

Hi all,

I’m having some trouble getting the Mr part of a MrP model to fit efficiently with brms. I’m hoping someone may be able to help guide me toward a better fit.

Here is the R code for the model:

vote_model <- brm(mvbind(vote_trump,vote_clinton) ~ 1 +
                    # pop level
                    state_clinton + state_black_pct + state_hispanic_pct + state_white_protestant + 
                    # group effects
                    (1|sex) + (1|race) + (1|age) + (1|edu) + (1|state_name) + (1|region) +
                    # interactions b/t group effects
                    (1|race:edu) + (1|age:edu) + (1|race:region),
                  data = cces,
                  family = bernoulli(link = "logit"),
                  # stan stuff
                  iter=1000, chains=1, cores=1, warmup=500,
                  algorithm='sampling',
                  save_model = 'output/stan_model_code.stan', 
                  # priors (https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations)
                  prior = c(set_prior("normal(0, 1)", class = "Intercept"),
                            set_prior("student_t(3, 0, 4)", class = "sd",resp=c('votetrump','voteclinton'))),
                  # help sampling algorithm converge (http://mc-stan.org/misc/warnings.html)
                  control = list(adapt_delta = 0.95,
                                 max_treedepth = 10),
                  seed = 20190605,
                  refresh = 1)

I let it run for about 12 hours and then gave up, after which it had only completed 200 iterations. The dataset is just 40k survey respondents, so that doesn’t seem right.

Help?

Elliott

Session info:

R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Welcome to Stan discourse! When asking brms related questions, please use the Interfaces - brms tag. I have changed it for you this time.

From quickly looking at your model, I have a few observations:

(1) Is the model really intended to be multivariate as implied by mvbind(…). You may want to start with one single outcome variable to get a feeling of how well it works.

(2) You included so many group-level terms that you are probably overfitting the data. For instance, is age really a grouping variable in your data or rather a continuous variable?

(3) mulitlevel bernoulli models often require a very long time at the start before they continue to sample faster. Having to wait 12 hours for 200 iterations may not imply that you need 5 times the time for 1000 iterations but perhaps only twice as much.

1 Like

Hey Paul,

Thanks for adding the proper tag. Will remember.

For the sake of trying to get this to work, I’ve changed to a single outcome model and removed the (1|age) and (1|age:edu) group terms, though to be sure, both will be necessary for the final approach.

I’m still not too pleased. The gradient evaluation stipulated 343.33 seconds for 1k transitions, and after waiting all night it had just barely completed 400. Surely that’s too long? Should I be looking around under the hood?

Honestly, I am not surprised it takes that long. You may need to be more patient, or, if you want to get a quicker look at how well the model works, use subsampling and then only run analysis on the full sample once you are happy with the model.

Well, if you’re not surprised, I guess I shouldn’t be either :) Thanks.

Collapsing the data into counts and modeling it as a binomial (instead of bernoulli at the individual level) could be faster? I’m also running MRP with a full year of CCES and it only takes ~2 hours, although I don’t have as nearly as many covariates as you do.

By counts, something like

cces_counts <- cces %>% 
    group_by(state_name, state_black_pct, state_hispanic_pct, state_white_protestant, sex, race, age, edu) %>% 
    summarize(trials = n(),
              clinton = sum(vote_clinton))

Good suggestion. I’m running this via the ‘categorical’ family and with even more covariates at just under 12 hours now. Turns out the biggest issue earlier was a data error that messed up a factor variable and prevented efficient convergence.