Are complex surveys feasible in brms?

Disclaimer: I am continuing the conversation to learn myself.

OK, so svyglm uses a hierarchical model under the hood to account for homogeneity within the strata samples?

I am asking because, in your post, the estimated fixed effects of the lmer model (hierarchical design + weights) and the svyglm model are not the same. Maybe the svyglm and brms results are more similar because brms does a better job of estimating the variance of the random intercept? Or maybe svyglm estimates the hierarchical model (if it uses one) differently than lmer?

  • ML estimates of random effects variance are biased towards 0, which is consistent with the higher value in brms compared to lmer. But I don’t know if this explains the difference in the fixed effects.

No, svyglm uses the weighted likelihood and clustered (sandwich) standard errors. It also corrects the degrees of freedom for tests to (usually) be based on the number of strata and clusters rather than the sample size.
The idea of using the hierarchical model is basically an alternative perspective on controlling for the effects of the survey design (non-randomness).
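To make the distinction concrete, here is a sketch (untested; the data frame `dat` and its columns `cluster`, `stratum`, `wt`, `y`, `x` are hypothetical) of a design-based svyglm fit next to a naively weighted glm:

```r
library(survey)

# Declare the sampling design: clusters, strata, and sampling weights
des <- svydesign(ids = ~cluster, strata = ~stratum, weights = ~wt,
                 data = dat, nest = TRUE)

m_svy <- svyglm(y ~ x, design = des)           # sandwich (linearization) SEs
m_glm <- glm(y ~ x, data = dat, weights = wt)  # same point estimates, naive SEs

degf(des)  # design degrees of freedom: number of PSUs minus number of strata
```

The two fits give essentially the same coefficients; the difference is entirely in the standard errors and the degrees of freedom used for tests.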


Hi Esteban,
We recently published an article using brm() (from the brms package) with the DHS data for Ethiopia, and I’m happy to share our code with you if you like. Our basic approach was:

```r
options(mc.cores = parallel::detectCores())

# Per the DHS user manual, rescale the raw weight variable v005
mdatall$weight <- mdatall$v005 / 1000000

fit1b <- brm(outcome | weights(weight) ~ scale(x1) + (1 | region) + (1 | strata),
             data = mdatall, family = bernoulli(),
             chains = 2, cores = 2, iter = 4000, warmup = 1500)
```

Paper is open access here


Per Esteban’s original question: I doubt that there is an implementation of design-adjusted, model-based estimation in brms that is as simple to use as the design-based survey models. My understanding is that design-based estimators do not assume an underlying distribution of values across design variables and so are “automatic” in a sense. Per Roderick Little, “Design weighting… has the virtue of simplicity, and by avoiding an explicit model it has an aura of robustness to model misspecification” (“Comment: Struggles with Survey Weighting and Regression Modeling” 2007, p. 1).

By contrast, model-based analyses require assumptions about the relationships between the outcome and the design variables; that is, they require specifying a model for the design variables. Currently, there is no single model that gives the “best” estimates in all cases and can be applied automatically. See “Bayesian Nonparametric Weighted Sampling Inference” by Si, Pillai, and Gelman (2015) for a comparison of different models for the sampling weights.

Guido stated that weighting the likelihood (thus, a pseudolikelihood) is “not considered fully Bayesian.” This is because it is not “generative,” i.e. it is not “a proper probability distribution for the parameters” (Bayesian Data Analysis, Gelman et al., 3rd ed., p. 336). Bob Carpenter wrote up a nice illustration of this issue. One solution is to estimate parameters within cells (ideally with shrinkage) and post-stratify on known cell counts (MRP). The alternative is to model the relationship between the design variables and the outcome (see again the article by Si, Pillai, and Gelman, above; or Chapter 8 of BDA).
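The MRP idea above can be sketched in a few lines. This is a minimal, hypothetical example (the data frames `sample_data` and `pop_cells` and the column names are assumptions, not from the thread): fit within cells with shrinkage, then poststratify on known population cell counts.

```r
library(brms)

# Partially pooled cell estimates (shrinkage via the random intercept)
fit <- brm(y ~ (1 | cell), data = sample_data, family = bernoulli())

# pop_cells: one row per poststratification cell, with known count N
p <- fitted(fit, newdata = pop_cells, allow_new_levels = TRUE)[, "Estimate"]

# Weight cell predictions by population counts, not sampling weights
theta_hat <- sum(pop_cells$N * p) / sum(pop_cells$N)
```

Note that the weights enter only through the known cell counts at the poststratification step, so the likelihood itself stays fully generative.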

And, as Corey’s RPubs example shows, weighting the likelihood doesn’t produce the same standard errors as the design-based survey estimates. Instead, it produces basically the same standard errors as the model-based frequentist analysis in lmer. In that way, weighting is not sufficient to get wide enough standard errors and accurate coverage.

Note that the scale of the weights matters when estimating models in Stan. This is not true of frequentist analyses, other than estimates of population totals (see page 100 of “Applied Survey Data Analysis” by Heeringa, West, and Berglund, 2017). This is important to know in general, since weights oftentimes represent the number of units in the population. In Corey’s example, the weights are scaled to sum to N prior to estimation in brms, which is why they are similar to the lmer results.
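For reference, one common normalization (a general convention, not a prescription from the thread) is to rescale the raw weights so they sum to the sample size n before passing them to weights() in brms; names here are hypothetical:

```r
# Rescale raw weights wt so they sum to the number of observations n.
# Relative weights are preserved; only the overall scale changes.
dat$wt_scaled <- dat$wt * nrow(dat) / sum(dat$wt)
```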

One “non-Bayesian” workaround may be to estimate a design-effect using the survey package and then scale the weights down according to the design effect (e.g. if the design effect is 2, have the weights sum to N/2). This is in theory similar to the post-estimation adjustment approach in frequentist methods, where you scale the standard errors according to the design effect. From experience, this produces similar results to estimates based on the survey package, but it never perfectly replicates the estimates. I haven’t seen any mathematical proofs of this working, nor have I seen it suggested elsewhere. So proceed with caution.
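The workaround described above might look like the following sketch (untested; `dat` and its columns are hypothetical, and the design effect is taken from a single outcome mean for simplicity):

```r
library(survey)

des <- svydesign(ids = ~cluster, strata = ~stratum, weights = ~wt, data = dat)

# Estimated design effect for the outcome mean
d <- deff(svymean(~y, des, deff = TRUE))[1]

# Dividing by d shrinks the weight total from ~N to ~N/d, which widens
# the resulting posterior intervals in a brms fit that uses these weights.
dat$wt_deff <- dat$wt / d
```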


Hi all,

First, let me thank you for all the time and effort all of you put into the thread. I have tested several approaches and obtained similar estimates across lmer, brms, and INLA, but not similar standard errors, as pointed out by Simon:

… weighting the likelihood doesn’t produce the same standard errors as the design-based survey estimates. Instead, it produces basically the same standard errors as the model-based frequentist analysis in lmer . In that way, weighting is not sufficient to get wide enough standard errors and accurate coverage.

That said, I don’t wanna obsess over MLM+brms or be too picky about the standard errors. After 2 yrs of my PhD dealing with those DHS data, and being happy with the survey package and the conservative s.e. of design-based surveys, I strongly think that MLMs might be a good fit for evaluating random effects of population-level covariates (not only geospatial) with the nested structure of DHS datasets (DHS survey locations are nested within Admin1-level regions, which are themselves nested within countries). In addition, more and more population health datasets are design-based (e.g. NHANES, BRFSS), and this might extend to several other problems. So that’s why I am still curious about what seems to be an open question in several forums.

I am following the approach recommended by Skinner and Wakefield [URL]. I am also aware of Corey’s point of view about adding the design variables as fixed effects; I will check your paper.

In the meantime, I am still using survey+poststratification but any advice is welcome.

Best regards,


Just for posterity, this was an interesting read:

Also see the last sentence 🙂


Very good read, especially for people coming from disciplines where weighting isn’t common. In the social sciences, all surveys collected by national/federal sources have weights so that the sample can be representative of the population; that’s what the original post was about. The problem is that a lot of traditional statistics students never see this kind of thing in their studies, so there is confusion around these topics.

This area (Bayesian stats on complex survey samples) is very near and dear to my heart, and I’d be happy to continue the discussion about making correct inferences from models that use these data in a Bayesian modeling paradigm.


I’m down for a Bayesian complex survey discussion group if people are! I’ve been interested in this topic for a while, doing a calculation here and there and it’d be nice to have people to chat to about this. Tagging @lauren and @jonah as others are already on this thread.

@lauren runs a great MRP discussion group via zoom every Wednesday. The topics covered include complex survey topics in general (not exclusively MRP). I recommend joining the group, which just entails reaching out to @lauren for an invite (all are welcome). In case @lauren doesn’t see the post here, I know other people have reached out to her via twitter (@jazzystats) to get the link to the group.


Hi folks,

Yes we do run a reading group, where we cover surveys, Bayesian inference and the like. You’re all welcome, the current method is to reach out to me or Rohan Alexander (UofT), and we send you an email to the google group.
Sorry I missed the topic tag initially. Broadly, the answer is no - brms won’t do the same thing with weights as a design-based analysis in the survey package, for the reasons outlined earlier. Also, I agree with Lumley - Bayesian stats, random effects, and sampling design/weights really hurt my head!


No, absolutely not! I had no intention of doing that; I’m contacting both off this post.

As there seems to be some interest in this, I’ll also point to Roderick Little’s work on “Calibrated Bayes”, which seems highly relevant. (I just stumbled over this while procrastinating.)

Edit: OK, most of you might already be aware of that work …


Hi all,

Let me share with you this fresh report from USAID (August 2020) about doing MLM with their complex surveys, to enrich the discussion around brms:

Multilevel Modeling Using DHS Surveys: A Framework to Approximate...

Briefly, it seems suited to the WeMix package, but I will give it a try.



There’s a lot of great discussion here! I recently posted a review article on one-level models for survey data: (Pseudo) Bayesian Inference for Complex Survey Data

For multi-level models, there is an additional layer of complexity if the sampling design is also multi-stage and the sampled units correspond to entities in the model hierarchy - for example, sampling schools then students, or hospitals then patients. There are competing approaches (are we surprised?!), but they all seem to need the sampling weights for each stage, which are often not available. This challenge is well known in the survey literature, but most approaches are not satisfactory. There’s a lot of potential in taking a Bayesian (or pseudo-Bayesian) approach. We have a couple of pre-prints in this space that might be interesting:

The general approach with perhaps some theory missing

Some more theory for a super simple model

Both papers do review some of the competing methods so take a look!

We’ve been using Stan (of course!) but with multiple weights, we are using custom models and Rstan instead of using packages like brms.


I wanted to add to the discussion a “new” (newly published, but drafts have seemingly been available for a while) article: “Bayesian Hierarchical Weighting Adjustment and Survey Inference” and the accompanying code. My reading of the article is that it gets toward a solution that @maurosc3ner was hoping for: a relatively “automatic” model for survey inference, though one that requires population counts for post-stratification cells. @jonah is one of the authors, so feel free to correct me if I’m misunderstanding the implication of the model.

The paper describes a model for cell means in a regression model, but it seems like you should be able to extend the same model for global-local shrinkage to cell-specific regression coefficients as well.


Based on this Si et al. paper, it seems like MLM for these survey designs should be possible. Has an implementation for brms been developed since this paper came out?

@maurosc3ner I’ve been sitting on this paper for a while but just now thinking to mention it here: Fully Bayesian Estimation under Dependent and Informative Cluster Sampling by Luis G. Leon-Novelo and Terrence D. Savitsky.

I believe it checks the boxes

  1. fully Bayesian
  2. estimates regression parameters
  3. only requires cluster indicators and weights for the observed data (not for the entire population of clusters)

I’m not aware of a publicly available implementation of the model but I believe you could code up a function that works identically to the survey package but with Stan as the backend.


Hi Corey,
I read the conversations in this post with much interest, and I stumbled upon the paper and code you shared here. I work extensively with DHS data and am now planning to move to Bayesian methods for all my modeling. I found your approach most useful. I plan to implement multinomial logistic regression using the brms package and was hoping to review your code for the paper you mentioned here. I am writing a separate email to you requesting the same. Hope you don’t mind.

FYI, I am a public health researcher with only applied knowledge of brms and Stan, hoping to learn from those who have a deeper understanding of both.

@tds151 is an author and could probably share the Stan model and code.



Currently, I am working on survival analysis rather than surveys, but when I get some extra time I will save this info and test it.