Hello,
I would like to apply survey weights with brms in such a way that standard errors are estimated correctly. To achieve this, I’m trying to replicate the approach showed in this post by @JimBob - that is, dividing the survey weights by general design effect. However this doesn’t lead to results that would agree with survey package.

My benchmark is the following model:

library(survey)
data(api)
apistrat_design <- svydesign(ids = ~1, data = apistrat, weights = ~pw) #Ignore strata for now
m1 <- svyglm(api.stu ~ 1, design = apistrat_design) #Intercept SE: 22.16

And this is my attempt at replicating the result with brms:

Unfortunately, this doesn’t work, as the standard error for intercept is still less than 1/4 the size it should be. At first, I thought that this may be because of the default priors, but using flat priors for everything didn’t change the results much.

I would be very grateful for any pointers.

(Also, I’m aware of multilevel models with poststratification, but I’d like to get survey weights working on their own, as this is how people are used to do weighting in my field.)

What happens if in brm you use ~ 1 rather than ~ 0 + Intercept?

In addition I used a function from one of the packages which I think was called general_design_effect (you can check in the code that was shared in that post) - I’m afraid I can’t check as my laptop is out of action.

I’ve checked and the design effect computation indeed seems to be the problem. Design effect computed by my function (which is based on Kisch’s formula) is 1.19, while the one computed by survey is 0.72. Still not sure why. The sum of the original weights indeed matches population total.

@AlesVomacka did the approach overall work with the correctly applied design effect computation? Am interested to see how this works out for people in practice

@JimBob sorry for a late response. Unfortunately, using the design effect computed by survey doesn’t give correct result either. For the record, here is the code:

Interesting. The approach I tested only included a very simple design, essentially just person weights without specific layers of stratification or anything like that so I wonder if that has anything to do with it, if your sampling design is in any way more complex. If you are able to share any code etc it might help see if this is resolvable or not