Convergence issues with sampling weights - rstanarm

dkaplan · July 5, 2022, 5:59pm

Hi all, I want to ask another question about the use of sampling weights, particularly with rstanarm. I have a very simple random intercept model, and it runs like a charm. When I add the sampling weights supplied by the survey, the program slows down dramatically, which is not surprising, but the problem is that I can’t get convergence of either the random intercept or its standard deviation. Rhat is large and n_eff is much smaller than it should be. As a result, the program is throwing the bulk and effective sample size error messages and suggesting many more iterations (I’m using 15K iterations with 4 chains and a thinning interval of 10).

To be clear, I fully concur with the issues that were raised in previous posts about the problems with sampling weights and the violation of the likelihood principle. The reason I’m trying to add the weights is a long story, but suffice it to say that the purveyors of this large (and very policy-important) survey (Organization for Economic Cooperation and Development) are very interested in what a Bayesian approach could provide but would find it hard to swallow a Bayesian approach if sampling weights were not included. I’m trying to convince them otherwise. So, my question is whether there are any tricks of the trade for stabilizing the weights in order to get convergence of the random intercept and its standard deviation.

Thanks,

David

Bob_Carpenter · July 26, 2022, 9:48pm

I don’t know how to get at the model that rstanarm uses here.

That’s the same situation @andrewgelman and previous postdocs (@Yajuan_Si, @Lauren) were in at Columbia with the School of Social Work. I don’t know what they used to fit, but maybe they’ll have some advice. Having shared an office with them, I can vouch for the fact that the fitting was non-trivial.

lauren · July 29, 2022, 1:07am

Hi folks,

Perhaps one useful tip that I’ve used before (e.g., in https://osf.io/preprints/socarxiv/3v5g7/) is to normalise the weights so that they have a mean of 1, i.e., so that they sum to the sample size.
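As a minimal sketch of that normalisation in base R (the weight values and the names `w_raw` and `w_norm` are made up for illustration, not taken from any survey): dividing each weight by the mean weight gives weights that average 1 and therefore sum to the number of observations.

```r
# Hypothetical raw survey weights (made-up values, for illustration only)
w_raw <- c(120.5, 80.2, 310.0, 95.7, 150.1)

# Divide by the mean so the weights average 1;
# they then sum to the sample size n = length(w_raw)
w_norm <- w_raw / mean(w_raw)

print(mean(w_norm))  # average weight after rescaling
print(sum(w_norm))   # total weight after rescaling
print(length(w_raw)) # sample size, for comparison

# The rescaled weights could then be passed to the model, for example
# (not run; `dat`, `y` and `school` are hypothetical names):
# dat$w_norm <- dat$w_raw / mean(dat$w_raw)
# fit <- rstanarm::stan_lmer(y ~ 1 + (1 | school), data = dat, weights = w_norm)
```

The rescaling leaves the relative sizes of the weights unchanged; only the total changes.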

Rstanarm/brms use these weights as literal frequency weights (e.g., if an observation has a weight of 2, rstanarm will interpret it as literally 2 identical observations). This could potentially cause some funky convergence issues, along with ultra-precise uncertainty intervals.
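A toy calculation, not rstanarm's actual code, may help show why the scale of the weights matters under a frequency-weight reading. It assumes a normal mean with known residual SD and a flat prior, where the weighted log-likelihood gives a posterior SD of sigma / sqrt(sum(w)). The data, weights, and the helper `post_sd` are all hypothetical.

```r
# Hypothetical outcomes and raw survey weights (made-up values)
y      <- c(2.1, 3.4, 1.9, 4.2, 2.8)
w_raw  <- c(120.5, 80.2, 310.0, 95.7, 150.1)
w_norm <- w_raw / mean(w_raw)  # rescaled to sum to n = 5

sigma <- 1  # assumed known residual SD, to keep the algebra simple

# Posterior SD of a normal mean (flat prior) when each observation's
# log-likelihood is multiplied by its weight: sigma / sqrt(sum(w))
post_sd <- function(w, sigma) sigma / sqrt(sum(w))

# The point estimate is the weighted mean, which does not depend on
# the overall scale of the weights
print(weighted.mean(y, w_raw))
print(weighted.mean(y, w_norm))

# The implied uncertainty does: raw weights behave like sum(w_raw)
# observations, normalised weights like n observations
print(post_sd(w_raw, sigma))
print(post_sd(w_norm, sigma))
```

In this toy setting the raw weights act like several hundred observations rather than five, which is the kind of overconfidence described above.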

dkaplan · August 2, 2022, 5:50pm

Thanks for all the assistance. I found that normalizing the sampling weights so that they sum to the sample size seems to stabilize things nicely.

andrewgelman · December 30, 2022, 5:37pm

As Bob says, the problem is nontrivial!

One reason is that the concept of “sampling weights” is not uniquely defined. Different analyses are appropriate in different settings. Here are some relevant papers:

We’re still trying to figure out good general-purpose plug-and-play solutions to inference with survey weights.
