Hi all, I want to ask another question about the use of sampling weights, particularly with rstanarm. I have a very simple random intercept model, and it runs like a charm. When I add the sampling weights supplied by the survey, the program slows down dramatically, which is not surprising, but the real problem is that I can't get convergence of either the random intercept or its standard deviation: Rhat is large and n_eff is much smaller than it should be. As a result, the program is throwing the bulk and tail effective sample size warnings and suggesting many more iterations (I'm using 15K iterations with 4 chains and thinning of 10).
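For concreteness, here is a minimal sketch of the kind of call I mean (the data frame and variable names are placeholders, not the actual survey variables):

```r
library(rstanarm)

# Placeholder names: dat, score, school, wt stand in for the real survey data.
fit <- stan_glmer(
  score ~ 1 + (1 | school),   # simple random-intercept model
  data    = dat,
  weights = wt,               # survey-supplied sampling weights
  chains  = 4,
  iter    = 15000,
  thin    = 10
)
```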
To be clear, I fully concur with the issues raised in previous posts about the problems with sampling weights and the violation of the likelihood principle. The reason I'm trying to add the weights is a long story, but suffice it to say that the purveyors of this large (and very policy-relevant) survey (the Organization for Economic Cooperation and Development) are very interested in what a Bayesian approach could provide, but would find it hard to swallow a Bayesian approach that did not include sampling weights. I'm trying to convince them otherwise. So, my question is whether there are any tricks of the trade for stabilizing the weights in order to get convergence of the random intercept and its standard deviation.
I don’t know how to get at the model that rstanarm uses here.
That's the same situation @andrewgelman and previous postdocs (@Yajuan_Si, @Lauren) were in at Columbia with the School of Social Work. I don't know what they ended up fitting, but maybe they'll have some advice. Having shared an office with them, I can vouch for the fact that the fitting was non-trivial.
Perhaps one useful tip that I've used before (e.g., in https://osf.io/preprints/socarxiv/3v5g7/) is to normalise the weights so that they have a mean of 1, i.e. sum to the sample size.
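For example, in R (dat and wt are placeholder names for the data frame and the raw weight column):

```r
# Normalise the survey weights so they average 1, i.e. sum to the sample size.
dat$wt_norm <- dat$wt / mean(dat$wt)

# Equivalent: dat$wt * nrow(dat) / sum(dat$wt)
stopifnot(isTRUE(all.equal(sum(dat$wt_norm), nrow(dat))))
```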
rstanarm/brms use these weights as literal frequency weights (e.g., if you have a weight of 2, rstanarm will interpret it as having literally 2 identical observations). This can cause some funky convergence issues, along with overly precise uncertainty intervals.
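A small sketch of what I mean, using simulated data (the point is that the weighted fit behaves like a fit to duplicated rows, not like a design-based estimate):

```r
library(rstanarm)
set.seed(1)

d <- data.frame(y = rnorm(50), x = rnorm(50))
w <- rep(c(1, 2), length.out = 50)

# A weight of 2 is treated as "this row appears twice" ...
fit_weighted   <- stan_glm(y ~ x, data = d, weights = w, refresh = 0)

# ... so the posterior should look much like literally duplicating those rows,
# and both fits will have narrower intervals than the actual n = 50 warrants.
fit_duplicated <- stan_glm(y ~ x, data = d[rep(seq_len(nrow(d)), w), ],
                           refresh = 0)
```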
One reason is that the concept of “sampling weights” is not uniquely defined. Different analyses are appropriate in different settings. Here are some relevant papers: