Survey weighted regression


Thanks for clarifying! My confusion about how weights are treated by rstanarm stems from the fact that ?stan_lm says that they’re treated the "same as lm".

I don’t think lm can be taking an approach analogous to normal_lpdf(y[i] | yHat[i], sigma) * weights[i] though, because lm weights are invariant to scale:

> y <- rnorm(100)
> w <- runif(100)
> fit_w   <- lm(y ~ 1, weights = w)
> fit_10w <- lm(y ~ 1, weights = 10*w)
> SE <- function(fit) summary(fit)$coef['(Intercept)', 'Std. Error'] 
> SE(fit_w) == SE(fit_10w)
[1] TRUE

Perhaps it would be clearest to say that weights in rstanarm are replication counts that treat each observation as one or more real observations (analogous to Stata’s fweights).

This would help clarify that scale very much does matter for rstanarm weights – the sum of the weights is equal to the number of observations in the original dataset.
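To see why scale matters when the weights multiply each log-likelihood term, but not in lm, here is a short derivation for the intercept-only case. It is only a sketch under simplifying assumptions that do not come from the posts above: sigma is treated as known in the weighted likelihood, c > 0 is an arbitrary rescaling constant, and n is the number of rows.

```latex
% Weighted log-likelihood, intercept-only normal model, \sigma known (assumption)
\ell_w(\mu) = \sum_{i=1}^{n} w_i \log \mathcal{N}(y_i \mid \mu, \sigma^2)
\quad\Rightarrow\quad
\hat{\mu} = \frac{\sum_i w_i y_i}{\sum_i w_i},
\qquad
-\frac{\partial^2 \ell_w}{\partial \mu^2} = \frac{\sum_i w_i}{\sigma^2},
\qquad
\mathrm{SE}(\hat{\mu}) \approx \frac{\sigma}{\sqrt{\sum_i w_i}}

% Rescaling w_i \to c\,w_i leaves \hat{\mu} unchanged but shrinks the SE
\mathrm{SE}_{c}(\hat{\mu}) \approx \frac{\sigma}{\sqrt{c \sum_i w_i}}
= \frac{\mathrm{SE}(\hat{\mu})}{\sqrt{c}}

% lm instead estimates the residual variance from the weighted residuals,
% so the factor c cancels
s^2 = \frac{\sum_i w_i (y_i - \hat{\mu})^2}{n - 1},
\qquad
\mathrm{SE}_{\mathrm{lm}}(\hat{\mu}) = \sqrt{\frac{s^2}{\sum_i w_i}}
\;\xrightarrow{\,w_i \to c\,w_i\,}\;
\sqrt{\frac{c\, s^2}{c \sum_i w_i}} = \mathrm{SE}_{\mathrm{lm}}(\hat{\mu})
```

With c = 10, as in the lm example above, the weighted-likelihood standard error would shrink by a factor of sqrt(10), roughly 3.16, while the lm standard error is unchanged.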


The stan_glm and stan_lm functions do things a little differently. For most models, the weights are interpreted as frequency weights. For stan_lm, they can be like the weights in generalized least squares.


@Guido_Biele. Thanks for this very helpful approach. I have two questions:

(1) How could we make the relationship between the weight and the variance explicit in the model below?

The reason is that we have two types of inverse variance weights:

weight_RandomEffects = 1/(tau^2 + sigma^2)
weight_FixedEffects = 1/sigma^2

(2) Could it be correct to implement directly in a stan program the relevant equations from here, here, here, or elsewhere by adding a transformed parameters block to the model below?

For example,

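transformed parameters {
  // assumes w, y and sigma are vector[N] and y_mu is a real
  real tau;  // one between-studies SD, so a scalar rather than vector[N]
  tau = sqrt(sum(square(w) .* (square(y - y_mu) - square(sigma))) / sum(square(w)));
}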

Thanks in advance.


I can look a bit more into this on Monday.
Generally, you can use weights in Stan just as you can when computing maximum likelihood estimates (though not everyone agrees one should).
The reason I am hesitant in my answer is that I am unsure what the sigma in the inverse variance weights is. I assume it is the sigma of the effect sizes, and that in addition one estimates an error variance, but I can’t tell without access to the paper. (I am also not sure what tau is.)

As an aside, you can use brms to estimate meta-analysis models. See for example here:


Great, @Guido_Biele. Here, we have two sources of variability in the effect sizes of the primary studies:

tau^2 = between-studies variance
sigma^2 = within-study variance
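Written with a study index i, these two variances lead to the two weights mentioned earlier in the thread. The following is a sketch of the standard inverse-variance setup, assuming each study i reports an effect y_i with its own within-study variance sigma_i^2 and that mu denotes the pooled effect. The subscripts, u_i, e_i and mu are notation introduced here for illustration, not taken from the paper under discussion.

```latex
% Random-effects model: study effect = pooled effect + between-study + within-study noise
y_i = \mu + u_i + e_i, \qquad
u_i \sim \mathcal{N}(0, \tau^2), \qquad
e_i \sim \mathcal{N}(0, \sigma_i^2)
\quad\Rightarrow\quad
y_i \sim \mathcal{N}(\mu,\; \tau^2 + \sigma_i^2)

% Inverse-variance weights (the fixed-effects case corresponds to \tau^2 = 0)
w_i^{\mathrm{RE}} = \frac{1}{\tau^2 + \sigma_i^2}, \qquad
w_i^{\mathrm{FE}} = \frac{1}{\sigma_i^2}

% Pooled estimate and its variance for a given \tau^2
\hat{\mu} = \frac{\sum_i w_i y_i}{\sum_i w_i}, \qquad
\mathrm{Var}(\hat{\mu}) = \frac{1}{\sum_i w_i}
```

In this form each weight is simply the precision of the marginal distribution of y_i, which suggests that a model stating the variance tau^2 + sigma_i^2 explicitly may not need a separate weight on the likelihood.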

tau can be calculated using the formula shown above in the transformed parameters block.

sigma is calculated as follows:

These variances are then used to estimate the inverse variance weights:

Thanks in advance.