Why shouldn't I use weights?

I’ve been told that I shouldn’t use weights the way I would in a standard GLM. In the insurance industry, each observation is weighted by the amount of risk associated with it.

Why shouldn’t Bayesian methods use weights?

Non-Bayesians use “weights” in a lot of different ways, so it is not easy to enumerate all the criticisms. In the insurance case (and in most cases), the primary objection is that the model being estimated is not fully generative if the risk associated with each observation is taken to be exogenous rather than produced by the model. This would be less problematic if you had the true weights, but you only have a (point) estimate of the individual’s risk (often from some other non-Bayesian model), and the uncertainty in those estimates is not propagated through to the posterior distribution over the unknowns in the present model.

In a GLM setting you may think of a weighted regression as a linear regression with heteroscedastic variance, e.g. $y_i \sim N(\alpha + \beta x_i, \sigma_i^2)$.
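To make the heteroscedastic reading concrete, here is a minimal Stan sketch. It assumes the weights w are known, strictly positive precision weights, so that the standard deviation of observation i is sigma / sqrt(w[i]). That mapping, the variable names, and the priors are assumptions for illustration, not something stated in this thread.

```stan
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
  vector<lower=0>[N] w;  // assumed: known, strictly positive precision weights
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  // illustrative weakly informative priors (an assumption)
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ exponential(1);
  // heteroscedastic likelihood: sd of observation i is sigma / sqrt(w[i])
  y ~ normal(alpha + beta * x, sigma ./ sqrt(w));
}
```

A larger w[i] means a smaller variance for that observation, so it pulls harder on alpha and beta, which is the same effect a weight has in weighted least squares.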

I’m not sure that weights are a problem so long as they can be internalised into the model. That is, say we have data $(y_i, x_i, w_i)$ with $x_i, y_i$ related as above and $w_i \sim N(\mu, \gamma^2)$; I don’t think that’d be a problem. The model would remain generative.
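As a rough sketch of what internalising the weights could look like in Stan: the program below gives w its own distribution with unknown mu and gamma, and lets w set the observation variance through sigma / sqrt(w[i]). The link between w and the variance, the priors, and all names are assumptions for illustration. A normal distribution is kept for w only because that is what is written above; for strictly positive weights something like a lognormal may be more natural.

```stan
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
  vector<lower=0>[N] w;  // weights, now treated as modeled data
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  real mu;               // mean of the weight distribution
  real<lower=0> gamma;   // sd of the weight distribution
}
model {
  // illustrative priors (an assumption)
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ exponential(1);
  mu ~ normal(0, 10);
  gamma ~ exponential(1);
  // generative model for the weights
  w ~ normal(mu, gamma);
  // outcome model conditional on the weights
  y ~ normal(alpha + beta * x, sigma ./ sqrt(w));
}
```

One caveat: when every w[i] is fully observed and (mu, gamma) share nothing with (alpha, beta, sigma), the joint density factorises, so the posterior for the regression parameters is the same as if w had simply been conditioned on. Modeling w changes the inference only when the weights are noisy, partly missing, or tied to the outcome model through shared parameters.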

I guess the issue is (as @bgoodri said) if you’re plugging in unmodeled weights, because then you wouldn’t end up with a posterior you can sample from, and the model would not be generative.

What does the second argument in w_i represent? Would the weight go into the model as an additive term in actual Stan code? Something like this, or close to it?

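model {
  // x: N-by-K predictor matrix, beta: K coefficients, w: per-observation weight
  // w enters the log-scale linear predictor as an additive term
  y ~ poisson_log(intercept + x * beta + w);
}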

I apologize as I haven’t learned to use weights in this context.

The mean and variance are the same for the Poisson distribution so it’s not really an exemplar use case. I suppose you could treat weight as another additive factor as you’ve suggested, but others may be more authoritative regarding the right approach here.
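For what it’s worth, here is a hedged sketch of the additive version for count data. It assumes the insurance weight plays the role of an exposure (for example, policy-years at risk), in which case it usually enters on the log scale as an offset rather than as a raw additive term. The names exposure, X, and K, the priors, and the exposure interpretation itself are assumptions, not something confirmed in this thread.

```stan
data {
  int<lower=0> N;
  int<lower=1> K;
  matrix[N, K] X;                // predictors
  array[N] int<lower=0> y;       // claim counts
  vector<lower=0>[N] exposure;   // assumed: strictly positive exposure per observation
}
parameters {
  real intercept;
  vector[K] beta;
}
model {
  // illustrative priors (an assumption)
  intercept ~ normal(0, 5);
  beta ~ normal(0, 1);
  // log(exposure) is an offset: expected count = exposure * exp(intercept + X * beta)
  y ~ poisson_log(log(exposure) + intercept + X * beta);
}
```

The array[N] int declaration uses the newer Stan array syntax; older releases wrote it as int y[N]. With the offset, doubling an observation’s exposure doubles its expected count, and the model stays generative as long as the exposure is something you actually know rather than an estimate.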

This is what I learned from asking @lauren this question. It provides examples of what @bgoodri is talking about, which is that uncertainty estimates are not calibrated when weights are used:

https://statmodeling.stat.columbia.edu/2019/10/29/non-random-missing-data-weights-generative-model/

There is a section in the Stan user’s guide on how to implement weighted regressions.
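I can’t vouch for exactly what that section shows, but one pattern often seen for weighted likelihoods in Stan is to multiply each observation’s log density by its weight. A minimal sketch, with names and priors as assumptions:

```stan
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
  vector<lower=0>[N] w;  // externally supplied weights, treated as fixed
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  // illustrative priors (an assumption)
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ exponential(1);
  // each observation's log-likelihood contribution is scaled by its weight
  for (n in 1:N) {
    target += w[n] * normal_lpdf(y[n] | alpha + beta * x[n], sigma);
  }
}
```

This is the plug-in approach criticised earlier in the thread: the weights are fixed inputs, so the result is a pseudo-posterior whose width depends on the scale of the weights. It also differs from scaling the standard deviation by 1 / sqrt(w[n]), because here the -log(sigma) term is multiplied by w[n] as well; the two formulations give the same posterior only when the weights sum to N.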

Thank you.