Bayesian parallels of weighted regression

Just wanted to quickly add: weighting the target by an integer w[n] (what you call "target hacking", and also how weights are implemented in brms) is exactly the same as putting that datapoint into the data w[n] times. I think this intuition generalizes to any non-negative real w[n], so if you reduce the weight on an old observation to 0.5, you are treating it as "half a data point". I can imagine you could find a theoretical justification for your weighting scheme with that in mind.
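For concreteness, here is a minimal sketch of that equivalence in brms (toy data; the names `d`, `fit_weighted`, `fit_replicated` are made up for illustration). In raw Stan, the weighted version is the usual `target += w[n] * normal_lpdf(...)` trick:

```r
library(brms)

# toy data with integer weights w[n]
set.seed(1)
d <- data.frame(y = rnorm(20), x = rnorm(20), w = sample(1:3, 20, replace = TRUE))

# weighted fit: brms multiplies each observation's log-likelihood by w[n]
fit_weighted <- brm(y | weights(w) ~ x, data = d)

# "replicated" fit: physically repeat row n w[n] times
d_rep <- d[rep(seq_len(nrow(d)), d$w), ]
fit_replicated <- brm(y ~ x, data = d_rep)

# up to MCMC noise, the two posteriors should agree
summary(fit_weighted)
summary(fit_replicated)
```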

I agree with @bgoodri that a fully generative approach with parameters changing over time should be your first consideration, but I have sympathy for @8one6's concerns about model complexity and runtime (though in my experience, splines can be very efficient). I don't think weights are necessarily a terrible solution. I would actually guess that some forms of weights increasing over time could yield the same inferences about the parameters at the latest time point as some time-series structure on the parameters with a fixed "flexibility" (though this is just a conjecture and I don't pretend to possess the math-fu necessary to even begin to tackle the question).
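To make "weights increasing over time" concrete, one plausible scheme is exponential decay, where the decay rate plays a role loosely analogous to the "flexibility" of a time-series model. This is just one arbitrary choice, not a recommendation, and it assumes your data frame `d` has a `time` column (larger = more recent):

```r
lambda <- 0.1  # hypothetical decay rate; half-life = log(2) / lambda time units
d$w <- exp(-lambda * (max(d$time) - d$time))  # newest observations get weight 1
```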

Practically, going forward, I see two options. You can try brms, which has a ton of features for modelling splines/GPs/time series and would reduce the coding burden substantially. Alternatively, you may want to stick with weights, but be aware that this is tricky territory and extra caution is warranted: at the very least, I would do some leave-future-out cross-validation and check how the inferences change under multiple plausible weighting schemes.
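A rough sketch of both routes, again assuming a data frame `d` with columns `y`, `x`, and `time` (all names hypothetical):

```r
library(brms)

# route 1, generative: a smooth time-varying baseline via a spline (mgcv's s() syntax)
fit_spline <- brm(y ~ s(time) + x, data = d)

# route 2, weights: refit under several plausible decay rates and compare
for (lambda in c(0.05, 0.1, 0.2)) {
  d$w <- exp(-lambda * (max(d$time) - d$time))
  fit_w <- brm(y | weights(w) ~ x, data = d)
  print(fixef(fit_w))  # if the coefficients shift a lot between schemes, be worried
}
```

For the cross-validation part, I believe the loo package has a vignette on approximate leave-future-out cross-validation that would be a good starting point.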

Best of luck with your model!
