Weights in brm


#1

Dear Stan community,

I am using the weights option in the brm() function to account for different variances across field sites in a negative binomial generalized linear mixed-effects model. To my understanding, using weights this way approximates the idea of the variance function varIdent() from the R package nlme (see Galecki & Burzykowski 2013, pages 129-130, paragraph 7.3.2, and page 135, paragraph 7.4.3). After inspecting the Stan code generated for the brm model, I am still trying to work out how the weights enter the posterior. The brms call in R looks as follows:


b.mod <- brm(y | weights(w) ~ ...)

On the one hand, the Stan code generated for the brm model shows that the posteriors are simply multiplied by the weights:


vector[N] weights;  // model weights
target += weights[n] * neg_binomial_2_log_lpmf(Y[n] | mu[n], shape);

On the other hand, other R packages seem to implement weights differently. For example, the gls() function from the nlme package appears to use the multiplicative inverse of the weights internally, while taking "raw" weights as input and showing them in the summary output. The lm() function, in turn, implements weights differently from gls(). So the way weights are implemented can differ from R package to R package, which left me confused about how weights are implemented in brms (or rather in Stan, since brm relies on Stan).
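
To illustrate what I mean, here is a small toy example of my own (assuming I read the nlme documentation correctly): lm() seems to treat weights as precision weights, i.e. Var(e_i) = sigma^2 / w_i, while gls() with varFixed() takes a variance covariate, i.e. Var(e_i) = sigma^2 * v_i, so the two conventions are reciprocals of each other:

library(nlme)

set.seed(1)
d <- data.frame(x = rnorm(100), v = rep(c(1, 4), 50))     # v = known variance multiplier
d$y <- 2 + 3 * d$x + rnorm(100, sd = sqrt(d$v))

fit_lm  <- lm(y ~ x, data = d, weights = 1 / v)           # precision (inverse-variance) weights
fit_gls <- gls(y ~ x, data = d, weights = varFixed(~ v))  # variance covariate itself

coef(fit_lm)
coef(fit_gls)   # both should give the same coefficient estimates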

Does anybody know how brm() uses weights? Does it internally take the multiplicative inverse, which would imply that the user should supply "raw" weights (as with the weights argument in gls())? Or are the weights not "transformed" at all, in which case the user should compute the multiplicative inverse before coding the model? Many thanks for your time and response.

  • Operating System: ArchLinux, Linux Kernel 4.16.8-1
  • brms Version: 2.3.0

#2

It is not the “posteriors” that are weighted but the likelihood contributions of each observation.

brms takes the weights literally, which means that an observation with weight 2 receives twice the weight of an observation with weight 1. It also means that using a weight of 2 is equivalent to adding the corresponding observation to the data frame twice.
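
For example (a toy numerical check of my own, not brms-specific code): multiplying a log-likelihood term by 2 gives exactly the same contribution as entering the observation twice.

y <- 4; mu <- 3; sigma <- 1; w <- 2

w * dnorm(y, mean = mu, sd = sigma, log = TRUE)         # weighted log-likelihood contribution
sum(dnorm(c(y, y), mean = mu, sd = sigma, log = TRUE))  # the same observation entered twice
# both print -2.837877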


#3

In addition to Paul’s reply: if you want varIdent()-like behaviour, you can model the precision or variance parameters using the distributional regression approach. The first example in that vignette is:

fit1 <- brm(bf(symptom_post ~ group, sigma ~ group), 
            data = dat1, family = gaussian())

This should be similar to the following (if my understanding is correct…):

library(nlme)
fit_gls <- gls(symptom_post ~ group, weights = varIdent(form = ~1|group), 
              data = dat1)

#4

Dear Paul, Dear Hans,
many thanks for your replies and hints.

@paul.buerkner

It is not the “posteriors” that are weighted but the likelihood contributions of each observation.

Shame on me. The Stan code I quoted in my initial post clearly shows what you are saying.

@hansvancalster

Your comment came at the right moment, because I had started to fiddle around with the Stan code to account for group-wise differences in variance, following comments on the Stan user and development mailing lists, i.e. something like


neg_binomial_2_log_lpmf(Y[n] | mu[weight[n]], shape);

The link and code you provided in your post directed me towards the (hopefully) right path. At the end of Paul’s vignette that you linked to, there is an example of additive distributional models using multilevel data. Since my data are well described by a negative binomial distribution, I had to replace sigma with shape to make the model run (see also Paul’s description of brmsformula, lines 19-25).
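
For reference, a sketch of what that model might look like (hypothetical variable names y, x, site and data frame mydata, not my actual data):

library(brms)

fit_nb <- brm(
  bf(y ~ x + (1 | site),   # mean part of the negative binomial model
     shape ~ site),        # group-specific shape instead of sigma
  data = mydata, family = negbinomial()
)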


#5

Just out of curiosity and as a point of general discussion:

Coming from a field (genomics/biology) where weights are abused, I always thought that the beauty of Bayesian inference is that "weights" are a piece of information that should be sought from the data, with proper modelling.

How is this ad hoc imposition of the information content of a data point compatible with Bayesian inference?

Could you provide an example where this procedure is needed/better than pure Bayesian inference?

Thanks


#6

I haven’t looked at the particulars of the model in this thread, but as a general comment: yes, weighting can be incompatible with the idea of a generative model, which is a large part of why many Bayesians object to it.