How to pass multiple columns of weights to brms

#1

I would appreciate help specifying the brms model below so that I can pass multiple columns of weights to it, as illustrated in the Stan code further down.

I need to do this in brms or rstanarm rather than in Stan directly because I want to use functions from https://github.com/mjskay/tidybayes that do not currently support stanfit objects.

#sample data:

dt = read.table(header = TRUE, text = "
n r r/n group treat c2 c1 weights
62 3 0.048387097 1 0 0.1438 1.941115288 1.941115288
96 1 0.010416667 1 0 0.237 1.186583128 1.186583128
17 0 0 0 0 0.2774 1.159882668 3.159882668
41 2 0.048780488 1 0 0.2774 1.159882668 3.159882668
212 170 0.801886792 0 0 0.2093 1.133397521 1.133397521
143 21 0.146853147 1 1 0.1206 1.128993008 1.128993008
143 0 0 1 1 0.1707 1.128993008 2.128993008
143 33 0.230769231 0 1 0.0699 1.128993008 1.128993008
73 62 0.849315068 0 1 0.1351 1.121927228 1.121927228
73 17 0.232876712 0 1 0.1206 1.121927228 1.121927228")

N <- nrow(dt)
n <- dt$n
r <- dt$r
p <- dt$r/n
group <- dt$group
treat <- dt$treat
c1 <- dt$c1
c2 <- dt$c2
w_1 <- dt$weights
w_2 <- dt$weights - 0.01
w_3 <- dt$weights + 0.01/2
w_4 <- dt$weights - 0.01/3
w_5 <- dt$weights + 0.01/4
w_6 <- dt$weights - 0.01/5
w_7 <- dt$weights + 0.01/6
w_8 <- dt$weights + 0.01/7
w_9 <- dt$weights + 0.01/8
w_10 <- dt$weights + 0.01/9

list_bind <- list (N = N, 
          n = n, r = r, p = p, group = group, treat = treat, c1 = c1, c2 = c2,
          weights = cbind(w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8, w_9, w_10)
          )

dt_bind <- as.data.frame(list_bind)

#my attempt:

library(brms)

m <- brm(r | trials(n) + weights(weights.w_1:weights.w_10) ~ treat * c2 + (1 | group),
         data = dt_bind, family = binomial(link = logit))

#stan code:

//this is what I want the brms model specification to be able to do

data { 
...
real<lower=0> weights[N, 10];  // data block of model weights 
} 

model { 
...
  // likelihood
  for (n in 1:N) {
    for (w in 1:10) {
      target += weights[n, w] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    }
  }
}

This question has also been posted here. Thanks in advance for any help.

#2

Please avoid opening multiple threads for the same issue just because you don’t immediately receive an answer. Neither brms nor rstanarm supports multi-column weights, and I don’t really see the purpose of that.

#3

I don’t understand why you are doing this, but unless I’m misunderstanding completely you could perhaps flatten your dataframe into something with n*w rows and a single weights column.
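
A minimal base-R sketch of that flattening, for illustration only. The toy data frame `wide` and the weight matrix `w_mat` are made-up stand-ins for the real data, not values from this thread:

```r
# Toy stand-in for the real data: N observations, W weight columns.
N <- 3
W <- 4
wide <- data.frame(
  id = seq_len(N),
  n  = c(62, 96, 17),
  r  = c(3, 1, 0)
)
w_mat <- matrix(seq_len(N * W) / 10, nrow = N, ncol = W)  # hypothetical weights

# Repeat every observation once per weight column (N * W rows in total)
# and stack the weight matrix column by column into a single vector.
long <- wide[rep(seq_len(N), times = W), ]
long$draw   <- rep(seq_len(W), each = N)
long$weight <- as.vector(w_mat)
rownames(long) <- NULL

print(long)
```

Each row of `long` then carries one observation together with exactly one of its weights, so the weights live in a single column.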

#4

Thanks, @mjskay, for this very helpful insight. I am doing this to account for Uncertainty in the Design Stage.

If I understand your explanation correctly, the solution to my problem would be to change the structure of the weights from wide to long format like this:

library(tidyr)

#convert to long format
dt_bind <- tibble::rowid_to_column(dt_bind, "id")
dt_bind$id <- factor(dt_bind$id)
dt_long <- gather(dt_bind, draw, weight, weights.w_1:weights.w_10, factor_key=TRUE)

head(dt_long, 12)
id  N   n   r          p group treat       c1     c2        draw   weight
 1 10  62   3 0.04838710     1     0 1.941115 0.1438 weights.w_1 1.941115
 2 10  96   1 0.01041667     1     0 1.186583 0.2370 weights.w_1 1.186583
 3 10  17   0 0.00000000     0     0 1.159883 0.2774 weights.w_1 3.159883
 4 10  41   2 0.04878049     1     0 1.159883 0.2774 weights.w_1 3.159883
 5 10 212 170 0.80188679     0     0 1.133398 0.2093 weights.w_1 1.133398
 6 10 143  21 0.14685315     1     1 1.128993 0.1206 weights.w_1 1.128993
 7 10 143   0 0.00000000     1     1 1.128993 0.1707 weights.w_1 2.128993
 8 10 143  33 0.23076923     0     1 1.128993 0.0699 weights.w_1 1.128993
 9 10  73  62 0.84931507     0     1 1.121927 0.1351 weights.w_1 1.121927
10 10  73  17 0.23287671     0     1 1.121927 0.1206 weights.w_1 1.121927
 1 10  62   3 0.04838710     1     0 1.941115 0.1438 weights.w_2 1.931115
 2 10  96   1 0.01041667     1     0 1.186583 0.2370 weights.w_2 1.176583

Is that right?

Assuming I am getting it right, how would I then account in my brms or rstanarm model for the fact that each id has a distribution of weights represented by the variable draw, rather than a single weight?

Thanks in advance.

#5

Doesn’t that approach (unless I’ve missed something) turn something like this:

model { 
  ...
  // likelihood 
  for (n in 1:N) {
    for (w in 1:W) {
      target += weights[n, w] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    }
  }
}

Into the equivalent:

model { 
  ...
  // likelihood 
  // Y, trials and mu are the long-format versions here (length N * W)
  for (k in 1:(N * W)) {
    target += weights_prime[k] * binomial_logit_lpmf(Y[k] | trials[k], mu[k]);
  }
}

Where (say) weights[n,w] = weights_prime[N*(w - 1) + n]. So if the first model is doing what you wanted the second should also be doing what you want?
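
As a quick sanity check of that equivalence, here is a small R sketch that evaluates both forms of the weighted log-likelihood outside Stan. The values of `y`, `trials`, `mu` and `weights` are hypothetical and chosen only for illustration:

```r
N <- 3
W <- 4
y      <- c(3, 1, 0)            # successes per observation
trials <- c(62, 96, 17)         # trials per observation
mu     <- c(-2.5, -4.0, -3.0)   # hypothetical linear predictor on the logit scale
weights <- matrix(
  c(1.94, 1.19, 3.16,
    1.93, 1.18, 3.15,
    1.945, 1.195, 3.165,
    1.937, 1.187, 3.157),
  nrow = N, ncol = W
)                               # hypothetical weights, one column per draw

# Per-observation binomial log-likelihood, same role as binomial_logit_lpmf
loglik_obs <- dbinom(y, size = trials, prob = plogis(mu), log = TRUE)

# Wide form: double loop over observations and weight columns
target_wide <- 0
for (n in seq_len(N)) {
  for (w in seq_len(W)) {
    target_wide <- target_wide + weights[n, w] * loglik_obs[n]
  }
}

# Long form: a single weight vector with weights_prime[N * (w - 1) + n]
weights_prime <- as.vector(weights)        # column-major stacking
obs_index     <- rep(seq_len(N), times = W)  # observation behind each long row
target_long   <- sum(weights_prime * loglik_obs[obs_index])

stopifnot(isTRUE(all.equal(target_wide, target_long)))
```

Both forms also equal `sum(rowSums(weights) * loglik_obs)`, since each observation's log-likelihood is simply multiplied by the sum of its weights.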

I’m not sure where to go from there; it sounds like the folks in that other thread have a better idea of what to do with a model like this.

#6

Thanks, @mjskay. This is a very smart way to address the problem.

Still, because brms and rstanarm accept only a single column of weights, only one weights column can be declared in the data block. I could then use my updated Stan model via the update command, as suggested by @Guido_Biele there.

In my new data frame dt_long, the former values of the columns weights.w_1 to weights.w_10, which represent the distribution of weights per observation, are now linked by the same value of the variable id.

That needs to be accounted for in the data block:

data { ...
  real<lower=0> weights[N, 10]; // data block of model weights in wide format
}

where real<lower=0> weights[N, 10] needs to be converted into real<lower=0> weights_prime[k].

This k could be a new variable defined in dt_long that indexes the values of the variable weight for each value of the variable id. I am not certain how to define that variable; the transformed data block is probably the right place to do this.

Thanks in advance for any help.

#7

Ah, the point I was trying to make is that weights_prime is the same as the single column of weights you created by making the data frame into a long format. So I would have thought that already solved your problem as far as the model specification with brms or rstanarm is concerned (but not the post-processing part as I understand it).
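
In case it helps, a sketch of what the model specification might then look like. It assumes brms is installed and that the long-format data frame has the columns shown earlier in the thread (`r`, `n`, `weight`, `treat`, `c2`, `group`); the wrapper `fit_long_weights()` is a hypothetical helper, not part of brms:

```r
# Same model as in the original attempt, but with the single long-format
# weight column passed to weights().
model_formula <- r | trials(n) + weights(weight) ~ treat * c2 + (1 | group)

# Hypothetical wrapper; nothing is fitted until it is called.
fit_long_weights <- function(data) {
  brms::brm(
    formula = model_formula,
    data    = data,
    family  = binomial(link = "logit")
  )
}

# Usage (compiles and samples, so it can take a while):
# m_long <- fit_long_weights(dt_long)
```

This only covers the model specification; it does not address the post-processing question.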

#8

@mjskay, your solution does indeed address my problem as far as passing multiple columns of weights to brms or rstanarm is concerned. Thank you very much for this very smart way of solving the problem.

Now I just need to do the next step:

Thanks for this.
