How to update a brmsfit object with a modified brms-generated Stan model, marginalizing over a distribution of weights

I would appreciate any help with updating my brmsfit object with a modified brms-generated Stan model. I want to pass several columns of weights to the likelihood, which brms probably does not support yet.

My objective is to obtain the posterior distribution of effects after marginalizing over the distribution of weights as discussed here.

I need to do this in brms or rstanarm rather than in Stan directly, because I want to use functions of tidybayes (GitHub - mjskay/tidybayes: Bayesian analysis + tidy data + geoms (R package)) that are currently not supported for a stanfit object.

The idea is probably similar to the one discussed at Creating a brmsfit object with a modified brms-generated Stan model, but I am not certain how to do that in practice, so any help would be much appreciated.

Additionally, I do not understand how to implement in practice the second and third lines of this explanation:

# the brms model to be modified and updated

dt = read.table(header = TRUE, text = "
n r r/n group treat c2 c1 weights
62 3 0.048387097 1 0 0.1438 1.941115288 1.941115288
96 1 0.010416667 1 0 0.237 1.186583128 1.186583128
17 0 0 0 0 0.2774 1.159882668 3.159882668
41 2 0.048780488 1 0 0.2774 1.159882668 3.159882668
212 170 0.801886792 0 0 0.2093 1.133397521 1.133397521
143 21 0.146853147 1 1 0.1206 1.128993008 1.128993008
143 0 0 1 1 0.1707 1.128993008 2.128993008
143 33 0.230769231 0 1 0.0699 1.128993008 1.128993008
73 62 1.260273973 0 1 0.1351 1.121927228 1.121927228
73 17 0.232876712 0 1 0.1206 1.121927228 1.121927228")

m <- brm(r | trials(n) + weights(weights) ~ treat * c2 + (1 | group),
         data = dt, family = binomial(link = logit))

# the modified brms-generated Stan model, modified_brms.stan

functions { 
} 
data { 
  int<lower=1> N;  // total number of observations 
  int Y[N];  // response variable 
  int trials[N];  // number of trials 
  real<lower=0> weights[N, 10];  // model weights 
  int<lower=1> K;  // number of population-level effects 
  matrix[N, K] X;  // population-level design matrix 
  // data for group-level effects of ID 1
  int<lower=1> J_1[N];
  int<lower=1> N_1;
  int<lower=1> M_1;
  vector[N] Z_1_1;
  int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
  int Kc = K - 1; 
  matrix[N, K - 1] Xc;  // centered version of X 
  vector[K - 1] means_X;  // column means of X before centering 
  for (i in 2:K) { 
    means_X[i - 1] = mean(X[, i]); 
    Xc[, i - 1] = X[, i] - means_X[i - 1]; 
  } 
} 
parameters { 
  vector[Kc] b;  // population-level effects 
  real temp_Intercept;  // temporary intercept 
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  vector[N_1] z_1[M_1];  // unscaled group-level effects
} 
transformed parameters { 
  // group-level effects 
  vector[N_1] r_1_1 = sd_1[1] * (z_1[1]);
} 
model { 
  vector[N] mu = temp_Intercept + Xc * b;
  for (n in 1:N) { 
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
  } 
  // priors including all constants 
  target += student_t_lpdf(temp_Intercept | 3, 0, 10); 
  target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(z_1[1] | 0, 1);
  // likelihood including all constants 
  if (!prior_only) { 
    for (n in 1:N) {
      for (w in 1:10) {
        target += weights[n, w] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
      }
    }
  } 
} 
generated quantities { 
  // actual population-level intercept 
  real b_Intercept = temp_Intercept - dot_product(means_X, b); 
}

# I am able to do this successfully, but I am not certain how to proceed from here:

mod_modified_brms_stan <- stan_model("modified_brms.stan")
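
For reference, this is the kind of workflow I imagine for the remaining steps, though I am not sure it is correct: weights_matrix below is just a placeholder for my N x 10 matrix of weight columns, and the rename_pars() step at the end is a guess on my part.

library(brms)
library(rstan)

# data in the format the brms-generated model expects
standata <- make_standata(r | trials(n) + weights(weights) ~ treat * c2 + (1 | group),
                          data = dt, family = binomial(link = logit))
standata$weights <- weights_matrix  # placeholder: N x 10 matrix of weights

# sample from the modified model compiled above
stanfit <- sampling(mod_modified_brms_stan, data = standata)

# create an empty brmsfit and insert the externally fitted stanfit into it
fit <- brm(r | trials(n) + weights(weights) ~ treat * c2 + (1 | group),
           data = dt, family = binomial(link = logit), empty = TRUE)
fit$fit <- stanfit
fit <- brms:::rename_pars(fit)  # internal in older brms versions; exported later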

This question has also been posted here. Thanks in advance for any help.

What is the purpose of such multi-column weights? At first glance, I don't see how this could be sensible.

Thanks, @paul.buerkner. The purpose is to implement the concepts discussed here with brms. Essentially, it is to account for Uncertainty in the Design Stage.

From looking at the stan code it seems you can just sum over the columns per row to obtain the total weight of this observation, as the column index “w” of the weights does not appear in the LPDF/LPMF contribution of the model. If that’s true, you really need just a basic vector of weights.
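
As a quick numerical illustration of that point (the objects below are made up for the demonstration and are not part of your model):

set.seed(1)
weights_mat <- matrix(runif(5 * 10), nrow = 5)  # 5 observations, 10 weight columns
ll <- rnorm(5)                                  # stand-ins for the per-observation lpmf values

# weighting the same lpmf value once per column and summing ...
loop_version <- rowSums(weights_mat * ll)
# ... equals weighting it once by the row sum of the weights
sum_version <- rowSums(weights_mat) * ll
all.equal(loop_version, sum_version)  # TRUE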

I am not certain that I am understanding your explanation correctly.

With this:

data {
  ...
  real<lower=0> weights[N, 10];  // data block of model weights
}

model {
  ...
  // likelihood
  for (n in 1:N) {
    for (w in 1:10) {
      target += weights[n, w] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    }
  }
}

I am trying to simplify this (which is in fact what I want to do):

data {
  ...
  real<lower=0> w_1[N];
  real<lower=0> w_2[N];
  real<lower=0> w_3[N];
  real<lower=0> w_4[N];
  real<lower=0> w_5[N];
  real<lower=0> w_6[N];
  real<lower=0> w_7[N];
  real<lower=0> w_8[N];
  real<lower=0> w_9[N];
  real<lower=0> w_10[N];
}

model {
  ...
  // likelihood
  for (n in 1:N) {
    target += w_1[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_2[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_3[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_4[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_5[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_6[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_7[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_8[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_9[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
    target += w_10[n] * binomial_logit_lpmf(Y[n] | trials[n], mu[n]);
  }
}

So, if that is correct and I am understanding your explanation right, what I want can be accomplished in brms by doing this:

N <- nrow(dt)
n <- dt$n
r <- dt$r
p <- dt$r/n
group <- dt$group
treat <- dt$treat
c1 <- dt$c1
c2 <- dt$c2
w_1 <- dt$weights
w_2 <- dt$weights - 0.01
w_3 <- dt$weights + 0.01/2
w_4 <- dt$weights - 0.01/3
w_5 <- dt$weights + 0.01/4
w_6 <- dt$weights - 0.01/5
w_7 <- dt$weights + 0.01/6
w_8 <- dt$weights + 0.01/7
w_9 <- dt$weights + 0.01/8
w_10 <- dt$weights + 0.01/9

w_sum <- w_1 + w_2 + w_3 + w_4 + w_5 + w_6 + w_7 + w_8 + w_9 + w_10

then,

list_sum <- list(N = N,
                 n = n, r = r, p = p, group = group, treat = treat, c1 = c1, c2 = c2,
                 weights = w_sum)

dt_bind <- as.data.frame(list_sum)

m <- brm(r | trials(n) + weights(weights) ~ treat * c2 + (1 | group),
         data = dt_bind, family = binomial(link = logit))

Is that right?

Thanks for this.

I agree with Paul and would add that, because the sum is just a rescaled average when the number of weight samples is constant, summing the weights does not fulfill the purpose of incorporating uncertainty about the weights into the estimation of the outcome model (which is what Liao and Zigler are describing).
To be honest, I don't follow all the details in the paper*, but my intuition is that a "brute force" approach to incorporating uncertainty about the weights into the estimation of the outcome model would be to

  • use the posterior (predictive) distribution of your selection model to generate K sets of weights
  • use the K sets of weights to fit the (Bayesian) outcome model
  • concatenate the K posterior samples to obtain a final posterior sample that reflects the uncertainty of both the selection and outcome models (see the rough sketch after this list).
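
In R with brms, this could look roughly like the sketch below (untested; weight_draws stands for a hypothetical N x K matrix with one generated set of weights per column):

library(brms)

K <- ncol(weight_draws)
fits <- vector("list", K)

# step 2: fit the outcome model once per set of weights
# (compile once, then refit via update() to avoid recompilation)
dt$weights <- weight_draws[, 1]
fits[[1]] <- brm(r | trials(n) + weights(weights) ~ treat * c2 + (1 | group),
                 data = dt, family = binomial(link = logit), chains = 1)
for (k in 2:K) {
  dt$weights <- weight_draws[, k]
  fits[[k]] <- update(fits[[1]], newdata = dt)
}

# step 3: concatenate the K posteriors; combine_models() stacks the chains of
# brmsfit objects that share the same model (check_data = FALSE because the
# weights column differs between fits)
fit_marginal <- do.call(combine_models, c(fits, list(check_data = FALSE)))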

Depending on how large K is and/or how much data you have, this will take a lot of time. In fact, for the models I am working with, this approach is not feasible. So I incorporated uncertainty about the weights by hacking Stan to use a new set of weights (generated as described above) at each new iteration. The models converge (no divergent transitions and Rhat < 1.1), but there remain small problems with the Bayesian Fraction of Missing Information.

I’ll contact the authors of the paper you mention to get their opinion on the procedure I used.

  * I was working off this paper: Zigler, C. M. (2016). The Central Role of Bayes' Theorem for Joint Estimation of Causal Effects and Propensity Scores. The American Statistician, 70(1), 47–54.

PS: It might be that all you were missing to use your updated Stan model was the update command. Is this possible?

Thanks, @Guido_Biele, for this very detailed explanation. This is great and truly helpful.

How would steps 2 and 3 in your explanation quoted below be implemented, for example in my case above, assuming that w_sum represents the K sets of weights generated using the posterior predictive model?

Additionally, how could I do this in my example above?

Stan facilitates step 1 via brms and rstanarm, so all I need is help with steps 2 and 3 and the latter quote, I think. The authors have provided R code for their implementation based on MCMCpack.

Thanks in advance.

A related discussion is available at https://stackoverflow.com/questions/54233728/how-to-fit-and-combine-submodels-into-a-single-stanfit-object.

Because of The Fundamental Incompatibility of Hamiltonian Monte Carlo and Data Subsampling, @Bob_Carpenter suggested parallelizing the likelihood calculation, which is what I have been attempting in my code above to account for uncertainty in the design stage.

Thanks in advance for any help.

@Guido_Biele and @realkrantz, did either of you get a working brms implementation of this Liao and Zigler approach (particularly Guido's hack of using new weights for every iteration)? I'm trying to implement this now and would love to see a working example. Thanks!

I did this in basic Stan and used an ugly hack, where a custom C++ function read weights from a file during sampling. See here: GitHub - gbiele/IPW
However, I would now discourage using this approach!
