Analysis of two binomials

brophyj · May 8, 2024, 7:59pm

I have a RCT that was designed from a Bayesian perspective that I now want want to analyze. First with default vague prior which seems to work OK as the estimates correspond to 7/450 and 15/550

death_TC4 <- data.frame(N = c(450,550), y = c(7,15), grp2 = as.factor(c(0,1))) 
f = bf(y | trials(N) ~ 0 + grp2)
# vague prior
m1 <- brm(
  formula = f,
  data = death_TC4,
  family = binomial(link = "identity"),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m1)

Now I want to include some prior information from another study about the efficacy of each drug

m2 <- brm(
  formula = f,
  data = death_TC4,
  family = binomial(link = "identity"),
  prior = c(prior(beta(35,880), class = b, coef = "grp20"),
            prior(beta(29,870),class = b, coef = "grp21")),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m2)

This also seems a reasonable result. Finally I want to include a lot of prior information from a meta-analysis (yes, i realize that my study is a drop in the bucket and doesn’t change anything from the meta-analysis but bear with me). The meta-analysis shows decreased deaths with drug coded as 0 (320/1599) compared to drug 1 (125/480). However when I run the code the results seem completely strange to me.

m3<- brm(
  formula = bf(y | trials(N) ~ 0 + grp2),
  data = death_TC4,
  family = binomial(link = "identity"),
  prior = c(prior(beta(320,1279), class = b, coef = "grp20"),
            prior(beta(125,355),class = b, coef = "grp21")),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m3)

Can someone explain to me what is going?

Solomon · May 9, 2024, 7:47pm

Your results are correct. When I ran m3 on my computer, the print() output was:

 Family: binomial 
  Links: mu = identity 
Formula: y | trials(N) ~ 0 + grp2 
   Data: death_TC4 (Number of observations: 2) 
  Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
         total post-warmup draws = 16000

Regression Coefficients:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
grp20     0.16      0.01     0.14     0.18 1.00    14910    10378
grp21     0.14      0.01     0.12     0.16 1.00    11883     9665

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Given binomial data, use of the identity link, and your beta priors, the posterior can be analytically defined as

\operatorname{beta}(z + a, N - z + b),

where z and N are your data, and a and b are the parameters from your prior. Thus, the analytic solution to your third model is

\text{Group 0}: \operatorname{beta}(7 + 320, 450 - 7 + 1279) = \operatorname{beta}(327, 1722), \\ \text{Group 1}: \operatorname{beta}(15 + 125, 550 - 15 + 355) = \operatorname{beta}(140, 890).

The mean of the beta distribution is a / (a + b). Here are the analytic means of your two posterior distributions.

327 / (327 + 1722)
140 / (140 + 890)

[1] 0.15959
[1] 0.1359223

Those values match up very nicely with the posterior means listed in the Estimate column of the print() output.

You can find the backgroud for this analytic approach in Chapter 6 of Kruschke’s text.

brophyj · May 9, 2024, 8:20pm

Thanks.
I had done this sort of checking on m1 and m2 but didn’t think about doing it on m3 since both the drug 0 prior risk (320/1599 =.2) and likelihood (7/450 = .01) are both lower than the corresponding values for drug 1, prior (125/480 = .26) and likelihood (15/550 =.027) , I found it hard to conceptualize why the posterior point estimates show a higher risk with drug 0.
Any thoughts? Some sort of Simpson’s paradox?

Solomon · May 9, 2024, 8:53pm

I don’t know that it’s an issue of Simpson’s Paradox. But since I’ve already brought up Chapter 6 of Kruschke’s text, I recommend making some versions of his Figure 6.3 by hand, but with your data examples instead. This should help clear things up. To get you started, here’s a quick version of Group 0 for m2.

# Load the packages
library(tidyverse)
library(patchwork)

# Make a function for the Bernoulli likelihood
dbern <- function(x, z, n) {
  return(x^z * (1 - x)^(n - z))
}

# Make the primary data frame for the plot
my_data <- data.frame(theta = seq(from = 0, to = 1, by = 0.001)) %>% 
  mutate(prior = dbeta(x = theta, shape1 = 35, shape2 = 880),
         likelihood = dbern(x = theta, z = 7, n = 450),
         posterior = dbeta(x = theta, shape1 = 7 + 35, shape2 = 450 - 7 + 880))

# Make/save the prior panel
p1 <- my_data %>% 
  ggplot(aes(x = theta, y = prior)) +
  geom_area() +
  scale_x_continuous(NULL, labels = NULL) +
  ylab(expression(italic(p)(theta))) +
  facet_wrap(~ "Prior")

# Make/save the likelihood panel
p2 <- my_data %>% 
  ggplot(aes(x = theta, y = likelihood)) +
  geom_area() +
  scale_x_continuous(NULL, labels = NULL) +
  scale_y_continuous(expression(italic(p)(D*'|'*theta)), breaks = 0:2 / 1e16) +
  facet_wrap(~ "Likelihood")

# Make/save the posterior panel
p3 <- my_data %>% 
  ggplot(aes(x = theta, y = posterior)) +
  geom_area() +
  scale_x_continuous(expression(theta)) +
  ylab(expression(italic(p)(theta*'|'*D))) +
  facet_wrap(~ "Posterior")

# Combine and entitle the panels with {patchwork} syntax
p1 / p2 / p3 +
  plot_annotation(title = "Group 0, model m2")

Solomon · May 9, 2024, 9:05pm

If it’s too hard to see what’s going on the plot because all the shapes are so scrunched over to the left of the \theta space, try something like this instead:

(p1 / p2 / p3) &
  coord_cartesian(xlim = c(0, 0.2)) &
  plot_annotation(title = "Group 0, model m2")

brophyj · May 9, 2024, 9:12pm

I will pursue this along the lines you suggest using my data, some of your code, and a re-read of Chapter 6.
Thanks again for your time.

Topic		Replies	Views
Brms iformative prior for bionomial regression brms	21	1249	October 8, 2024
Help with understanding of bayesian inference using GLM and BRMS brms	3	1404	November 13, 2018
Could you please check if I specify the model by brms correctly? brms	6	664	October 15, 2019
Selecting priors for a binary categorical predictor in test of proportions in brms Modeling prior-choice , posterior-predictive , priors , binomial-response , brms	6	2255	March 22, 2021
Post-hoc test brms brms	5	2848	May 21, 2022

Analysis of two binomials

Related topics