Analysis of two binomials

I have a RCT that was designed from a Bayesian perspective that I now want want to analyze. First with default vague prior which seems to work OK as the estimates correspond to 7/450 and 15/550

death_TC4 <- data.frame(N = c(450,550), y = c(7,15), grp2 = as.factor(c(0,1))) 
f = bf(y | trials(N) ~ 0 + grp2)
# vague prior
m1 <- brm(
  formula = f,
  data = death_TC4,
  family = binomial(link = "identity"),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m1)

Now I want to include some prior information from another study about the efficacy of each drug

m2 <- brm(
  formula = f,
  data = death_TC4,
  family = binomial(link = "identity"),
  prior = c(prior(beta(35,880), class = b, coef = "grp20"),
            prior(beta(29,870),class = b, coef = "grp21")),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m2)

This also seems a reasonable result. Finally I want to include a lot of prior information from a meta-analysis (yes, i realize that my study is a drop in the bucket and doesn’t change anything from the meta-analysis but bear with me). The meta-analysis shows decreased deaths with drug coded as 0 (320/1599) compared to drug 1 (125/480). However when I run the code the results seem completely strange to me.

m3<- brm(
  formula = bf(y | trials(N) ~ 0 + grp2),
  data = death_TC4,
  family = binomial(link = "identity"),
  prior = c(prior(beta(320,1279), class = b, coef = "grp20"),
            prior(beta(125,355),class = b, coef = "grp21")),
  chains = 4, warmup = 1000, iter = 5000, seed = 123,
  refresh = 0
)
print(m3)

Can someone explain to me what is going?

Your results are correct. When I ran m3 on my computer, the print() output was:

 Family: binomial 
  Links: mu = identity 
Formula: y | trials(N) ~ 0 + grp2 
   Data: death_TC4 (Number of observations: 2) 
  Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
         total post-warmup draws = 16000

Regression Coefficients:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
grp20     0.16      0.01     0.14     0.18 1.00    14910    10378
grp21     0.14      0.01     0.12     0.16 1.00    11883     9665

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Given binomial data, use of the identity link, and your beta priors, the posterior can be analytically defined as

\operatorname{beta}(z + a, N - z + b),

where z and N are your data, and a and b are the parameters from your prior. Thus, the analytic solution to your third model is

\text{Group 0}: \operatorname{beta}(7 + 320, 450 - 7 + 1279) = \operatorname{beta}(327, 1722), \\ \text{Group 1}: \operatorname{beta}(15 + 125, 550 - 15 + 355) = \operatorname{beta}(140, 890).

The mean of the beta distribution is a / (a + b). Here are the analytic means of your two posterior distributions.

327 / (327 + 1722)
140 / (140 + 890)
[1] 0.15959
[1] 0.1359223

Those values match up very nicely with the posterior means listed in the Estimate column of the print() output.

You can find the backgroud for this analytic approach in Chapter 6 of Kruschke’s text.

1 Like

Thanks.
I had done this sort of checking on m1 and m2 but didn’t think about doing it on m3 since both the drug 0 prior risk (320/1599 =.2) and likelihood (7/450 = .01) are both lower than the corresponding values for drug 1, prior (125/480 = .26) and likelihood (15/550 =.027) , I found it hard to conceptualize why the posterior point estimates show a higher risk with drug 0.
Any thoughts? Some sort of Simpson’s paradox?

I don’t know that it’s an issue of Simpson’s Paradox. But since I’ve already brought up Chapter 6 of Kruschke’s text, I recommend making some versions of his Figure 6.3 by hand, but with your data examples instead. This should help clear things up. To get you started, here’s a quick version of Group 0 for m2.

# Load the packages
library(tidyverse)
library(patchwork)

# Make a function for the Bernoulli likelihood
dbern <- function(x, z, n) {
  return(x^z * (1 - x)^(n - z))
}

# Make the primary data frame for the plot
my_data <- data.frame(theta = seq(from = 0, to = 1, by = 0.001)) %>% 
  mutate(prior = dbeta(x = theta, shape1 = 35, shape2 = 880),
         likelihood = dbern(x = theta, z = 7, n = 450),
         posterior = dbeta(x = theta, shape1 = 7 + 35, shape2 = 450 - 7 + 880))

# Make/save the prior panel
p1 <- my_data %>% 
  ggplot(aes(x = theta, y = prior)) +
  geom_area() +
  scale_x_continuous(NULL, labels = NULL) +
  ylab(expression(italic(p)(theta))) +
  facet_wrap(~ "Prior")

# Make/save the likelihood panel
p2 <- my_data %>% 
  ggplot(aes(x = theta, y = likelihood)) +
  geom_area() +
  scale_x_continuous(NULL, labels = NULL) +
  scale_y_continuous(expression(italic(p)(D*'|'*theta)), breaks = 0:2 / 1e16) +
  facet_wrap(~ "Likelihood")

# Make/save the posterior panel
p3 <- my_data %>% 
  ggplot(aes(x = theta, y = posterior)) +
  geom_area() +
  scale_x_continuous(expression(theta)) +
  ylab(expression(italic(p)(theta*'|'*D))) +
  facet_wrap(~ "Posterior")

# Combine and entitle the panels with {patchwork} syntax
p1 / p2 / p3 +
  plot_annotation(title = "Group 0, model m2")

1 Like

If it’s too hard to see what’s going on the plot because all the shapes are so scrunched over to the left of the \theta space, try something like this instead:

(p1 / p2 / p3) &
  coord_cartesian(xlim = c(0, 0.2)) &
  plot_annotation(title = "Group 0, model m2")
1 Like

I will pursue this along the lines you suggest using my data, some of your code, and a re-read of Chapter 6.
Thanks again for your time.

1 Like