Mixture of Binomial Question

PhDemetri · January 20, 2021, 5:43am

Background

I want to model a scenario which gives rise to biased coins. Say I pull a coin out of a bag which contains a mix of biased and fair coins. The proportion of fair to biased is unknown to me. The bias of biased coins is allowed to vary, and the average bias is again unknown to me.

Say I draw 100 coins and flip them each 100 times. I’d like to hierarchically model this data and I think the likelihood is a binomial mixture.

Here is my shot at a model. In the model, the probability of the coin being biased has a uniform probability, and the probability of heads for biased coins is drawn from a beta distribution.

data{
    int n;
    int heads[n];
  }
  parameters{
    real<lower=0, upper=1> prob_of_bias;
    vector<lower = 0, upper = 1>[n] b; 
    
    real<lower=0, upper=1> mu;
    real<lower=0> kappa;
  }
  model{
    prob_of_bias ~ beta(1,1);
    mu ~ beta(1,1);
    kappa ~ cauchy(0, 1);
    b ~ beta_proportion(mu,kappa);
    for (i in 1:n){
    target+=log_mix(prob_of_bias,
                     binomial_lpmf(heads[i] | 100, b[i]),
                     binomial_lpmf(heads[i] | 100,  0.5));
  }
  }

Here is some code to simulate the process:

library(tidyverse)
library(tidybayes)
library(cmdstanr)
library(posterior)

n = 100
is_biased = rbinom(n, 1 , 0.77)
b = 0.5 + is_biased*(rbeta(n, 90, 10) - 0.5)
x = rbinom(n, 100, p)
d = tibble(coin = factor(1:n), heads = x)

Question

Is a mixture of binomials the right idea, and if it is, have I approproately implemented it in Stan? Additionally, how might I go about estimating the probability that any one of the particular coins is biased? I remember watching a video that Andrew Gelman made a while ago in which he used mixtures for something related to climate change (well…it was a bet about something related to climate change). In that video, I think the probability of belonging to one mixture was computed using an approach like…

generated quantities{
  matrix[n,2] pn;
  matrix[n,2] ps;
  for (i in 1:n){
  pn[i,1] = log_mix(prob_of_bias,
                     binomial_lpmf(heads[i] | 100, b[i]),
                     binomial_lpmf(heads[i] | 100,  0.5));
                     
  pn[i,2] = log_mix(1 - prob_of_bias,
                     binomial_lpmf(heads[i] | 100, b[i]),
                     binomial_lpmf(heads[i] | 100,  0.5));
                     
  ps[i,] = pn[i,]/sum(pn[i,]);
  }
  }

However, this approach doesn’t seem to work (the model can not discriminate between being biased and not being biased).

Happy to ad more details if necessary.

nhuurre · January 20, 2021, 12:57pm

Looks right, yes.
Alternatively, you could use beta_binomial_lpmf so you don’t need intermediate b parameter. That may be a little faster.

Don’t log_mix the cases, just compute each posterior probability separately. Also, pn in your code is the logarithm of (unnormalized) probability. It needs to be exponentiated. The softmax function handles both exponentiation and normalization in one step.

generated quantities{
  matrix[n,2] ps;
  for (i in 1:n) {
    vector[2] pn;
    // log-probability that there is bias
    pn[1] = log(prob_of_bias) + binomial_lpmf(heads[i] | 100, b[i]);
    // log-probability that there is no bias
    pn[2] = log1m(prob_of_bias) + binomial_lpmf(heads[i] | 100,  0.5));
    // posterior probabilities for bias and no bias
    ps[i,] = softmax(pn);
  }
}

PhDemetri · January 20, 2021, 2:53pm

Appreciated! Thanks for the input

Topic		Replies	Views
Marginalize over denominators in mixture of binomials Modeling	3	660	February 11, 2020
Binomial Mixtures with a Mixture of Dirichlet Process Prior Modeling	2	703	April 7, 2022
Continuous Binomial: further questions Modeling	3	423	September 18, 2023
Beta binomial mixture model Modeling mixture	5	1305	March 4, 2021
Fitting a discrete integer mixture model within the Stan paradigm Modeling	10	2405	May 11, 2017

Mixture of Binomial Question

Background

Question

Related topics