Poisson-binomial distribution - any existing Stan implementation?

richard_mcelreath · May 16, 2018, 6:59pm

I came across a neat modeling problem earlier today that actually is solved using the Poisson-binomial distribution.

Before I write a new Stan function for it, I thought I’d ask whether anyone here has already done so. The n is pretty small in my use case (n=10 maximum), so the exact formula should work fine. My instinct is that it’s just a matter of generating all the combinations to loop over.

richard_mcelreath · May 17, 2018, 2:00pm

I made a quick Stan implementation of the Poisson binomial probability distribution. I tested it against the R package poisbinom. Seems to work.

This is not elegant. I started with a recursive algorithm for the combinations, but couldn’t seem to get it to work. So for sake of time, I fell back on a lexicographic strategy (like the R function combn). Only half-way through, I realized I needed only the bits, and should have just done a bit list. So this could be more efficient, but that shouldn’t matter much (I hope) unless n is large.

Here’s the code for RStan, with test at bottom:

mc <- '
functions{
    // need a function to compute poissonbinomial log-probability
    //   log-prob of k successes from n items 
    //   where prob of success of the n items is given by vector pp
    // this mainly requires looping over combinations of n-choose-k
    // for each combination, log-prob is straight-forward, given vector of probs pp

    // function for log-prob of each term, given combination of size k
    // called by lpmf below
    real lppoibin( int k, int n, int[] elements , vector pp ) {
        real lp;
        lp = 0;
        //print( elements );

        // for each possible element, accumulate probability
        for ( i in 1:n ) {
            int bang;
            bang = 0;
            for ( j in 1:k ) if ( elements[j]==i ) bang = 1;
            if ( bang==1 ) {
                // term in elements
                lp = lp + log(pp[i]);
            } else {
                // term not in elements
                lp = lp + log1m(pp[i]);
            }
        }

        return lp;
    }

    real poissonbinomial_lpmf( int k, int n, vector pp ) {
        int x[n];
        int a[k];
        int e;
        int h;
        int r[k];
        vector[ choose( n , k ) ] terms; // holds each log-prob

        for ( i in 1:n ) x[i] = i;
        for ( i in 1:k ) a[i] = i;

        // first sequence
        r = a;
        terms[1] = lppoibin( k , n , r , pp );

        // loop to lexicographically generate remaining
        if ( k > 0 ) {
            int i;
            int j;
            int nmmp1;
            i = 2;
            nmmp1 = n - k + 1;
            while ( a[1] != nmmp1 ) {
                if ( e < n - h ) {
                    h = 1;
                    e = a[k];
                    j = 1;
                    a[ k - h + j ] = e + j;
                } else {
                    e = a[k-h];
                    h = h + 1;
                    j = h;
                    for ( ii in 1:j ) a[ k - h + ii ] = e + ii;
                }
                for ( ii in 1:k ) r[ii] = x[a[ii]];
                terms[i] = lppoibin( k , n , r , pp );
                i = i + 1;
            }
        }
        return log_sum_exp(terms);
    }
}
'

cppcode <- stanc( model_code = mc , model_name = "testccp" )
expose_stan_functions( cppcode )

k <- 5
n <- 20
pp <- runif( n )
poissonbinomial_lpmf( k , n , pp )

library(poisbinom)
dpoisbinom( k , pp , log=TRUE )

jonah · May 17, 2018, 10:06pm

I quickly tried the recursive version once, maybe a year ago, and couldn’t get it to work right. This is definitely a better approach. You’re right that there’s some potential for speeding it up, but that wouldn’t extend the tractable values of n significantly. Unless there’s a clever trick to reduce the complexity that I’m not aware of (I’m not aware of most things).

Bob_Carpenter · May 21, 2018, 11:56pm

Summing over subsets is pretty much the definition of “intractable” in computer science. But it also hints at the solution—dynamic programming.

The elements p[i] are already ordered, so the dynamic programming cells will be (i, n), where n is the number of successes. So you just need a dynamic programming algorithm like the forward algorithm for HMMs. It needs to deal with the constraint that you need to get the right number of successes. The transitions are linear rather than quadratic as you can only go two ways from a state—adding a success or failure.

I think the complexity should be quadratic in the size of p, i.e., \mathcal{O}(n^2).

This is like the TopCoder hard tier problems I used to practice on.

Bob_Carpenter · May 22, 2018, 1:13am

Here’s an R version of the dynamic programming algorithm on the linear scale. To make it stable, you’d need to convert it to log-sum-exp.

The dynamic programming variable alpha[n + 1, tot + 1] is the probability that the first n bernoulli trials yielded tot successes. I had to add one to accomodate the base case of the algorithm (n = 0) and zero total successes (tot = 0). It makes everything a bit more challenging to read.

I tried to make the code as clear as possible by heavy use of local variables. n corresponds to the position in theta and tot to the total number of successes observed (so far). It works left to right.

It’s easy to see that it’s quadratic and computes the answer for every possible output. So this can also compute the cdf and ccdf by just summing that final column.

# Return vector of probabilities for all outcomes in increasing order of count.
# theta : vector of probability values in (0, 1)
poisson_binomial <- function(theta) {
  N <- length(theta)
  if (N == 0) return(c(1));
  alpha <- matrix(-Inf, N + 1, N + 1);
  alpha[1, 1] <- 1;
  for (n in 1:N) {
    tot = 0;
    alpha[n + 1, tot + 1] = alpha[n, tot + 1] * (1 - theta[n]);

    if (n > 1) {
      for (tot in 1:(n - 1)) {
        alpha[n + 1, tot + 1] = 
            alpha[n, tot] * theta[n] + alpha[n, tot  + 1] * (1 - theta[n]);
      }
    }

    tot = n;
    alpha[n + 1, tot + 1] = alpha[n, tot] * theta[n];
  }
  return(alpha[N + 1, 1:(N + 1)]);
}

As far as I can tell, it returns the right values:

> poisson_binomial(c())
[1] 1

> poisson_binomial(c(0.6))
[1] 0.4 0.6

> poisson_binomial(c(0.3, 0.6))
[1] 0.28 0.54 0.18

> poisson_binomial(rep(0.5, 10))
 [1] 0.00098 0.00977 0.04395 0.11719 0.20508 0.24609 0.20508 0.11719 0.04395 0.00977 0.00098

> for (n in 0:10) print(dbinom(n, 10, 0.5))
[1] 0.00098
[1] 0.0098
[1] 0.044
[1] 0.12
[1] 0.21
[1] 0.25
[1] 0.21
[1] 0.12
[1] 0.044
[1] 0.0098
[1] 0.00098

Bob_Carpenter · May 22, 2018, 1:42am

Here’s a recursive version. The boundary conditions are somehow trickier than I was expecting:

dpois_binom <- function(y, theta) {
  N <- length(theta);
  if (N == 0) return(ifelse(y == 0, 1, 0));
  if (N == 1) return(ifelse(y == 1, theta[1], 1 - theta[1]));
  if (y == 0) return(prod(1 - theta));
  # y >= 1, N >= 2
  tail <- theta[2:N];
  return((1 - theta[1]) * dpois_binom(y, tail) + theta[1] * dpois_binom(y - 1, tail));
}

This one just calculates a single value. But it’s still exponential in max(y, size(theta)).

Bob_Carpenter · May 22, 2018, 3:07am

Here’s a more efficient Stan version that

works on the log scale for arithmetic stability
is quadratic in the size of the vector of probabilities
avoids doing (most) unnecessary calculations
only returns a single value

It could be easily modified to produce an efficient lcdf or lccdf.

functions {

  /**
   * Return the Poisson-binomial log probability mass for the specified
   * count y and vector of probabilities theta.  The result is the log
   * probability of y successes in N = size(theta) independent
   * trials with probabilities of success theta[1], ..., theta[N].
   *
   * See:  https://en.wikipedia.org/wiki/Poisson_binomial_distribution
   *
   * @param y number of successes
   * @param theta vector of probabilities
   * @return Poisson-binomial log probability mass
   */
  real poisson_binomial_lpmf(int y, vector theta) {
    int N = rows(theta);
    matrix[N + 1, N + 1] alpha;
    vector[N] log_theta = log(theta);
    vector[N] log1m_theta = log1m(theta);

    if (y < 0 || y > N)
      reject("poisson binomial variate y out of range, y = ", y,
             " N = ", N);
    for (n in 1:N)
      if (theta[n] < 0 || theta[n] > 1)
        reject("poisson binomial parameter out of range,",
               " theta[", n, "] =", theta[n]);

    if (N == 0)
      return y == 0 ? 0 : negative_infinity();

    // dynamic programming with cells
    // alpha[n + 1, tot + 1] = log prob of tot successes in first n trials
    alpha[1, 1] = 0;
    for (n in 1:N) {
      // tot = 0
      alpha[n + 1, 1] = alpha[n, 1] + log1m_theta[n];

      // 0 < tot < n
      for (tot in 1:min(y, n - 1))
        alpha[n + 1, tot + 1]
            = log_sum_exp(alpha[n, tot] + log_theta[n],
                          alpha[n, tot  + 1] + log1m_theta[n]);

      // tot = n
      if (n > y) continue;
      alpha[n + 1, n + 1] = alpha[n, n] + log_theta[n];
    }
    return alpha[N + 1, y + 1];
  }

}
transformed data {
  print("*************************************");

  print("poisson_binomial_lpmf(0 | []) = ",
        exp(poisson_binomial_lpmf(0 | rep_vector(1, 0))));

  print("poisson_binomial_lpmf(0 | [0.3]') = ",
        exp(poisson_binomial_lpmf(0 | [0.3]')));
  print("poisson_binomial_lpmf(1 | [0.3]') = ",
        exp(poisson_binomial_lpmf(1 | [0.3]')));

  print("poisson_binomial_lpmf(0 | [0.3, 0.6]') = ",
        exp(poisson_binomial_lpmf(0 | [0.3, 0.6]')));
  print("poisson_binomial_lpmf(1 | [0.3, 0.6]') = ",
        exp(poisson_binomial_lpmf(1 | [0.3, 0.6]')));
  print("poisson_binomial_lpmf(2 | [0.3, 0.6]') = ",
        exp(poisson_binomial_lpmf(2 | [0.3, 0.6]')));

  for (n in 0:10)
    print("poisson_binomial_lpmf(", n, " | 0.5) = ",
          exp(poisson_binomial_lpmf(n | rep_vector(0.5, 10))));

  print("*************************************");
}
parameters {
  real<lower = 0, upper = 1> z;
}

Here’s what it prints (which is wrong—it’s printing the exponentiation of those):

poisson_binomial_lpmf(0 | []) = 1
poisson_binomial_lpmf(0 | [0.3]') = 0.7
poisson_binomial_lpmf(1 | [0.3]') = 0.3
poisson_binomial_lpmf(0 | [0.3, 0.6]') = 0.28
poisson_binomial_lpmf(1 | [0.3, 0.6]') = 0.54
poisson_binomial_lpmf(2 | [0.3, 0.6]') = 0.18
poisson_binomial_lpmf(0 | 0.5) = 0.000976562
poisson_binomial_lpmf(1 | 0.5) = 0.00976563
poisson_binomial_lpmf(2 | 0.5) = 0.0439453
poisson_binomial_lpmf(3 | 0.5) = 0.117188
poisson_binomial_lpmf(4 | 0.5) = 0.205078
poisson_binomial_lpmf(5 | 0.5) = 0.246094
poisson_binomial_lpmf(6 | 0.5) = 0.205078
poisson_binomial_lpmf(7 | 0.5) = 0.117188
poisson_binomial_lpmf(8 | 0.5) = 0.0439453
poisson_binomial_lpmf(9 | 0.5) = 0.00976563
poisson_binomial_lpmf(10 | 0.5) = 0.000976562

So it looks OK. I didn’t test the error handling.

If you have a lot of y with the same theta, it would help to just build this whole array once with the max of the y as the bound. If that’s too opaque, I can help code that, too. This should be efficient, even for large N.

richard_mcelreath · May 22, 2018, 7:02am

This is awesome, Bob. I will try to find some time this week to merge this into my model implementation and compare timings. I have a working analysis with my clunky algorithm, but it doesn’t scale well at all. So your algorithm will be a big help.

Bob_Carpenter · May 22, 2018, 4:37pm

Do you have a lot of observed counts y[n] for the same theta or do the theta vary with n? If so, the vectorized form can be made much more efficient compared to just calling this multiple times. The whole dynamic programming array only needs to get built for the largest y.

I created a stan-dev/math issue for the Poisson-binomial.

richard_mcelreath · May 23, 2018, 7:27am

Right now, many have same theta. I’ve quasi-vectorized my implementation by caching the log-probabilities and reusing them in the model block. Once I start adding more covariates, it’s going to get much slower, with lots of unique thetas.

jonah · May 23, 2018, 10:00pm

Indeed, thanks @Bob_Carpenter!

Mike_Strub · March 12, 2019, 8:25am

Here is a simple function to calculate the Poisson binomial PMF.

N <- 5
P <- c(0.1,0.2,0.3,0.4,0.5)

PMF.1 <- function(N,P) {
  A <- rep(0,(N+2))
  A[2] <- 1
  for (j in 1:N) for (i in (j+2):2) A[i] <- A[i]+(A[i-1]-A[i])*P[j]
  return(A[2:(N+2)])
}

PMF.1(N,P)
sum(PMF.1(N,P))

Topic		Replies	Views
Poisson binomial model -- advice needed on modelling approach Modeling finance	3	1060	February 24, 2025
Marginalizing a double binomial-Poisson hierarchical distribution Modeling stan-math , ecology	7	968	June 16, 2022
Compound Sampling-Binomial? Modeling	14	1233	August 7, 2021
A more efficient implementation of the sum of independent binomials? Modeling	16	1337	February 4, 2018
Modeling problem when N obeys Poisson distribution in binomial distribution Modeling hierarchical-model	9	852	September 30, 2021

Poisson-binomial distribution - any existing Stan implementation?

Related topics