Confluent hypergeometric functions to estimate population size and detection probability from spatially replicated counts

nikunj410 · September 6, 2022, 11:21pm

I am modeling spatial population dynamics which consist of two layers. The inner layer is the population abundance of the species which is modeled using an a spatial population dynamics model. The outer layer is the observation process.

Here is the brief description of the model

Assuming count data y_{x,t,j}\sim \text{Binomial}(n_{x,t,j}, \theta), were x is the patch location, t is time, j is the replicate sampling ID, and \theta is the detection probability. We assume \theta \sim \text{Beta}(\alpha , \beta) and n_{x,t,j} \sim \text{Poisson} (\lambda _{x,t}), where \lambda _{x,t} comes from the population dynamics model e.g. \lambda _{x,t+1} = \int^{\infty}_{- \infty} k(x-y) f(\lambda _{y,t}) dy.

Royle (2004) suggests that if we have replicate counts, we can estimate p, n_{x,t,j}, and \lambda _{x,t}. However, this problem cannot be coded directly in Stan because n_{x,t,j} is a discrete parameter and, therefore, has to be marginalized out. Mathematically, we can marginalize \theta and n_{x,t,j} in the following manner

p(Y= y_{x,t,j} | \alpha, \beta, \lambda_{x,t}) = \sum_{n_{x,t,j} =0}^{\infty} \int_{0}^{1} \underbrace{p(y_{x,t,j} | \theta, n_{x,t,j} )}_\text{Binomial dist.} \underbrace{p(\theta | \alpha, \beta)}_\text{Beta dist.} \underbrace{p( n_{x,t,j}| \lambda_{x,t} )}_\text{Poisson dist.} d\theta.

This gives us
p(Y= y_{x,t,j} | \alpha, \beta, \lambda_{x,t}) = \text{Poisson}(y_{x,t,j} |\lambda_{x,t}) \frac{\text{Beta}(\alpha + y_{x,t,j}, \beta)}{\text{Beta}(\alpha, \beta)} \ce{_{1}F_{1}} (\beta, \alpha + \beta + y_{x,t,j}, \lambda_{x,t}),

where \ce{_{1}F_{1}} is the confluent hypergeometric function of the first kind. We have successfully marginalized the \theta and n_{x,t,j}, which can be recovered in the generated quantities block. Additionally, there might be computational advantages in terms of both accuracy and speed (not tested yet).

I am stuck trying to evaluate \ce{_{1}F_{1}}. Currently Stan does not support confluent hypergeometric functions. There are several infinite summation and integral methods to estimate \ce{_{1}F_{1}} but they may not be stable and scalable. Is the Stan development team working on implementing a scalable and reliable method/function to calculate confluent hypergeometric functions anytime soon?

I believe that the technique I presented in this post could be very useful in patch occupancy modeling.

spinkney · September 7, 2022, 12:30am

Hypergeometric functions are in Stan math and should be exposed in the Stan language with the next release. See math/hypergeometric_pFq.hpp at develop · stan-dev/math · GitHub. @andrjohns implemented this and will know more.

The interesting parts are in the derivative s that he implemented at math/grad_pFq.hpp at develop · stan-dev/math · GitHub.

maxbiostat · September 7, 2022, 12:33am

@GuidoAMoreira and I have some Stan code to do adaptive truncation, described in detail in our preprint. @jsocolar, this might be the push we need to get this implemented for occupancy models, uh? Happy to collaborate!

andrjohns · September 7, 2022, 8:21am

I’m currently working on the confluent 1F1 and its gradients over on this math branch (since it’s the foundation of a bunch of numerical stability improvements for the log-CDFs).

So this should be available for you in the next release

Matthew_Spencer1 · September 8, 2022, 10:21pm

I had a similar problem a couple of months ago:
https://discourse.mc-stan.org/t/marginalizing-a-double-binomial-poisson-hierarchical-distribution/27801
I had just two replicate counts, although it sounds as though you may have more. I didn’t marginalize out the detection probabilities, which may affect whether my approach is relevant to you. I initially looked at ways of evaluating the confluent hypergeometric function, including the adaptive truncation suggested by @maxbiostat, but in the end found that there was a relatively easy exact solution if you work with the probability generating functions. There is Stan code at the end of the post in the link above (I may have made a few minor changes since then, but that version was working). I recommend first having a look at section 36.8 in Johnson, Kotz and Balakrishnan, Discrete Multivariate Distributions, which is quite general, and then Hamdan and Tsokos (1971), which is less general but a bit easier to follow.

nikunj410 · February 7, 2023, 7:16am

Hi @andrjohns, was 1F1 release in the last version of stan? I could not find it in the function manual.

nikunj410 · October 20, 2023, 6:20pm

I wrote a Stan function to calculate 1F1 on log scale based on this paper. Posting it here in case it may be useful to someone. This may become obsolete once pFq are implemented in Stan.

  real Log_1F1(real a, real b, real z) {
    int j = 0;
    real toll = 1e-9;
    real log_Aj = 0;
    real log_Sj = log_Aj;
    
    // z > 0 ? Log_1F1(a, b, z) : z + Log_1F1(b-a, b, -z) (for numericall accuracy)
    
    if (z >= 0){
      while (log_Aj-log_Sj > log(toll)){
        j += 1;
        log_Aj += log(((a+j-1)/(b+j-1))*(z/j));
        log_Sj = log_sum_exp([log_Aj, log_Sj]);}
      return log_Sj;
    } else {
        while (log_Aj-log_Sj > log(toll)){
          j += 1;
          log_Aj += log(((b-a+j-1)/(b+j-1))*(-z/j));
          log_Sj = log_sum_exp([log_Aj, log_Sj]);}
        return z + log_Sj;
    }
  }

luchinoprince · November 29, 2023, 4:05pm

So to this date there is no native function right? I could not find it ether and would like to test a model which needs it.

nikunj410 · November 29, 2023, 7:29pm

@luchinoprince I dont think there is one but it seems to be under works. I also implemented pFq function.
Here is the code that I wrote. I have only tested this for a_p, b_q and Re(z) > 0

  real log_pFq(vector a_p, vector b_q, real z) {
    int j = 0;
    real toll = log(machine_precision());
    real log_Aj = 0;
    real log_Sj = log_Aj;
    real log_z = log(z);
    
    while (log_Aj-log_Sj > toll && j< 1e6){
      j += 1;
      log_Aj += sum(log(a_p+j-1))-sum(log(b_q+j-1))+log_z-log(j);
      log_Sj = log_sum_exp([log_Aj, log_Sj]);}
    return log_Sj;
  }

luchinoprince · November 29, 2023, 7:42pm

@nikunj410 thanks for the answer, I saw you implementation. I am just concerned about the speed convergence of such an implementation and its potential instability. Also not sure how the gradient would work if the pdf is an iterative function. In my case I have a tensor where the entries would be a density which uses pFq, hence I would have to calculate a lot of those at every HMC step, which I guess would be too costly. Maybe parallelising within chain could help, I will try to test it,

nikunj410 · November 29, 2023, 8:13pm

@luchinoprince I am pretty sure that this is not a stable implementation when a,b,z <0. For, example, when I tried implementing 1F1 for z<0, I was getting convergence issues. For positive coefficient values, I was able to achieve similar numerical accuracy as the pFq function in Julia (because I was working on log scale).

You are right the speed is a major limiting factor. Since there are no analytical gradients, we have to rely on auto diff which is usually slower. But the work on gradients is stills in progress here.

Maybe you want to try and implement the code I shared to see if your Stan code works with synthetic data and when the native implementation of pFq is made available in Stan, you can make appropriate changes.

luchinoprince · November 30, 2023, 2:35pm

hey @nikunj410 , I am sure that in my case a,b would be greater than 0, while z could have ether sign(not sure). I will see if I can find other distributions which allow a better treatment, as the main issue with approximate likelihood from my perspective is that it is hard to justify/specify from what distribution we are then sampling from.

Best,
Luca

Topic		Replies	Views
Marginalizing a double binomial-Poisson hierarchical distribution Modeling stan-math , ecology	7	877	June 16, 2022
hypergeometric_pFq General	2	345	December 3, 2023
How to use stan::math::hypergeometric_2F1 in stan? Modeling	2	256	February 13, 2024
Extended Hypergeometric Distribution in Stan Modeling specification	11	721	May 30, 2022
Fitting a non-central chisquare distribution Modeling rstan , specification	2	689	November 22, 2021

Confluent hypergeometric functions to estimate population size and detection probability from spatially replicated counts

Related topics