Is it possible to speed up this poisson_inverse_gaussian

I am exploring long-tail alternatives to the negative binomial, one of which is the Poisson inverse Gaussian.

The probability can be calculated recursively

$$
\begin{aligned}
P(0) &= \exp\left[\tfrac{1}{\tau}\left(1 - \sqrt{1 + 2\tau\mu}\right)\right] \\
P(1) &= \frac{\mu}{\sqrt{1 + 2\tau\mu}}\,P(0) \\
P(y) &= \frac{2\tau\mu}{1 + 2\tau\mu}\left(1 - \frac{3}{2y}\right)P(y-1) + \frac{\mu^2}{(1 + 2\tau\mu)\,y\,(y-1)}\,P(y-2), \qquad y \ge 2
\end{aligned}
$$
https://www.tandfonline.com/doi/abs/10.1080/19439962.2014.977502

I transformed the calculation into a linear one using a “pre-calculated” vector (the function has been tested and produces the same results as the original implementation):


functions {
  real poisson_inverse_gaussian_lpmf(int y, real log_mu, real tau) {

    real tau_mu_2_log = log(tau) + log_mu + 0.6931472; // log(2)
    real tau_mu_2_1_log = log(exp(tau_mu_2_log) + 1);
    real tau_mu_2_1_sqrt_log = 1.0 / 2 * tau_mu_2_1_log;
    vector[y + 1] p_arr_log;

    // Build the array of log probabilities recursively
    p_arr_log[1] = 1.0 / tau * (1 - exp(tau_mu_2_1_sqrt_log)); // log P(0)

    if (y > 0) p_arr_log[2] = log_mu - tau_mu_2_1_sqrt_log + p_arr_log[1]; // log P(1)

    if (y > 1) {
      real tau_mu_2_log_tau_mu_2_1_log = tau_mu_2_log - tau_mu_2_1_log;
      real two_log_mu_tau_mu_2_1_log = 2 * log_mu - tau_mu_2_1_log;

      for (y_dot in 2:y) {
        p_arr_log[y_dot + 1] =
          log_sum_exp(
            tau_mu_2_log_tau_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot],
            two_log_mu_tau_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1]
          );
      }
    }

    return p_arr_log[y + 1];
  }
}
data {
  int<lower=0> N;
  int<lower=0> G;
  int<lower=0> counts[N,G];
  real my_prior[2];
  int<lower=0, upper=1> omit_data;

  // Alternative models
  int<lower=0, upper=1> is_prior_asymetric;
}
parameters {

  // Overall properties of the data
  real<lower=0> lambda_mu; // so it is compatible with a log-gamma prior
  real<lower=0> lambda_sigma_raw;
  //real<lower=0> sigma;
  real<lower=0> tau;
  //real exposure_rate[N];

  // Gene-wise properties of the data
  vector[G] lambda_raw;


}
transformed parameters {
  real<lower=0> lambda_sigma = lambda_sigma_raw;
  vector[G] lambda = lambda_mu + lambda_raw * lambda_sigma;

}
model {

  // Overall properties of the data
  lambda_mu ~ gamma(3,2);
  lambda_sigma_raw ~  gamma(6,5);
  //sigma ~ normal(0,2);
  tau ~ normal(0,2);
  //exposure_rate ~ normal(0,1);
  //sum(exposure_rate) ~ normal(0, 0.001 * N);

  // Gene-wise properties of the data
  lambda_raw ~ normal(0, 1);

  // Sample from data
  if (omit_data == 0)
    for (n in 1:N)
      for (g in 1:G)
        counts[n, g] ~ poisson_inverse_gaussian(lambda[g], tau);

}

Although it is as optimised as I could make it, I still get huge execution times.

DIAGNOSTIC(S) FROM PARSER:
Info: left-hand side variable (name=p_arr_log) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.
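
As an aside, the deep copy itself could probably be avoided by carrying the two previous log-probabilities as scalars rather than re-reading p_arr_log; an untested sketch of the same recursion (the function name is made up):

real poisson_inverse_gaussian_scalar_lpmf(int y, real log_mu, real tau) {
  real tau_mu_2_log = log(tau) + log_mu + 0.6931472; // log(2)
  real tau_mu_2_1_log = log(exp(tau_mu_2_log) + 1);
  real tau_mu_2_1_sqrt_log = 1.0 / 2 * tau_mu_2_1_log;
  real lp_prev2 = 1.0 / tau * (1 - exp(tau_mu_2_1_sqrt_log)); // log P(0)
  real lp_prev1 = log_mu - tau_mu_2_1_sqrt_log + lp_prev2;    // log P(1)
  real lp_curr;

  if (y == 0) return lp_prev2;
  if (y == 1) return lp_prev1;

  for (y_dot in 2:y) {
    lp_curr = log_sum_exp(
      tau_mu_2_log - tau_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + lp_prev1,
      2 * log_mu - tau_mu_2_1_log - log(y_dot) - log(y_dot - 1) + lp_prev2
    );
    lp_prev2 = lp_prev1;
    lp_prev1 = lp_curr;
  }
  return lp_curr;
}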

[screenshot of execution times]

(negative binomial takes virtually 0 seconds to execute)

Should I abandon this probability distribution as too hard to compute, or can you see an easy way to optimise it?

Well, if your log-likelihood is just counts[n,g] ~ poisson_inverse_gaussian(lambda[g], tau); then you only need to evaluate it at the unique values of the counts and multiply each log-likelihood contribution by the number of times that value occurs for each lambda[g].
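
For example (a rough sketch; the unique values and their multiplicities would be computed outside Stan and passed in as data, and all the names below are made up):

data {
  int<lower=0> U;                   // maximum number of unique counts per gene
  int<lower=0> n_uniq[G];           // number of unique counts observed for gene g
  int<lower=0> uniq_counts[G, U];   // the unique count values (padded)
  int<lower=0> uniq_mult[G, U];     // how many times each unique value occurs
}
model {
  for (g in 1:G)
    for (u in 1:n_uniq[g])
      target += uniq_mult[g, u]
                * poisson_inverse_gaussian_lpmf(uniq_counts[g, u] | lambda[g], tau);
}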


True, this is a straightforward way to speed it up. I imagine it works well when N is large. In my case, usually G ~= 15K and N ~= 10-100.

Still, it is easy to implement. Thanks

Given the probability equation

[formula: closed-form PIG probability mass function, written in terms of the modified Bessel function of the second kind]
Publication

where K is the modified Bessel function of the second kind.

I don’t see the reason to invoke recursion. The only one I can see is that K of a non-integer order is needed.

Since I see that the modified Bessel function of the second kind is available in Stan, is it possible to calculate

modified_bessel_second_kind(x - 0.5, alpha)

without recursion?

(Having a fast version of this probability function will enable a whole lot of applied research on noisy count data).

The modified Bessel function of the second kind in Stan is only defined for integer orders. However, the half-integer orders can be evaluated with a finite series, so you can easily define your own function to evaluate them. See http://functions.wolfram.com/Bessel-TypeFunctions/BesselK/introductions/Bessels/ShowAll.html
and in particular the finite sum for half-integer orders,

$$K_{n+1/2}(z) = \sqrt{\frac{\pi}{2z}}\, e^{-z} \sum_{k=0}^{n} \frac{(n+k)!}{k!\,(n-k)!\,(2z)^k}.$$
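
That series can be coded directly on the log scale; for example (untested sketch, the function name is made up):

functions {
  // log K_{n + 1/2}(z) via the finite half-integer series above
  real log_besselK_half_int(int n, real z) {
    vector[n + 1] terms;
    for (k in 0:n)
      terms[k + 1] = lgamma(n + k + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                     - k * (log(2) + log(z));
    return 0.5 * (log(pi()) - log(2) - log(z)) - z + log_sum_exp(terms);
  }
}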

Thanks a lot,

I will definitely try to implement it. Before that, let me ask about a possible alternative.

Considering that a summation is once again introduced, and v can be pretty big (~100 to 100,000+), the calculation would become pretty expensive.

I assume this is extremely impractical (and maybe stupid), but would it be possible to precompute the closed forms for all realistic values of v < 1,000,000? E.g., from Wolfram Alpha, BesselK[100-0.5, z] is:

(1.253314137316 e^(-z) (1.000000000000 z^99 + 4950.000000000 z^98 + 1.224877500000×10^7 z^97 + 2.019822997500×10^10 z^96 + 2.496501224910×10^13 z^95 + 2.466543210211×10^16 z^94 + 2.028731790399×10^19 z^93 + 1.428516999268×10^22 z^92 + 8.78895083800×10^24 z^91 + 4.798767157545×10^27 z^90 + 2.353795290776×10^30 z^89 + 1.047438904395×10^33 z^88 + 4.263076340889×10^35 z^87 + 1.597669840985×10^38 z^86 + 5.545055540963×10^40 z^85 + 1.791052939731×10^43 z^84 + 5.406741061813×10^45 z^83 + 1.531061851269×10^48 z^82 + 4.080279833631×10^50 z^81 + 1.026297753943×10^53 z^80 + 2.442588654385×10^55 z^79 + 5.513271534184×10^57 z^78 + 1.182596744082×10^60 z^77 + 2.415068220415×10^62 z^76 + 4.703345359259×10^64 z^75 + 8.74822236822×10^66 z^74 + 1.556174171270×10^69 z^73 + 2.650683338397×10^71 z^72 + 4.328187222554×10^73 z^71 + 6.78182163424×10^75 z^70 + 1.020664155952×10^78 z^69 + 1.476670561112×10^80 z^68 + 2.055340837248×10^82 z^67 + 2.754156721912×10^84 z^66 + 3.555292309550×10^86 z^65 + 4.423799430883×10^88 z^64 + 5.308559317060×10^90 z^63 + 6.14645084170×10^92 z^62 + 6.86946755913×10^94 z^61 + 7.41374075805×10^96 z^60 + 7.72882474026×10^98 z^59 + 7.78537711641×10^100 z^58 + 7.57962072119×10^102 z^57 + 7.13365699039×10^104 z^56 + 6.49162786125×10^106 z^55 + 5.712632517901×10^108 z^54 + 4.861947023388×10^110 z^53 + 4.002313411380×10^112 z^52 + 3.186842053812×10^114 z^51 + 2.454518757364×10^116 z^50 + 1.828616474236×10^118 z^49 + 1.317679518200×10^120 z^48 + 9.18321264222×10^121 z^47 + 6.18913878453×10^123 z^46 + 4.033255441253×10^125 z^45 + 2.540950927990×10^127 z^44 + 1.547257618651×10^129 z^43 + 9.10438956659×10^130 z^42 + 5.175374551903×10^132 z^41 + 2.841192910782×10^134 z^40 + 1.505832242715×10^136 z^39 + 7.70196163487×10^137 z^38 + 3.800048490494×10^139 z^37 + 1.807737353335×10^141 z^36 + 8.28734592919×10^142 z^35 + 3.659181971814×10^144 z^34 + 1.555152338021×10^146 z^33 + 6.35755560274×10^147 z^32 + 2.498145378020×10^149 z^31 + 9.42778342661×10^150 z^30 + 3.414204426638×10^152 z^29 + 1.185354072065×10^154 z^28 + 3.941302289616×10^155 z^27 + 1.253658070752×10^157 z^26 + 3.810104055569×10^158 z^25 + 1.104930176115×10^160 z^24 + 3.053096539265×10^161 z^23 + 8.02528233178×10^162 z^22 + 2.003233935895×10^164 z^21 + 4.739296488845×10^165 z^20 + 1.060417589379×10^167 z^19 + 2.238659355356×10^168 z^18 + 4.447287914482×10^169 z^17 + 8.28910169121×10^170 z^16 + 1.444672009039×10^172 z^15 + 2.345467497028×10^173 z^14 + 3.531837684479×10^174 z^13 + 4.908036506362×10^175 z^12 + 6.25774654561×10^176 z^11 + 7.27023587434×10^177 z^10 + 7.63374766806×10^178 z^9 + 7.17236731449×10^179 z^8 + 5.95618329160×10^180 z^7 + 4.303822765543×10^181 z^6 + 2.650971682180×10^182 z^5 + 1.353390806166×10^183 z^4 + 5.498150150049×10^183 z^3 + 1.666449633108×10^184 z^2 + 3.349903854308×10^184 z + 3.349903854308×10^184))/z^(199/2)

And then store the array of formulas (or a switch-like function) somewhere, to avoid the summation loop and bring the speed close to that of neg_binomial_2?

Maybe a more reasonable alternative is to precompute just this part

[formula: the part of each series term that depends only on j and v]

for all possible values of j,v
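
For instance, the (j, v)-dependent log-coefficients could be precomputed outside Stan and passed in as data, so that only the z-dependent part is evaluated in the model (rough, untested sketch; all names are made up):

functions {
  // log of sum_j exp( log_coef[j] - (j - 1) * (log(2) + log(z)) ),
  // given the precomputed log Gamma-ratio coefficients for one order v
  real bessel_series_log(row_vector log_coef, real z) {
    int J = cols(log_coef);
    vector[J] terms;
    for (j in 1:J)
      terms[j] = log_coef[j] - (j - 1) * (log(2) + log(z));
    return log_sum_exp(terms);
  }
}
data {
  int<lower=1> V;          // largest order needed (e.g. the maximum observed count)
  int<lower=1> J;          // number of series terms kept
  matrix[V, J] log_coef;   // log Gamma-ratio coefficients, precomputed with lgamma in R
}

The series for order v would then be evaluated as bessel_series_log(log_coef[v], z).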

The Matérn covariances use the Bessel function. Although having all the options would be nice,
it’s good enough to test 1/2, 3/2, 5/2 and the squared exponential. I’m pretty sure that if
you make it fully flexible you’ll run into speed problems. That’s why I decided at the
time not to use Stan’s Bessel function.


They are often calculated as series/summations:

[formula: series representation of the modified Bessel function]

Would it be possible to truncate the summation (starting from a high j = v - N) with acceptable precision? For example, taking only N terms?

This would speed things up enormously for v >> 0.

EDIT: the truncation (if possible) might need to be reversed, I’m thinking. I will test tomorrow.

Hello @bgoodri, I am not sure if this is a feasible route, but I tried to approximate the summation part of the function by starting the summation from the s-th term (e.g., summing j over 960:1000 instead of 0:1000):

library(foreach)    # for %do%
library(magrittr)   # for %>%

# log of the series, starting the summation at term s instead of 0
approximated_modified_bessel_first_kind_log = function(z, v, s = 0)
  foreach(j = s:(v - 1), .combine = c) %do% {
    ((j + abs(v) - 0.5 + 1) %>% lgamma()) -
      (lgamma(j + 1) + lgamma(-j + abs(v) - 0.5 + 1)) -
      j * (log(2) + log(z))
  } %>% matrixStats::logSumExp()


> approximated_modified_bessel_first_kind_log(10, 1000, 0) 
[1] 4305.99
> approximated_modified_bessel_first_kind_log(10, 1000, 960) 
[1] 4305.99
> 
> system.time(approximated_modified_bessel_first_kind_log(10, 1000, 0) )
   user  system elapsed 
  0.542   0.000   0.539 
> system.time(approximated_modified_bessel_first_kind_log(10, 1000, 960) )
   user  system elapsed 
  0.035   0.000   0.036 

The question is:

  • Is this a good approach to bring down the execution time for v >> 0 (e.g., 100,000)?
  • If so, what precision should I aim for to be compatible with HMC/Stan?


Another approximation at the limit x → Inf is

[formula: asymptotic expansion of the modified Bessel function from DLMF §10.41]
https://dlmf.nist.gov/10.41

But it has an error of 0.000484 for x = 1000.

There’s another approx from an Intel guy:

In Stan you can always take the Bayesian approach: sample from an inverse Gaussian and then
use the draw as the rate input to the Poisson distribution. Then test it on your data and, if appropriate, optimise;
otherwise move on. Just my 2 cents.
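
A rough sketch of that latent-variable formulation (untested; the inverse Gaussian log density is written out by hand since Stan has no built-in, and the mean-1 / shape-1/tau parameterisation of the mixing distribution is an assumption made to match the marginal PIG):

functions {
  // inverse Gaussian log density with mean mu and shape lam
  real inv_gaussian_lpdf(real x, real mu, real lam) {
    return 0.5 * (log(lam) - log(2 * pi()) - 3 * log(x))
           - lam * square(x - mu) / (2 * square(mu) * x);
  }
}
parameters {
  matrix<lower=0>[N, G] theta;   // one latent multiplier per count
  // ... existing parameters ...
}
model {
  for (n in 1:N)
    for (g in 1:G) {
      theta[n, g] ~ inv_gaussian(1, inv(tau));                  // mean-1 mixing distribution
      counts[n, g] ~ poisson_log(lambda[g] + log(theta[n, g]));
    }
}

The obvious trade-off is the N * G extra latent parameters, so whether this ends up faster than the marginalised pmf would need to be tested.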
