I have built a **poisson_inverse_gaussian** lpmf. It is vectorised for intercept-only models.

`int[] ~ poisson_inverse_gaussian_log(real, real);`

And it works pretty well

```
real poisson_inverse_gaussian_log_lpmf(int[] y, real log_mu, real sigma) {
  int y_max = max(y);
  real sigma_mu_2_log = log(sigma) + log_mu + 0.6931472; // + log(2), i.e. log(2*sigma*mu)
  real sigma_mu_2_1_log = log1p_exp(sigma_mu_2_log);     // log(1 + 2*sigma*mu)
  real sigma_mu_2_1_sqrt_log = 0.5 * sigma_mu_2_1_log;   // log(sqrt(1 + 2*sigma*mu))
  vector[y_max + 1] p_arr_log;
  int y_plus_1[size(y)];

  for (i in 1:size(y)) y_plus_1[i] = y[i] + 1;

  // Build the table of log-pmf values by recursion
  p_arr_log[1] = 1.0 / sigma * (1 - exp(sigma_mu_2_1_sqrt_log));
  if (y_max > 0) p_arr_log[2] = log_mu - sigma_mu_2_1_sqrt_log + p_arr_log[1];
  if (y_max > 1) {
    real sigma_mu_2_log_sigma_mu_2_1_log = sigma_mu_2_log - sigma_mu_2_1_log;
    real two_log_mu_sigma_mu_2_1_log = 2 * log_mu - sigma_mu_2_1_log;
    for (y_dot in 2:y_max) {
      p_arr_log[y_dot + 1] =
        log_sum_exp(
          sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot],
          two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1]
        );
    }
  }
  return sum(p_arr_log[y_plus_1]);
}
```

However, in the case of linear models, I will have

`int[] ~ poisson_inverse_gaussian_log(vector, real);`

from

`int[] ~ poisson_inverse_gaussian_log(a + b*x, real);`

I am not sure how to further vectorise it. Otherwise it is practically unusable for building regression models.

P.S. This background on possible inference techniques might help to speed this up; however, it's too involved for me at this stage.

Any suggestion is welcome.

Thanks.


Try to avoid recursive `log_sum_exp` calls.

You can replace the `log_sum_exp` with `log(sum(exp(…)))`.

I hear the alarm bells ringing: numerically unstable, etc.

Not if you apply the log-sum-exp trick in your routine.

Since you have a mu for each log-lik, you know, depending on the *sign* of `mu`, which element will be the max in your log_sum_exp cascade.

https://www.xarg.org/2016/06/the-log-sum-exp-trick-in-machine-learning/
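For completeness, the trick from that link in a few lines of Python (an illustrative sketch, not Stan code):

```python
import math

def logsumexp(xs):
    # Factor out the max so the largest exponent becomes exp(0) = 1:
    # log(sum(exp(x))) = m + log(sum(exp(x - m))), with m = max(xs).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# The naive form math.log(math.exp(1000.0) + math.exp(1000.5)) overflows,
# but the stabilized version is fine:
stable = logsumexp([1000.0, 1000.5])
```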


Thanks Andre,

I am trying to grasp your suggestion. Do you mean this one?

```
p_arr_log[y_dot + 1] =
  log_sum_exp(
    sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot],
    two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1]
  );
```

What is the connection with using `log(sum(exp(…)))`?

My problem is that while I can save time in the case where I have **N** observations but **log_mu** and **sigma** are reals, with regression this is no longer true, so this iterative loop has to be calculated **N** times, once per observation-**log_mu** pair.

Are you saying that by avoiding the "recursion" (I cannot imagine how) I can do the calculation for N sigma_mu values at the same time?

The `log_sum_exp` in Stan applies the log-sum-exp trick. Just have a look at the source code of Stan.

The main problem is that exp(x) with a large x creates an overflow. So regarding your expression:

`log(1 - 3.0/(2*y_dot))`

with log(x) and x from 0, …, 1, this returns a small number.

Same with `- log(y_dot) - log(y_dot-1)`

With the log_sum_exp trick we can rewrite above expression to:

```
p_arr_log[1] + log_sum_exp(
  sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot] - p_arr_log[1],
  two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1] - p_arr_log[1]
);
```

and then replace the `log_sum_exp`:

```
p_arr_log[1] + log(
  exp(sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot] - p_arr_log[1]) +
  exp(two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1] - p_arr_log[1])
);
```

From there I would use Mathematica and see what a full simplification of the sum looks like.
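As a quick numeric check of the factoring step (a Python sketch of my own, with arbitrary values standing in for the Stan terms): pulling a reference term r out of the log-sum-exp leaves the result unchanged, which is what licenses the plain log/exp rewrite as long as the shifted exponents stay in range:

```python
import math

def logsumexp2(a, b):
    # Standard two-term log-sum-exp with the max factored out.
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Factoring an arbitrary reference term r (playing the role of
# p_arr_log[1]) out of the sum gives the same value, provided the
# shifted exponents exp(a - r), exp(b - r) neither overflow nor
# both underflow:
a, b, r = -700.2, -695.8, -690.0
factored = r + math.log(math.exp(a - r) + math.exp(b - r))
```

Here the naive `math.log(math.exp(a) + math.exp(b))` would underflow to log(0), while both the factored form and `logsumexp2` agree.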