I have built a poisson_inverse_gaussian lpmf. It is vectorised for intercept-only models:
int[] ~ poisson_inverse_gaussian_log(real, real);
And it works pretty well:
real poisson_inverse_gaussian_log_lpmf(int[] y, real log_mu, real sigma) {
  int y_max = max(y);
  // shared constants: log(2 * sigma * mu) and log(1 + 2 * sigma * mu)
  real sigma_mu_2_log = log(sigma) + log_mu + 0.6931472; // log(2)
  real sigma_mu_2_1_log = log1p_exp(sigma_mu_2_log);
  real sigma_mu_2_1_sqrt_log = 0.5 * sigma_mu_2_1_log;
  vector[y_max + 1] p_arr_log;
  int y_plus_1[size(y)];
  for (i in 1:size(y)) y_plus_1[i] = y[i] + 1;

  // build the table of log-probabilities up to y_max via the recurrence
  p_arr_log[1] = 1.0 / sigma * (1 - exp(sigma_mu_2_1_sqrt_log));
  if (y_max > 0) p_arr_log[2] = log_mu - sigma_mu_2_1_sqrt_log + p_arr_log[1];
  if (y_max > 1) {
    real sigma_mu_2_log_sigma_mu_2_1_log = sigma_mu_2_log - sigma_mu_2_1_log;
    real two_log_mu_sigma_mu_2_1_log = 2 * log_mu - sigma_mu_2_1_log;
    for (y_dot in 2:y_max) {
      p_arr_log[y_dot + 1] =
        log_sum_exp(
          sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot],
          two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1]
        );
    }
  }
  // each observation just looks up its entry in the table
  return sum(p_arr_log[y_plus_1]);
}
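For context, this is roughly how I call it in an intercept-only model (the priors here are just placeholders):

functions {
  // poisson_inverse_gaussian_log_lpmf() as defined above
}
data {
  int<lower=1> N;
  int<lower=0> y[N];
}
parameters {
  real log_mu;
  real<lower=0> sigma;
}
model {
  log_mu ~ normal(0, 5);   // placeholder prior
  sigma ~ normal(0, 2);    // placeholder prior
  y ~ poisson_inverse_gaussian_log(log_mu, sigma);
}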
However, in the case of linear models I will have
int[] ~ poisson_inverse_gaussian_log(vector, real);
from
int[] ~ poisson_inverse_gaussian_log(a + b*x, real);
I am not sure how to vectorise it further. Otherwise it is practically unusable for building regression models.
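Right now the only workaround I can think of is a scalar version called once per observation, roughly like this (just a sketch; poisson_inverse_gaussian_log_single_lpmf would be a hypothetical scalar helper with the same recursion as above, and a, b, x are the regression parameters and data):

model {
  vector[N] log_mu = a + b * x;   // linear predictor, one log_mu per observation
  for (n in 1:N)
    target += poisson_inverse_gaussian_log_single_lpmf(y[n] | log_mu[n], sigma);
}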
P.S. This background on possible inference techniques might help speed this up; however, it is too involved for me at this stage.
Any suggestion is welcome.
Thanks.
Try to avoid recursive log_sum_exp calls.
You can replace the log_sum_exp with log(sum(exp(…))).
I hear the alarm bells ringing: numerically unstable, etc.
Not if you apply the log-sum-exp trick in your routine.
Since you have mu for each log-lik, you know, depending on the sign of mu, which element will be the max in your log_sum_exp cascade.
https://www.xarg.org/2016/06/the-log-sum-exp-trick-in-machine-learning/
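Written out for two terms, the trick is just this (a small sketch; the function name is only illustrative):

// log(exp(a) + exp(b)) = m + log(exp(a - m) + exp(b - m)), with m = max(a, b)
real log_sum_exp_manual(real a, real b) {
  real m = fmax(a, b);
  return m + log(exp(a - m) + exp(b - m));
}

Subtracting the max guarantees that no exp() argument is positive, so nothing can overflow.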
Thanks Andre,
I am trying to grasp your suggestion. Do you mean this one?
p_arr_log[y_dot + 1] =
  log_sum_exp(
    sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot],
    two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1]
  );
What is the connection with using log(sum(exp(…))) instead?
My problem is that, while I can save time when I have N observations but log_mu and sigma are scalars, with regression this is no longer true, so this iterative loop has to be calculated for each observation / log_mu pair, N times.
Are you saying that by avoiding the “recursion” (I cannot imagine how) I could do the calculation for N sigma_mu pairs at the same time?
The log_sum_exp in Stan applies the log-sum-exp trick. Just have a look at the source code of Stan.
The main problem is that exp(x) with large x creates an overflow. So regarding your expression:
log(1-3.0/(2*y_dot))
log(x) with x between 0 and 1 returns a small (non-positive) number.
Same with - log(y_dot) - log(y_dot-1)
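For instance, at y_dot = 2 you get log(1 - 3.0/(2*2)) = log(0.25) ≈ -1.39 and -log(2) - log(1) ≈ -0.69, so these correction terms are modest negative numbers and cannot cause an overflow on their own.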
With the log-sum-exp trick we can rewrite the above expression as:
p_arr_log[1] + log_sum_exp(
  sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot] - p_arr_log[1],
  two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1] - p_arr_log[1]
);
and then replace the log_sum_exp:
p_arr_log[1] + log(
  exp(sigma_mu_2_log_sigma_mu_2_1_log + log(1 - 3.0 / (2 * y_dot)) + p_arr_log[y_dot] - p_arr_log[1]) +
  exp(two_log_mu_sigma_mu_2_1_log - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1] - p_arr_log[1])
);
From there I would use Mathematica and see what a full simplification of the sum looks like.
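To make it concrete, here is a rough, untested sketch of where this could go for the regression case: log_mu becomes a vector, each y_dot step updates all N observations at once with vector arithmetic, and the two-term log_sum_exp is written out by hand, here factoring out p_arr_log[y_dot] rather than p_arr_log[1] (the function name and the vectorised log1p_exp are assumptions):

real poisson_inverse_gaussian_log_v_lpmf(int[] y, vector log_mu, real sigma) {
  int N = size(y);
  int y_max = max(y);
  // per-observation constants, all vectors of length N
  vector[N] sigma_mu_2_log = log(sigma) + log_mu + log(2);
  vector[N] sigma_mu_2_1_log = log1p_exp(sigma_mu_2_log);  // assumes a vectorised log1p_exp
  vector[N] sigma_mu_2_1_sqrt_log = 0.5 * sigma_mu_2_1_log;
  vector[N] c1 = sigma_mu_2_log - sigma_mu_2_1_log;
  vector[N] c2 = 2 * log_mu - sigma_mu_2_1_log;
  vector[N] p_arr_log[y_max + 1];  // p_arr_log[k][n] = log P(y = k - 1) for observation n
  real lp = 0;

  p_arr_log[1] = 1.0 / sigma * (1 - exp(sigma_mu_2_1_sqrt_log));
  if (y_max > 0) p_arr_log[2] = log_mu - sigma_mu_2_1_sqrt_log + p_arr_log[1];
  for (y_dot in 2:y_max)
    // log_sum_exp written out by hand, factoring out p_arr_log[y_dot]:
    // the first exp() argument is always <= 0, the second is damped by the subtraction
    p_arr_log[y_dot + 1] = p_arr_log[y_dot]
      + log(exp(c1 + log(1 - 3.0 / (2 * y_dot)))
            + exp(c2 - log(y_dot) - log(y_dot - 1) + p_arr_log[y_dot - 1] - p_arr_log[y_dot]));

  // each observation picks its own entry from the table
  for (n in 1:N) lp += p_arr_log[y[n] + 1][n];
  return lp;
}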