Vectorized Implementations of Custom Distributions

Hello,
I wrote a custom Generalized Gaussian Distribution. It works fine and well, but is drastically slow, when compared to the provided normal distribution. Is there a way to make it faster, perhaps via a vectorized implementation?

functions {
  real generalized_gaussian_lpdf(real x, real mu, real scale, real beta){
    return log(beta) - log(2) - log(scale) - lgamma(1 / beta) - pow((fabs(x-mu) / scale), beta);
  }
}
1 Like

Make a real generalized_gaussian_vectorized_lpdf(vector x, vector mu, real scale, real beta) (assuming scale and beta is the same for all x, but mu is not).

Compute the things that are repeated for each x; e.g., lgamma(1/beta), log(scale), log(2), log(beta), etc.

Loop over x to accumulate lpdf into a real value.

I.e., precompute expensive and repeated things.

Something like this:

real generalized_gaussian_vectorized_lpdf(vector x, vector mu, real scale, real beta) {
  int N = num_elements(x);
  real log_beta = log(beta);
  real log_2 = log(2);
  real log_scale = log(scale);
  real lgbeta = lgamma(1 / beta);
  vector[N] xmu_scale = pow(fabs(x - mu) / scale, beta);
  real lpdf = 0.0;
  lpdf += N * (log_beta - log_2 - log_scale - lgbeta) - sum(xmu_scale);
  return(lpdf);
}

Note: I have not tested this, nor verified that this is the correct lpdf.

1 Like

@Stephen_Martin is spot on with an implementation recommendation, I’ll just add another angle: how do you measure speed? If you are looking only at time your whole model takes to run, the problem might not be the implementation, but the fact that the distribution results in posterior geometry that is hard to traverse. You can check this by comparing the actual treedepths and step sizes seen in the fits. An increase of 1 treedepth means that your whole model is evaluated twice as many times, so even a small-ish negative effect on exploration can easily overshadow most effects of implementation and I would look in this direction first.