Vectorizing _lpdf, _lcdf and the like by writing a custom function?

KariOlson · June 28, 2019, 9:09pm

I’m working on a model for sometimes censored data from sometimes truncated distributions. There is a lot of data and because it is collected discretized, it can be easily summarized as a value plus the number of times that value occurs. Even summarized, like this there is a lot of data so I wanted to apply vectorization. The below examples are not the full model, just a description of the kinds of things that can and can’t be done.

Unfortunately, a program like this is incorrect due to normal_lccdf returning the sum of the values rather than the vector of all of the values:

data {
  int N;
  vector[N] y;
  vecot[N] cnt;
}
parameters{
  real mu;
  real sigma;
}
model {
  target += cnt * normal_lccdf(y | mu, sigma);
}

When I correct the model with explicit for loops like this, it runs really slowly:

model {
  for (n in 1:N)
    target += cnt[n] * normal_lccdf(y[n] | mu, sigma);
}

Could I write a stan function that returns the vector of values in stead of the sum? I’ve seen a few examples of functions that return vectors, but those examples are typically built up with a for loop (which I assume wouldn’t help to vectorize the differentiation):

function{
  real my_function(y vector, mu real, sigma real){
    vector[nrows(y)] output;
    for (n in 1: nrows(y))
      output[n] = cnt[n] * normal_lcdf(y[n] | mu, sigma)
return(sum(output))
  }
}

I assume that, if this is possible, I’ll need to implement the underlying algebra directly in vectorized form. Any advice for doing this in a numerically robust way for the normal distribution?

KariOlson · June 30, 2019, 5:21pm

Well, whether a good idea or not, I tried and it seems to be working, maybe… I am having an issue with the larger application but it doesn’t seem to be the coding of these functions:

functions{
  vector normal_vlpdf(vector y, real mu, real sigma) {
    vector[num_elements(y)] z;
    vector[num_elements(y)] log_pdf;
    //pdf = 1/sqrt(2*pi)/sigma*exp(-1/2 * ((y - mu)/sigma)^2)
    //log(pdf) = log(1) - log(sqrt(2*pi)) - log(sigma) - 1/2 * ((y - mu)/sigma)^2
    //=0 - 0.5 * log(2+pi) - log(sigma) - 0.5 * ((y - mu)/sigma)^2
    //^2 doesn't vecttorize, there might be another way but x.*x works.
    //if one doesn't need the constant part, one can eliminate the 2pi term
    z = (y - mu) / sigma;
    log_pdf = - 0.5 * z .* z - log(sigma); //- 0.5 * log(2*pi());
    return (log_pdf);
  }
  vector normal_vlcdf(vector y, real mu, real sigma) {
    vector[num_elements(y)] z;
    vector[num_elements(y)] log_cdf;
    z = (y - mu) / sigma;
    //cdf = Phi_approx(z);
    //phi approx is an inverse_logit, one could implement with log_inv_logit(ploynomial)
    //log_cdf = log(Phi_approx(z));
    //log_cdf = log(Phi(z));
    return (log_cdf);
  }
  vector normal_vlccdf(vector y, real mu, real sigma) {
    vector[num_elements(y)] z;
    vector[num_elements(y)] log_ccdf;
    z = (y - mu) / sigma;
    //ccdf = 1-Phi_approx(z);
    //phi approx is an inverse_logit, one could implement with log1m_inv_logit(ploynomial)
    //log_ccdf = log1m(Phi_approx(z));
    log_ccdf = log1m(Phi(z));
    return (log_ccdf);
  }  
}

Topic		Replies	Views
How To Vectorize lccdf functions Modeling	1	406	April 6, 2022
Outputting lpdf functions as vectors instead of reals Developers	6	930	January 3, 2022
Vectorized cumulative normal distribution output General	2	337	December 31, 2020
Question about vectorization statement Modeling	3	551	August 21, 2017
Vectorising user-defined multivariate lpdf/lpmf functions Modeling performance , loo , paralellization	13	1339	November 7, 2020

Vectorizing _lpdf, _lcdf and the like by writing a custom function?

Related topics