# Vectorizing _lpdf, _lcdf and the like by writing a custom function?

I’m working on a model for sometimes censored data from sometimes truncated distributions. There is a lot of data and because it is collected discretized, it can be easily summarized as a value plus the number of times that value occurs. Even summarized, like this there is a lot of data so I wanted to apply vectorization. The below examples are not the full model, just a description of the kinds of things that can and can’t be done.

Unfortunately, a program like this is incorrect due to normal_lccdf returning the sum of the values rather than the vector of all of the values:

``````data {
int N;
vector[N] y;
vecot[N] cnt;
}
parameters{
real mu;
real sigma;
}
model {
target += cnt * normal_lccdf(y | mu, sigma);
}
``````

When I correct the model with explicit for loops like this, it runs really slowly:

``````model {
for (n in 1:N)
target += cnt[n] * normal_lccdf(y[n] | mu, sigma);
}
``````

Could I write a stan function that returns the vector of values in stead of the sum? I’ve seen a few examples of functions that return vectors, but those examples are typically built up with a for loop (which I assume wouldn’t help to vectorize the differentiation):

``````function{
real my_function(y vector, mu real, sigma real){
vector[nrows(y)] output;
for (n in 1: nrows(y))
output[n] = cnt[n] * normal_lcdf(y[n] | mu, sigma)
return(sum(output))
}
}
``````

I assume that, if this is possible, I’ll need to implement the underlying algebra directly in vectorized form. Any advice for doing this in a numerically robust way for the normal distribution?

Well, whether a good idea or not, I tried and it seems to be working, maybe… I am having an issue with the larger application but it doesn’t seem to be the coding of these functions:

``````functions{
vector normal_vlpdf(vector y, real mu, real sigma) {
vector[num_elements(y)] z;
vector[num_elements(y)] log_pdf;
//pdf = 1/sqrt(2*pi)/sigma*exp(-1/2 * ((y - mu)/sigma)^2)
//log(pdf) = log(1) - log(sqrt(2*pi)) - log(sigma) - 1/2 * ((y - mu)/sigma)^2
//=0 - 0.5 * log(2+pi) - log(sigma) - 0.5 * ((y - mu)/sigma)^2
//^2 doesn't vecttorize, there might be another way but x.*x works.
//if one doesn't need the constant part, one can eliminate the 2pi term
z = (y - mu) / sigma;
log_pdf = - 0.5 * z .* z - log(sigma); //- 0.5 * log(2*pi());
return (log_pdf);
}
vector normal_vlcdf(vector y, real mu, real sigma) {
vector[num_elements(y)] z;
vector[num_elements(y)] log_cdf;
z = (y - mu) / sigma;
//cdf = Phi_approx(z);
//phi approx is an inverse_logit, one could implement with log_inv_logit(ploynomial)
//log_cdf = log(Phi_approx(z));
//log_cdf = log(Phi(z));
return (log_cdf);
}
vector normal_vlccdf(vector y, real mu, real sigma) {
vector[num_elements(y)] z;
vector[num_elements(y)] log_ccdf;
z = (y - mu) / sigma;
//ccdf = 1-Phi_approx(z);
//phi approx is an inverse_logit, one could implement with log1m_inv_logit(ploynomial)
//log_ccdf = log1m(Phi_approx(z));
log_ccdf = log1m(Phi(z));
return (log_ccdf);
}
}
``````