I have a long likelihood calculation for a couple of integer features, tx and x, that have a lot of duplicate observations. For example, I may have 1000 rows where tx=0 and x=0. I found that I can speed up the calculation considerably by computing the log likelihood once for each feature pattern (e.g., tx=0, x=0) and multiplying it by the number of observations with that pattern. This also requires shrinking the vector parameters p and theta to match, so that they have one entry per feature pattern.
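Spelled out, the shortcut relies on the identity below. Here \ell stands for long_log_likelihood, and the symbols N_full (original number of rows), K (number of distinct patterns), and n_k (number of rows with pattern k, i.e. n_custs) are introduced only for illustration. It holds under the assumption that all rows sharing a pattern also share the same theta and p values, which is what the reduced parameter vectors imply:

```latex
\sum_{i=1}^{N_{\text{full}}} \ell(tx_i, x_i, \theta_i, p_i)
  = \sum_{k=1}^{K} n_k \, \ell(tx_k, x_k, \theta_k, p_k),
\qquad \sum_{k=1}^{K} n_k = N_{\text{full}}.
```

On the right-hand side, the index k runs over patterns rather than rows, so tx_k, x_k, theta_k, and p_k denote the values shared by every row in pattern k.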
As an illustration, I converted something like the following model block, which takes data with a large N:
model {
  p ~ beta(alpha, beta);       // vector<lower=0, upper=1>[N] p;
  theta ~ beta(gamma, delta);  // vector<lower=0, upper=1>[N] theta;
  for (n in 1:N) {             // where N is big
    real ll_lse;
    ll_lse = long_log_likelihood(tx[n], x[n], theta[n], p[n]);
    target += ll_lse;
  }
}
to the following (with the line ll_lse *= n_custs[n]; added):
model {
  p ~ beta(alpha, beta);
  theta ~ beta(gamma, delta);
  // where now, N is small and sum(n_custs) == previous N
  for (n in 1:N) {
    real ll_lse;
    ll_lse = long_log_likelihood(tx[n], x[n], theta[n], p[n]);
    ll_lse *= n_custs[n];
    target += ll_lse;
  }
}
While I verified that target gets incremented by the same amount overall in both models from this block, I’m worried that I may not be fully accounting for the data transformation in the p and theta sampling statements.
I tried replacing p ~ beta(alpha, beta) with target += beta_lpdf(p | alpha, beta) .* n_custs, but it appears that beta_lpdf returns a real (the sum over all elements) instead of a vector.
I can un-vectorize it in a for-loop:
for (n in 1:N) {
  target += beta_lpdf(p[n] | alpha, beta) * n_custs[n];
  target += beta_lpdf(theta[n] | gamma, delta) * n_custs[n];
}
but since I can’t print out the ‘target’ variable, I’m not sure whether these programs are equivalent. Is there another factor I need to take into account for the model to run correctly on the reduced data?
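One possible way to compare the two forms, assuming a reasonably recent Stan version in which the target() function is available in the model block, is to print the change in the accumulated log density around the prior statement. The sketch below is not the real model: it covers only p, treats alpha and beta as fixed data (an assumption), and declares n_custs as a vector. The weighted sum is only computed and printed, not added to target, so the prior is not counted twice.

```stan
data {
  int<lower=1> N;               // number of distinct feature patterns
  vector<lower=0>[N] n_custs;   // observations per pattern
  real<lower=0> alpha;          // assumed fixed here and passed as data
  real<lower=0> beta;
}
parameters {
  vector<lower=0, upper=1>[N] p;
}
model {
  // target() already contains the Jacobian terms for the constrained p,
  // so only differences relative to this starting value are compared.
  real before = target();
  real weighted = 0;

  // Weighted, un-vectorized version: computed for inspection only.
  for (n in 1:N) {
    weighted += n_custs[n] * beta_lpdf(p[n] | alpha, beta);
  }

  // Vectorized version: beta_lpdf sums over all N elements and returns a real.
  target += beta_lpdf(p | alpha, beta);

  print("unweighted prior increment: ", target() - before);
  print("weighted prior sum (not added): ", weighted);
}
```

Note that print runs on every log density evaluation, so this is best tried with very few iterations. Also, p ~ beta(alpha, beta) drops additive constants while target += beta_lpdf(...) keeps them, so printed values are only comparable when both versions use the same form.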