# Optimizing Zero-Inflated Poisson GLM for vectorization/GPU

I am running a rather large ZIP GLM model indexed over time (19 years) and location (approx 5000), and am wondering if there are ways that the likelihood can be re-written to take advantage of vectorization and/or speed-ups with OpenCL GPU support.

My model is similar to the example from the docs (below is a simplification of my code, which also includes spatial effects):

``````
data {
  int<lower=0> N;          // Number of locations
  int<lower=0> T;          // Number of years
  int<lower=0> y[T,N];     // Outcome
  real x[N];               // Covariate
}
parameters {
  real alpha_theta;        // Intercept, bernoulli
  real alpha_lambda;       // Intercept, poisson
  real beta_theta;         // Coefficient, bernoulli
  real beta_lambda;        // Coefficient, poisson
  real sigma_theta[T,N];   // Error term, bernoulli
  real sigma_lambda[T,N];  // Error term, poisson
}
transformed parameters {
  real theta[T,N];         // Bernoulli param
  real lambda[T,N];        // Poisson param
  for (t in 1:T) {
    // GLM eqns
    theta[t] = alpha_theta + beta_theta * x[t] + sigma_theta[t];
    lambda[t] = alpha_lambda + beta_lambda * x[t] + sigma_lambda[t];
  }
}
model {
  alpha_theta ~ std_normal();
  alpha_lambda ~ std_normal();
  beta_theta ~ std_normal();
  beta_lambda ~ std_normal();
  for (t in 1:T) {
    sigma_theta[t] ~ std_normal();
    sigma_lambda[t] ~ std_normal();
    for (n in 1:N) {
      if (y[t,n] == 0)
        target += log_sum_exp(bernoulli_logit_lpmf(1 | theta[t,n]),
                              bernoulli_logit_lpmf(0 | theta[t,n])
                              + poisson_log_lpmf(y[t,n] | lambda[t,n]));
      else
        target += bernoulli_logit_lpmf(0 | theta[t,n])
                  + poisson_log_lpmf(y[t,n] | lambda[t,n]);
    }
  }
}
``````

The documentation gives a neat example of pre-processing all of the bernoulli parts for the 0-count observations, to the tune of:

``````
target += N_zero
          * log_sum_exp(bernoulli_lpmf(1 | theta),
                        bernoulli_lpmf(0 | theta)
                        + poisson_lpmf(0 | lambda));
target += N_nonzero * bernoulli_lpmf(0 | theta);
target += poisson_lpmf(y_nonzero | lambda);
``````

How could my double-indexed example be extended to take advantage of the vectorization in the second snippet? I guess the hard part is getting the right indices of theta, lambda, and y to match up. Ideally I could maybe use the `bernoulli_glm_lpmf()` family of functions. Would appreciate any help here… thanks!
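For intuition about why the docs' collapsed form works: when `theta` and `lambda` are shared across observations, every zero-count observation contributes the identical mixture term, so the sum collapses to `N_zero` times one `log_sum_exp`. A minimal numpy sketch (toy values, not the real model; `pois_lpmf` and `log_sum_exp` are hand-rolled helpers) checking that the loop form and the collapsed form agree:

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)

def pois_lpmf(y, lam):
    # Poisson log-pmf: y*log(lam) - lam - log(y!)
    return y * np.log(lam) - lam - np.array([lgamma(v + 1) for v in np.atleast_1d(y)])

def log_sum_exp(a, b):
    m = np.maximum(a, b)
    return m + np.log(np.exp(a - m) + np.exp(b - m))

theta, lam = 0.3, 2.0                                 # shared mixing prob and rate
y = rng.poisson(lam, 50) * (rng.random(50) > theta)   # toy ZIP draws

# Per-observation ZIP log-likelihood (the loop form)
loop_ll = 0.0
for v in y:
    if v == 0:
        loop_ll += log_sum_exp(np.log(theta),
                               np.log1p(-theta) + pois_lpmf(np.array([0]), lam)[0])
    else:
        loop_ll += np.log1p(-theta) + pois_lpmf(np.array([v]), lam)[0]

# Collapsed form from the docs snippet
zeros, nonzeros = y[y == 0], y[y != 0]
coll_ll = (len(zeros) * log_sum_exp(np.log(theta),
                                    np.log1p(-theta) + pois_lpmf(np.array([0]), lam)[0])
           + len(nonzeros) * np.log1p(-theta)
           + pois_lpmf(nonzeros, lam).sum())

assert np.isclose(loop_ll, coll_ll)
```

The catch for the double-indexed model is exactly that `theta` and `lambda` are *not* shared across observations, so the `N_zero *` shortcut no longer applies and only the gather-by-index part carries over.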


Well, I now see the following warning against doing exactly this:

*Stan User's Guide, 5.4 Vectorizing mixtures*


Before considering OpenCL I’d optimize the model.

`if (y[t,n] == 0)`

can be replaced by arrays of integer indices. Then we can get rid of the loop too.
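A small numpy sketch of the index-array idea (toy data, hypothetical names): the zero/nonzero index arrays are built once, the way a `transformed data` block would, and then each likelihood term becomes a single gather instead of a branch inside the loop.

```python
import numpy as np

y = np.array([[0, 3, 0, 1],
              [2, 0, 0, 5]])        # toy [T, N] outcome
T, N = y.shape

# One pass over the data builds the index lists, mirroring what a
# transformed data block would precompute.
zero_idx    = [np.flatnonzero(y[t] == 0) for t in range(T)]
nonzero_idx = [np.flatnonzero(y[t] != 0) for t in range(T)]

lam = np.full((T, N), 2.0)          # toy Poisson means

# Branch-free gathers replace `if (y[t,n] == 0)` inside the loop:
for t in range(T):
    lam_zero    = lam[t, zero_idx[t]]      # all zero-count sites at once
    lam_nonzero = lam[t, nonzero_idx[t]]   # all nonzero sites at once
    assert len(lam_zero) + len(lam_nonzero) == N
```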

Both are independent, so you can write them with
`poisson_log_glm` and `bernoulli_logit_glm` statements, for which
OpenCL variants exist.
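The reason the `_glm` forms are drop-in replacements is that they compute the same log-density as summing the per-observation `lpmf` calls over a linear predictor, just in one fused (and GPU-eligible) call. A hedged numpy check of that identity for the Poisson-log case (toy data; the helpers here only mimic the Stan functions):

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 1))          # toy design matrix
alpha, beta = 0.2, np.array([0.7])
eta = alpha + X @ beta               # linear predictor on the log scale
y = rng.poisson(np.exp(eta))

# poisson_log_glm-style: one vectorized evaluation of the whole likelihood
glm_ll = np.sum(y * eta - np.exp(eta) - np.array([lgamma(v + 1) for v in y]))

# poisson_log_lpmf-style: one call per observation, summed in a loop
loop_ll = sum(y[i] * eta[i] - np.exp(eta[i]) - lgamma(y[i] + 1)
              for i in range(len(y)))

assert np.isclose(glm_ll, loop_ll)
```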

This statement could be vectorized by expanding the distribution functions instead of
calling them, and if needed applying the log-sum-exp max trick, see LogSumExp - Wikipedia
for details. Depending on the number of zero observations in your data, which is usually
smaller than the rest, this may not be worth the optimization.
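The max trick referred to here: factor the largest exponent out of `log(exp(a) + exp(b))` so that the remaining exponentials never overflow. A minimal sketch (the two-argument helper is an assumption, matching how Stan's binary `log_sum_exp` behaves):

```python
import numpy as np

def log_sum_exp(a, b):
    # Max trick: pull out the larger exponent so np.exp never overflows.
    m = np.maximum(a, b)
    return m + np.log(np.exp(a - m) + np.exp(b - m))

# Naive np.log(np.exp(a) + np.exp(b)) overflows for these inputs;
# the stabilized version returns 1000 + log(1 + e^-1) ≈ 1000.3133.
a, b = np.array([1000.0]), np.array([999.0])
print(log_sum_exp(a, b))
```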

Unfortunately we have many zeros in the dataset so both will need to be optimized. Here is what I came up with:

``````
data {
  int<lower=0> N;          // Number of locations
  int<lower=0> T;          // Number of years
  int<lower=0> y[T,N];     // Outcome
  real x[N];               // Covariate
}
transformed data {
  int zero_idx[T,N];       // Indices of zero counts
  int zero_max[T];         // Number of zero counts per year
  int nonzero_idx[T,N];    // Indices of nonzero counts
  int nonzero_max[T];      // Number of nonzero counts per year

  // Loop through the y data and tally zeros and nonzeros appropriately
  for (t in 1:T) {
    zero_max[t] = 0;
    nonzero_max[t] = 0;
    for (n in 1:N) {
      if (y[t,n] == 0) {
        zero_max[t] += 1;
        zero_idx[t, zero_max[t]] = n;
      } else {
        nonzero_max[t] += 1;
        nonzero_idx[t, nonzero_max[t]] = n;
      }
    }
  }
}
parameters {
  real alpha_theta;        // Intercept, bernoulli
  real alpha_lambda;       // Intercept, poisson
  real beta_theta;         // Coefficient, bernoulli
  real beta_lambda;        // Coefficient, poisson
  real sigma_theta[T,N];   // Error term, bernoulli
  real sigma_lambda[T,N];  // Error term, poisson
}
transformed parameters {
  real theta[T,N];         // Bernoulli param
  real lambda[T,N];        // Poisson param
  for (t in 1:T) {
    // GLM eqns
    theta[t] = alpha_theta + beta_theta * x[t] + sigma_theta[t];
    lambda[t] = alpha_lambda + beta_lambda * x[t] + sigma_lambda[t];
  }
}
model {
  vector[N] theta_inv_logit;
  vector[N] lambda_exp;

  alpha_theta ~ std_normal();
  alpha_lambda ~ std_normal();
  beta_theta ~ std_normal();
  beta_lambda ~ std_normal();
  for (t in 1:T) {
    sigma_theta[t] ~ std_normal();
    sigma_lambda[t] ~ std_normal();

    theta_inv_logit = inv_logit(to_vector(theta[t]));
    lambda_exp = exp(to_vector(lambda[t]));

    // Zeros
    target += sum(log(
      theta_inv_logit[zero_idx[t, 1:zero_max[t]]]
      + (1 - theta_inv_logit[zero_idx[t, 1:zero_max[t]]])
        .* exp(-lambda_exp[zero_idx[t, 1:zero_max[t]]])
    ));

    // Nonzeros
    target += bernoulli_lpmf(
        rep_array(0, nonzero_max[t]) |
        theta_inv_logit[nonzero_idx[t, 1:nonzero_max[t]]]
      ) + poisson_lpmf(
        y[t, nonzero_idx[t, 1:nonzero_max[t]]] |
        lambda_exp[nonzero_idx[t, 1:nonzero_max[t]]]
      );
  }
}
``````
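The direct zero-count expression above, `log(theta + (1 - theta) * exp(-lambda))` on the natural scale, should agree with the `log_sum_exp` mixture in the original loop. A numpy sanity check of that algebra (hypothetical toy values for the linear predictors; the `lpmf` identities in the comments are standard):

```python
import numpy as np

def log_sum_exp(a, b):
    m = np.maximum(a, b)
    return m + np.log(np.exp(a - m) + np.exp(b - m))

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

eta_theta  = np.array([-1.0, 0.5, 2.0])   # logit-scale Bernoulli params
eta_lambda = np.array([ 0.3, 1.0, -0.5])  # log-scale Poisson params

# Direct form used in the rewritten model block
p = inv_logit(eta_theta)
direct = np.log(p + (1 - p) * np.exp(-np.exp(eta_lambda)))

# log_sum_exp form from the original loop, using:
#   bernoulli_logit_lpmf(1 | x) = -log1p(exp(-x))
#   bernoulli_logit_lpmf(0 | x) = -log1p(exp(x))
#   poisson_log_lpmf(0 | eta)   = -exp(eta)
mixture = log_sum_exp(-np.log1p(np.exp(-eta_theta)),
                      -np.log1p(np.exp(eta_theta)) - np.exp(eta_lambda))

assert np.allclose(direct, mixture)
```

Note that the direct form loses the underflow protection of `log_sum_exp`, which is fine for moderate predictors but can matter for extreme ones.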

Before doing further optimizations, some parts need some rework.

However, the indexing doesn't match: `x` is defined with length N, but the loops use `x[t]` with t = 1,…,T.

This assigns every observation [T,N] its own parameter. The error term is usually absorbed into the distribution function instead.


Yes, you are right on both counts. I did a sloppy job of simplifying the code from my true script, where these issues aren't present. I'll correct the above example in case anyone comes across it in the future.
