I am running a rather large ZIP GLM model indexed over time (19 years) and location (approx 5000), and am wondering if there are ways that the likelihood can be re-written to take advantage of vectorization and/or speed-ups with OpenCL GPU support.
My model is similar to the example from the docs (below is a simplification of my code, which also includes spatial effects):
data {
int<lower=0> N; // Number of locations
int<lower=0> T; // Number of years
int<lower=0> y[N,T]; // Outcome
real x[N]; // Covariate
}
parameters {
real alpha_theta; // Intercept, bernoulli
real alpha_lambda // intercept, poisson
real beta_theta; // Coefficient, bernoulli
real beta_lambda; // Coefficient, poisson
real sigma_theta[T,N]; // Error term, bernoulli
real sigma_lambda[T,N]; // Error term, poisson
}
transformed parameters{
real theta[T,N]; // Bernoulli param
real lambda[T,N]; // Poisson param
for (t in 1:T) {
// GLM eqns
theta[t] = alpha_theta + beta_theta * x[t] + sigma_theta[t];
lambda[t] = alpha_lambda + beta_lambda * x[t] + sigma_lambda[t];
}
}
model {
alpha_theta ~ std_normal();
alpha_lambda ~ std_normal();
beta_theta ~ std_normal();
beta_lambda ~ std_normal();
for (n in 1:T) {
sigma_theta[t] ~ std_normal();
sigma_theta[t] ~ std_normal();
for (t in 1:N){
if (y[t,n] == 0)
target += log_sum_exp(bernoulli_logit_lpmf(1 | theta[t,n]),
bernoulli_logit_lpmf(0 | theta[t,n])
+ poisson_log_lpmf(y[n] | lambda[t,n]));
else
target += bernoulli_logit_lpmf(0 | theta[t,n])
+ poisson_log_lpmf(y[t,n] | lambda[t,n]);
}
}
}
The documentation gives a neat example of pre-processing all of the bernoulli parts for the 0-count observations, to the tune of:
target
+= N_zero
* log_sum_exp(bernoulli_lpmf(1 | theta),
bernoulli_lpmf(0 | theta)
+ poisson_lpmf(0 | lambda));
target += N_nonzero * bernoulli_lpmf(0 | theta);
target += poisson_lpmf(y_nonzero | lambda);
How could my double-indexed example be extended to take advantage of the vectorization in the second snipped? I guess the hard part is getting the right indices of theta, lambda, and y to match up. Maybe would be ideal if I could use the bernoulli_glm_lpmf()
family of functions. Would appreciate any help here… thanks!