# Understanding performance expectations of `_glm` distributions

Howdy,

I’m working on modeling star ratings (1-5 scale) with an item response theory style model, with effects for individual reviewers and for the titles being reviewed. My full dataset is rather large (~160k reviews across ~10k titles and ~45k reviewers), so I’ve been particularly focused on optimizing this model. I’ve primarily been sampling against a vectorized `ordered_logistic` likelihood, which has worked pretty well, but I’m trying to understand whether there are any other modifications I can make to improve performance.

In particular, I had the thought that using the `_glm` variant of the ordered logistic likelihood would help, but my initial naive attempts seemed to perform worse; I’m assuming that’s because my data doesn’t naturally fit the expected format. My current likelihood is:

```
y ~ ordered_logistic(eta[g] - gamma[r], c);
```

where `g` and `r` are index arrays (see the full model code below). I can do the trivial glm translation and write:

```
y ~ ordered_logistic_glm(rep_row_vector(1, N), eta[g] - gamma[r], c);
```

and get something that runs, but performs worse. Can I just assume that, given the kind of parameters I have in this model, the `_glm` version gives me no benefit?

Other than that, does anyone have any suggestions on how this model could be sped up at all?

Appreciate any insight!

```
functions {
  real induced_dirichlet_lpdf(vector c, vector alpha, real phi) {
    int K = num_elements(c) + 1;
    vector[K - 1] sigma = inv_logit(phi - c);
    vector[K] p;
    matrix[K, K] J = rep_matrix(0, K, K);

    // Induced ordinal probabilities
    p[1] = 1 - sigma[1];
    for (k in 2:(K - 1))
      p[k] = sigma[k - 1] - sigma[k];
    p[K] = sigma[K - 1];

    // Baseline column of Jacobian
    for (k in 1:K) J[k, 1] = 1;

    // Diagonal entries of Jacobian
    for (k in 2:K) {
      real rho = sigma[k - 1] * (1 - sigma[k - 1]);
      J[k, k] = -rho;
      J[k - 1, k] = rho;
    }

    return dirichlet_lpdf(p | alpha)
           + log_determinant(J);
  }
}
data {
  int<lower=0> N;                   // number of observations
  int<lower=1> G;                   // number of titles
  int<lower=1> R;                   // number of reviewers
  array[N] int<lower=1, upper=5> y; // ratings
  array[N] int<lower=1, upper=G> g; // title index
  array[N] int<lower=1, upper=R> r; // reviewer index
}
parameters {
  real mu_eta;
  real<lower=0> tau;
  vector[G] eta_raw;
  vector[R] gamma;
  ordered[4] c;
}
transformed parameters {
  vector[G] eta;
  eta = mu_eta + tau * eta_raw;
}
model {
  mu_eta ~ normal(0, 1);
  tau ~ normal(0, 2);
  gamma ~ normal(0, 2);
  eta_raw ~ normal(0, 1);
  c ~ induced_dirichlet(rep_vector(1, 5), 0);

  y ~ ordered_logistic(eta[g] - gamma[r], c);
}
```

`_glm` functions provide a speedup if the matrix `x` and the vector `beta` are big. Your `x` is a row vector (i.e., just a one-row matrix), and as it is all 1’s it is not doing anything, so in this case using `_glm` just adds an additional unnecessary multiplication by 1’s.

If your `ordered_logistic()` first term were in the form `x * beta` with a big `x` and a big `beta` (where the elements of `beta` are not all 1), then you could get a speedup.
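For example, a minimal sketch of the kind of model where the `_glm` form pays off (the covariates here are hypothetical, just to show the shape): `x` is a large N x K data matrix and `beta` is a short parameter vector of length K:

```
data {
  int<lower=0> N;                        // many observations
  int<lower=1> K;                        // few covariates, K << N
  matrix[N, K] x;                        // covariates are data, not parameters
  array[N] int<lower=1, upper=5> y;
}
parameters {
  vector[K] beta;
  ordered[4] c;
}
model {
  beta ~ normal(0, 1);
  // plus your induced_dirichlet prior on c as in your model
  // x is data and beta has only K entries, so the _glm function can use
  // analytic gradients and the autodiff expression tree stays small
  y ~ ordered_logistic_glm(x, beta, c);
}
```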

The biggest speedup in `_glm()` comes from reducing the size of the autodiff expression tree. For example, if `x` is data and `beta` is a parameter vector, then only the length of the `beta` vector affects the size of the autodiff expression tree, and hence how much speedup `_glm` provides.
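To make concrete why your model doesn’t fit that pattern: the only way to write `eta[g] - gamma[r]` as `x * beta` is with an N x (G + R) indicator design matrix, something like the sketch below (for illustration only, definitely not something to run at ~160k x ~55k):

```
transformed data {
  // row n has a 1 in column g[n] and a -1 in column G + r[n], so that
  // x[n] * append_row(eta, gamma) == eta[g[n]] - gamma[r[n]]
  matrix[N, G + R] x = rep_matrix(0, N, G + R);
  for (n in 1:N) {
    x[n, g[n]] = 1;
    x[n, G + r[n]] = -1;
  }
}
model {
  // x is data, but it is huge and almost entirely zeros; the dense
  // matrix-vector multiply costs far more than the N indexed lookups in
  // ordered_logistic(eta[g] - gamma[r], c)
  y ~ ordered_logistic_glm(x, append_row(eta, gamma), c);
}
```

So with purely index-based effects like `eta[g] - gamma[r]`, the plain vectorized `ordered_logistic` with integer indexing is already the right formulation.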

Awesome, thanks for the clarification, Aki! I assumed it was something in this vein, but the confirmation is helpful.
