Howdy,

I’m working on modeling star ratings (1-5 scale) with an item response theory style model, with effects for individual reviewers and titles being reviewed. My full dataset is rather large ~160k reviews across ~10k titles, and ~45k reviewers, so I’ve been particularly focused on trying to understand optimizing this model. I’ve primarily been sampling against a vectorized `ordered_logistic`

function, which has worked *pretty good*, but I’m trying to understand if there are any other modifications I can make to improve performance.

In particular, I had the thought that trying to use the `_glm`

variant of the ordered logistic likelihood would help out, but in my initial naive attempts it seemed to be worse performance – I am assuming because my data doesn’t naturally fit into the format; my currently likelihood is:

```
y ~ ordered_logistic(eta[g] - gamma[r], c);
```

where `g`

and `r`

are index arrays (see the full model code below). I can do the trivial glm translation and write:

```
y ~ ordered_logistic_glm(rep_row_vector(1, N), eta[g] - gamma[r], c);
```

and get something that works, but worse. Can I just assume that because of the kind of parameters I have in this model that the `glm`

model gives me no benefit?

Other than that, does anyone have any suggestions on how this model could be sped up at all?

Appreciate any insight!

```
functions {
real induced_dirichlet_lpdf(vector c, vector alpha, real phi) {
int K = num_elements(c) + 1;
vector[K - 1] sigma = inv_logit(phi - c);
vector[K] p;
matrix[K, K] J = rep_matrix(0, K, K);
// Induced ordinal probabilities
p[1] = 1 - sigma[1];
for (k in 2:(K - 1))
p[k] = sigma[k - 1] - sigma[k];
p[K] = sigma[K - 1];
// Baseline column of Jacobian
for (k in 1:K) J[k, 1] = 1;
// Diagonal entries of Jacobian
for (k in 2:K) {
real rho = sigma[k - 1] * (1 - sigma[k - 1]);
J[k, k] = - rho;
J[k - 1, k] = rho;
}
return dirichlet_lpdf(p | alpha)
+ log_determinant(J);
}
}
data {
int<lower=0> N; // number of observations
int<lower=1> G; // number of titles
int<lower=1> R; // number of reviewers
array[N] int<lower=1, upper=5> y; // ratings
array[N] int<lower=1, upper=G> g; // title index
array[N] int<lower=1, upper=R> r; // reviewer index
}
parameters {
real mu_eta;
real<lower=0> tau;
vector[G] eta_raw;
vector[R] gamma;
ordered[4] c;
}
transformed parameters {
vector[G] eta;
eta = mu_eta + tau*eta_raw;
}
model {
mu_eta ~ normal(0, 1);
tau ~ normal(0, 2);
gamma ~ normal(0, 2);
eta_raw ~ normal(0, 1);
c ~ induced_dirichlet(rep_vector(1, 5), 0);
y ~ ordered_logistic(eta[g] - gamma[r], c);
}
```