# How can I convert my model from `ordered_logistic` to `ordered_logistic_glm`?

I noticed that this OpenCL GPU routine for `ordered_logistic` (`ordered_logistic_glm_lpmf`) was added to stan-dev somewhat recently (https://github.com/stan-dev/math/pull/1252) and seems to promise some large performance gains.

Can this somehow be used as a drop-in replacement for `ordered_logistic`? Our current model uses the `ordered_logistic` function and looks something like this:

``````
for (i in 1:N) {
  X = ...;
  beta = ...;
  y[i] ~ ordered_logistic(X * beta, cuts);
}
``````

I tried translating this model to use the `_glm` function like so: I basically put all the data into vectors and used those as inputs:

``````
transformed parameters {
  ...

  matrix[N, 1] X;
  vector[N] beta;

  for (i in 1:N) {
    X[i, 1] = ...;
    beta[i] = ...;
  }
}

model {
  target += ordered_logistic_glm_lpmf(y | X, beta, cuts);
}
``````

However, the inputs fail the bounds checks:

``````
Exception: ordered_logistic_glm_lpmf: Weight vector has dimension = 3087, expecting dimension = 1; a function was called with arguments of different scalar, array, vector, or matrix types, and they were not consistently sized; all arguments must be scalars or multidimensional values of the same shape.
``````

I guess it expects `X` to be of size N x k, where k is the length of the weight vector, but I don't fully understand why. Why do I need N x k weights to fit this model when I only needed k weights to fit `ordered_logistic`?

Should I maybe use an X where each row is my weight vector, repeated N times?

Thanks!

Jeff

Hi Jeff,

These GLMs are part of CmdStan 2.22. The following signatures are currently supported:

``````
real ordered_logistic_glm_lpmf(int, row_vector, vector, vector)
real ordered_logistic_glm_lpmf(int, matrix, vector, vector)
real ordered_logistic_glm_lpmf(int[], row_vector, vector, vector)
real ordered_logistic_glm_lpmf(int[], matrix, vector, vector)
``````

This still has to be added to the Stan docs, together with the recently added `categorical_logit_glm_lpmf`.

As for the specific error, I am going to tag @tadej, who implemented this in Stan Math.

Thanks! Yes, I am using CmdStan 2.22 as you say.

@jeff-predictwise Your first code snippet suggests you are trying to predict a vector `y` of length `N`. To do that using `ordered_logistic` you need a vector of weights `beta`, which stays the same across iterations, and in each of the `N` iterations a vector of observations `X` of the same length `k` as `beta`.

If you want to do the same with `ordered_logistic_glm_lpmf`, you need the same vector `beta` and have to place all the `X`-es into a matrix with dimensions `N` x `k`. Also, if `X` does not depend on any weights, it would be more efficient to construct it in the `transformed data` block.
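A minimal sketch of what that looks like end-to-end (the data block names here are assumptions, not taken from your model):

```stan
data {
  int<lower=2> K;                // number of outcome categories
  int<lower=0> N;                // number of observations
  int<lower=1> k;                // number of predictors
  matrix[N, k] X;                // design matrix: one row per observation
  int<lower=1, upper=K> y[N];    // observed categories
}
parameters {
  vector[k] beta;                // one weight per predictor, shared by all rows
  ordered[K - 1] cuts;           // cutpoints
}
model {
  // Equivalent to: for (i in 1:N) y[i] ~ ordered_logistic(X[i] * beta, cuts);
  target += ordered_logistic_glm_lpmf(y | X, beta, cuts);
}
```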

Your second snippet shows you want to have one dimension of `X` equal to 1. If you have a single weight, then `beta` should also have length 1. If `y` has length 1, or you want to reuse the same values of `X` for every `y`, you just have to transpose `X` into `matrix[1, N] X;`.

EDIT: actually, the last two options probably won't work with a matrix. You need `row_vector[N] X;`.

Hope that helps. If not, you will need to be more precise about what exactly you want to do.


To clarify: in the original model, each `y` was predicted by a single value of `X` and a single value of `beta`. Sorry, I think I didn't make that clear.

Your suggestion to use `row_vector` instead of a matrix seems to work (it passes the bounds check, at least). Let me try running with that and see whether the results of this vs. the non-GLM model are the same.

Thanks,

Jeff

If that is the case, I think you need to use `matrix[N, 1] X` and `vector[1] beta`.


If I read `ordered_logistic_glm_lpmf` correctly, its input form

``````
ordered_logistic_glm_lpmf(y, x, beta, cuts)
``````

does not allow for non-linear relationships between `x` and `beta`. What I am looking for is a "random effects" model, i.e. in pseudo-code:

``````
for (i in 1:nrow()) {
  eta = coefficient1[x1[i]] + coefficient2[x2[i]];
  y[i] ~ ordered_logistic(eta, cutpoints);
}
``````

Based on reading the documentation, the question is whether this type of random-effects model can be implemented with `ordered_logistic_glm_lpmf` at all. Thank you!

You could do that, but you'd have to roll `eta` into `mu`, set `X` to be zero-dimensional, and make the predictor coefficient zero-dimensional as well.

It doesn't look like we've vectorized the ordered logistic, or you'd just be able to write this as:

``````y ~ ordered_logistic(coeff[x1] + coeff[x2], cutpoints);
``````

Also, you don't need to define intermediates. This can just be:

``````
for (n in 1:size(y))
  y[n] ~ ordered_logistic(coeff[x1[n]] + coeff[x2[n]], cutpoints);
``````

The ordered logistic has been vectorised:

``````
Available argument signatures for ordered_logistic:

int ~ ordered_logistic(real, vector)
int[ ] ~ ordered_logistic(vector, vector)
int[ ] ~ ordered_logistic(vector, vector[ ])
``````

Just be aware that the current versions of RStan and PyStan (2.19.x) have a bug in the vectorised code that causes very poor performance (more information and a fix for RStan here). The bug has been fixed in newer versions (2.20 and up), so CmdStanR and CmdStanPy can be used instead.

Bob, that was my intuition as well. I rewrote the model such that:

``````
target += ordered_logistic_glm_lpmf(%response_name% | eta_%response_name%, weight, c_%response_name%);
``````

where `eta_%response_name%` is a matrix of size (N, 1) derived from the same random effects over a number of responses, and `weight` is a constant weight vector of 1. The speed-up of the model is remarkable, but we are still investigating whether this is due to the use of `ordered_logistic_glm_lpmf` per se (according to our logs, the GPU, while working, does very little work). Will keep you posted, also in our separate email chain with Andrew and David. Thx
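For concreteness, the constant-weight trick amounts to something like this sketch (the `coefficient1`/`coefficient2`/`x1`/`x2` names are carried over from the pseudo-code above; the surrounding declarations are assumed):

```stan
transformed parameters {
  // Precompute the non-linear predictor as an N x 1 matrix.
  matrix[N, 1] eta;
  for (i in 1:N)
    eta[i, 1] = coefficient1[x1[i]] + coefficient2[x2[i]];
}
model {
  // With a constant weight of 1, eta * weight reduces to eta itself,
  // so the GLM passes the precomputed predictor straight through.
  target += ordered_logistic_glm_lpmf(y | eta, rep_vector(1.0, 1), cuts);
}
```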

Ok, while this works, we lose the major speed-up that `ordered_logistic_glm_lpmf` offers. The way I understand https://arxiv.org/pdf/1907.01063.pdf, the speed-up comes from the matrix multiplication step `X * beta`, and as mentioned we don't have a linear relation in our model. We can "cheat", but then, as mentioned, the GPU does no work and our speed-up is non-existent.

Any thoughts on whether we can make the GPU-based speed-up work in a scenario where we don't have a natural matrix multiplication? My mind is going toward making the data matrix binary (expanding the columns of the data) and likewise expanding the coefficient vector. But that would make it harder to establish correlations for the random effects (e.g. correlations across all education random effects, etc.). Is it worth going down that path?
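A sketch of the binary design-matrix idea, assuming two grouping factors with `L1` and `L2` levels (all names here are illustrative, not from the model above):

```stan
data {
  int<lower=0> N;
  int<lower=2> K;                        // outcome categories
  int<lower=1> L1;                       // levels of the first grouping factor
  int<lower=1> L2;                       // levels of the second grouping factor
  int<lower=1, upper=L1> x1[N];
  int<lower=1, upper=L2> x2[N];
  int<lower=1, upper=K> y[N];
}
transformed data {
  // One-hot encode both factors into a single N x (L1 + L2) matrix, so the
  // random effects become ordinary GLM weights and X * beta is a genuine
  // matrix multiplication that the OpenCL backend can accelerate.
  matrix[N, L1 + L2] X = rep_matrix(0, N, L1 + L2);
  for (i in 1:N) {
    X[i, x1[i]] = 1;
    X[i, L1 + x2[i]] = 1;
  }
}
parameters {
  vector[L1 + L2] beta;                  // concatenated random effects
  ordered[K - 1] cuts;
}
model {
  target += ordered_logistic_glm_lpmf(y | X, beta, cuts);
}
```

Whether the extra columns pay off presumably depends on how sparse `X` is and how many levels the factors have.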