This might be a silly question, but I wanted to confirm something. According to the documentation, ordered_logistic_glm_lpmf() has an OpenCL implementation that can use the GPU, but ordered_logistic_lpmf() is not on that list.
On the other hand, I did find this merged pull request that, if I understand it right, implements OpenCL for ordered_logistic_lpmf().
Can anyone clarify whether ordered_logistic_lpmf() is currently OpenCL friendly (as of 2.32.2)?
Thank you for the help.
For anyone finding this in the (near) future: it looks to me like ordered_logistic_lpmf() does not use the GPU as of now. That said, rewriting my model with the GLM function, as below, did lead to a clear speedup, even with the extra to_matrix() call. For context, I am running it on the Ohio HPCs with an NVIDIA Tesla P100 GPU and a lot of data: rating is a vector with 4 million items.
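{
  // A single coefficient fixed at 1, so x * ones is just the linear predictor.
  vector[1] ones;
  ones[1] = 1;
  // to_matrix() turns the length-N linear predictor into an N x 1 design matrix.
  target += ordered_logistic_glm_lpmf(rating | to_matrix(alpha[voter_id] +
                                      beta[voter_id] .* theta[candidate_id]), ones, tau);
}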
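In case it helps to see why this trick leaves the likelihood unchanged, here is a rough NumPy sketch. It is not Stan's implementation, just the textbook ordered-logistic formula written out by hand, and the function names only mirror the Stan ones for readability. The random eta, cuts and rating arrays are made-up stand-ins for the linear predictor, tau and the data. The point is that with an N x 1 matrix and a single coefficient of 1, x times the coefficient vector is the original linear predictor, so both forms give the same log-likelihood.

```python
import numpy as np

def inv_logit(z):
    return 1.0 / (1.0 + np.exp(-z))

def ordered_logistic_lpmf(y, eta, cuts):
    """Summed ordered-logistic log probability; y takes values 1..K."""
    # Pad the cutpoints with -inf and +inf so every category uses one formula:
    # P(y = k) = inv_logit(eta - c[k-1]) - inv_logit(eta - c[k])
    c = np.concatenate(([-np.inf], cuts, [np.inf]))
    upper = inv_logit(eta - c[y - 1])
    lower = inv_logit(eta - c[y])
    return np.sum(np.log(upper - lower))

def ordered_logistic_glm_lpmf(y, x, beta, cuts):
    """Same likelihood, with the linear predictor computed as x @ beta."""
    return ordered_logistic_lpmf(y, x @ beta, cuts)

# Made-up stand-ins for the quantities in the Stan snippet (assumptions).
rng = np.random.default_rng(0)
n, k = 1000, 5
eta = rng.normal(size=n)                 # alpha[voter_id] + beta[voter_id] .* theta[candidate_id]
cuts = np.sort(rng.normal(size=k - 1))   # tau, the ordered cutpoints
rating = rng.integers(1, k + 1, size=n)  # observed categories in 1..K

x = eta.reshape(-1, 1)  # plays the role of to_matrix(eta): an n x 1 matrix
ones = np.ones(1)       # the single coefficient fixed at 1

direct = ordered_logistic_lpmf(rating, eta, cuts)
via_glm = ordered_logistic_glm_lpmf(rating, x, ones, cuts)
print(np.isclose(direct, via_glm))
```

So the GLM call is only a different route to the same target increment. Any speedup comes from which implementation Stan runs for it, not from a change to the model.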