Vectorized `ordered_logistic` compiles but then does not sample correctly

I just noticed that the ordered_logistic function compiles (rstan 2.19.2) when the LHS is an integer array but then does something weird in the sampler. I’m a bit confused as to what is going on here so any info would be appreciated, I wonder if stan should fail during compilation in this case? It is very easy for a user to make a mistake here, I think.

I’ve attached simplified examples below, with a small number of iterations since the vectorized case is so slow.

Model with vectorized ordered_logit:

library(rstan)
library(rethinking)

data(Trolley)

model_code1 <- "
data {
    int<lower = 1> n;
    int<lower = 2> num_responses;
    int<lower = 1, upper = num_responses> response[n];
}

parameters {
    ordered[num_responses - 1] cut_points;
}

model {
    vector[n] linear_predictor = rep_vector(0.0, n);
    response ~ ordered_logistic(linear_predictor, cut_points);
}
"

mod1 <- stan(model_code = model_code1,
             iter = 200,
             warmup = 100,
             data = list(n = nrow(Trolley),
                         num_responses = length(unique(Trolley$response)),
                         response = Trolley$response))

> print(mod1)

Inference for Stan model: bb983badb3bce0d3a8dd5481b8a453c5.
4 chains, each with iter=200; warmup=100; thin=1; 
post-warmup draws per chain=100, total post-warmup draws=400.

                   mean se_mean      sd      2.5%       25%       50%       75%     97.5% n_eff   Rhat
cut_points[1]     -1.11    0.65    0.93     -2.09     -1.58     -1.40     -0.90      0.42     2 114.56
cut_points[2]     -0.73    0.51    0.72     -1.27     -1.25     -1.07     -0.56      0.49     2  69.00
cut_points[3]     -0.19    0.47    0.67     -1.06     -0.68     -0.28      0.41      0.70     2  26.16
cut_points[4]      0.29    0.56    0.80     -0.99      0.00      0.40      0.87      1.20     2  32.86
cut_points[5]      0.96    0.62    0.89     -0.48      0.59      1.27      1.55      1.99     2  27.77
cut_points[6]      1.61    0.56    0.80      0.26      1.40      1.89      2.20      2.34     2  25.48
lp__          -22604.89 1796.57 2553.30 -25697.12 -24600.76 -23104.69 -20417.95 -19087.92     2  34.34

Samples were drawn using NUTS(diag_e) at Thu Sep  5 16:14:40 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Model with for loop:

model_code2 <- "
data {
    int<lower = 1> n;
    int<lower = 2> num_responses;
    int<lower = 1, upper = num_responses> response[n];
}

parameters {
    ordered[num_responses - 1] cut_points;
}

model {
    vector[n] linear_predictor = rep_vector(n, 0.0);
    for (i in 1:n) {
         response[i] ~ ordered_logistic(linear_predictor[i], cut_points);
    }
}
"

mod2 <- stan(model_code = model_code2,
             iter = 200,
             warmup = 100,
             data = list(n = nrow(Trolley),
                         num_responses = length(unique(Trolley$response)),
                         response = Trolley$response))

> print(mod2)

Inference for Stan model: a276a6f50d33f30b0a0984159954e96a.
4 chains, each with iter=200; warmup=100; thin=1; 
post-warmup draws per chain=100, total post-warmup draws=400.

                   mean se_mean   sd      2.5%       25%       50%       75%     97.5% n_eff Rhat
cut_points[1]     -1.92    0.00 0.03     -1.97     -1.94     -1.92     -1.90     -1.85   226 1.00
cut_points[2]     -1.26    0.00 0.03     -1.32     -1.28     -1.27     -1.24     -1.21   470 1.00
cut_points[3]     -0.72    0.00 0.02     -0.76     -0.73     -0.72     -0.70     -0.67   767 0.99
cut_points[4]      0.25    0.00 0.02      0.21      0.23      0.25      0.26      0.29   454 1.00
cut_points[5]      0.89    0.00 0.02      0.85      0.88      0.89      0.91      0.94   388 1.01
cut_points[6]      1.77    0.00 0.03      1.71      1.75      1.77      1.79      1.83   591 1.00
lp__          -18926.01    0.13 1.79 -18930.46 -18926.93 -18925.65 -18924.69 -18923.60   183 1.01

Samples were drawn using NUTS(diag_e) at Thu Sep  5 16:15:46 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Hi @jeffpollock9

This is related to this bug that was resolved for the 2.20 release.
In the meantime (until Rstan 2.20) you can use the solution given by @andrjohns here: Ordered_logistic_lpmf

PS: It was nice meeting you at Stancon. Its always nice to put a face to the name :) Too bad I was kind of in a hurry.

2 Likes

Hi @rok_cesnovar thanks for the reply and apologies I missed those links while searching around.

Was good to meet you too - will hopefully be able to grab a beer at stancon 2020!

1 Like