Can someone double check to make sure my Stan code is correct for an ordered logistic regression with a horseshoe prior?

I’ve got an ordered logistic regression with 5 categories for my outcome variable. I am wanting to specify a Horseshoe prior for regularization to 0 (it used to be a LASSO, but I’ve seen the discourse here suggesting to go with a Horseshoe instead). Would the following STAN code correctly fit this model? I am new to STAN so I did a combination of looking at examples for ordered logistic regressions on the STAN manual and looking through here when people specified a horseshoe prior. I just want to be extra sure that I didn’t cross my wires somewhere and mess something up.

data {
    int<lower=0> N; // Number of observations
    int<lower=5> K; // Number of outcomes
    int<lower=1> D; // Number of predictors
    array[N] int<lower=1, upper=K> y;
   matrix[N, D] x; // Matrix for predictors

parameters {
    vector[D] beta; // Coefficients for predictors
    ordered[K-1] c; // Coefficients for each intercept
    real<lower=0> lambda; // Global shrinkage parameter
    vector<lower=0>[D] tau; // Local shrinkage parameters
    real<lower=0> sigma; // Scale parameter


model {
    // Horseshoe prior for Coefficients
    lambda ~ cauchy(0, 1);
    tau ~ cauchy(0, 1);
    sigma ~ cauchy(0, 1);
    beta ~ normal(0, lambda * tau * sigma);

    // Specify the likelihood
    Y ~ ordered_logistic(x * beta, c)

The vector c is usually called the vector of “cutpoints.” You might want to put a zero-avoiding prior on the differences to keep it from collapsing (having two identical entries). Something has to determine the scale between x * beta and c,and here you’re relying on a prior on beta to do that.

The horseshoe can be difficult to fit. I’ve never coded horseshoe myself, but your definition matches the one that @avehtari seemed OK with in another post. The whole thing looks dicey because of the three-way multiplicative non-identiifiability (you can multiply all the tau and divide the sigma and get the same distribution for beta). I’m actually surprised you can get anywhere with two standard half Cauchy multiplied like this.

The best thing to do in general to test models is to simulate data from them and see if you can recover the parameters. More formally, you can wrap this all up in simulation based calibration, but even an informal approach with one run can diagnose a lot of errors. And we usually recommend starting simple and building up. So maybe starting with a simple prior, then building up to horseshoe, but the whole code you have here isn’t too bad.

Thanks @Bob_Carpenter!

So for the vector c something like a dirichlet prior like what rstanarm does?