Max Treedepth Exceeded = Lack of Convergence?

N(0, 10) is reasonable for alpha. Cutpoints are assumed known so there’s no prior on c.

The exact position of cutpoints is irrelevant as long as the prior on alpha is vague.
The spacing between cutpoints (let’s call it s) does two things:

  1. It sets the interpretation of beta: beta/s is (approximately) the expected difference in y between theta=0 and theta=1 individuals.
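  2. It caps the predictive probabilities: with an ordered-logistic likelihood, the probability assigned to any interior predicted y value is at most (exp(s/2)-1)/(exp(s/2)+1), reached when the linear predictor sits midway between two adjacent cutpoints.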
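A quick numerical check of the second point, as a sketch only. It assumes an ordered-logistic likelihood (the usual model behind cutpoints in Stan, but not stated in this post) and a hypothetical set of nine evenly spaced cutpoints centred on zero. The helper names are made up for illustration. It scans the linear predictor and reports the largest probability any interior category ever receives.

```python
import math

def inv_logit(x):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(eta, cutpoints):
    """Ordered-logistic probabilities of categories 1..K for K-1 cutpoints."""
    # P(y <= k) = inv_logit(c_k - eta), padded with 0 and 1 at the ends
    cdf = [0.0] + [inv_logit(c - eta) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def peak_interior_prob(s, n_cut=9, grid=4001):
    """Largest probability any interior category reaches as eta varies."""
    # Hypothetical cutpoints: n_cut values spaced s apart, centred on zero
    cuts = [s * (i - (n_cut - 1) / 2) for i in range(n_cut)]
    best = 0.0
    for g in range(grid):
        eta = cuts[0] + (cuts[-1] - cuts[0]) * g / (grid - 1)
        # Drop the first and last categories, which are unbounded on one side
        best = max(best, max(category_probs(eta, cuts)[1:-1]))
    return best

for s in (0.5, 1.0, 2.0, 4.0):
    print(f"s = {s}: peak interior probability = {peak_interior_prob(s):.3f}")
```

The peak stays well below 1 for small s and climbs toward 1 as the spacing widens, so a narrow spacing forces the model to spread its predictions over neighbouring categories.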

The first effect could (and probably should) be removed by using alpha+s*beta*theta instead of alpha+beta*theta as the predictor.
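The rescaling can be checked numerically. This is a rough sketch under assumptions that are mine, not the post's: an ordered-logistic likelihood, 61 evenly spaced cutpoints centred on zero (enough that the end categories do not matter), and hypothetical helper names. It compares the change in expected y when theta goes from 0 to 1 under the two predictors.

```python
import math

def expected_y(eta, cutpoints):
    """E[y] for an ordered-logistic model with categories 1..K."""
    # E[y] = 1 + sum_k P(y > k), and P(y > k) = inv_logit(eta - c_k)
    return 1.0 + sum(1.0 / (1.0 + math.exp(-(eta - c))) for c in cutpoints)

def theta_effect(alpha, beta, s, rescale, n_cut=61):
    """Change in E[y] when theta moves from 0 to 1."""
    # Hypothetical cutpoints: n_cut values spaced s apart, centred on zero
    cuts = [s * (i - (n_cut - 1) / 2) for i in range(n_cut)]
    # rescale=True uses alpha + s*beta*theta, otherwise alpha + beta*theta
    slope = s * beta if rescale else beta
    return expected_y(alpha + slope, cuts) - expected_y(alpha, cuts)

beta = 1.5
for s in (0.5, 1.0, 2.0):
    plain = theta_effect(0.0, beta, s, rescale=False)
    scaled = theta_effect(0.0, beta, s, rescale=True)
    print(f"s = {s}: alpha+beta*theta -> {plain:.3f}, alpha+s*beta*theta -> {scaled:.3f}")
```

With the plain predictor the effect tracks beta/s, while with the rescaled predictor it stays close to beta whatever the spacing, which is what makes beta interpretable on the scale of y.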
If you think s=1 is too strong an assumption, you can make s a parameter:

...
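parameters {
  // one spacing multiplier per set of cutpoints; array syntax for Stan 2.26 and later
  // (older versions write this as real<lower=0> s[j];)
  array[j] real<lower=0> s;
  ...
}
transformed parameters {
  // scale the known base cutpoints c by each spacing
  vector[k-1] c1 = s[1] * c;
  vector[k-1] c2 = s[2] * c;
  ...
}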

s needs an informative prior, maybe lognormal(0, 0.5).
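To get a feel for how informative lognormal(0, 0.5) is, here is a small sketch that computes its quantiles from the standard normal quantile function. The function name is hypothetical; the defaults match the prior suggested above.

```python
import math
from statistics import NormalDist

def lognormal_quantile(p, mu=0.0, sigma=0.5):
    """Quantile of lognormal(mu, sigma): exp of the matching normal quantile."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

for p in (0.025, 0.5, 0.975):
    print(f"{p:>5}: s = {lognormal_quantile(p):.3f}")
```

The median is 1 and the central 95% interval runs from roughly 0.38 to 2.66, so the prior keeps s within a factor of about 2.7 of the s=1 default while still letting the data pull it away.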
