Hello everyone,
Just wanted to share a small trick for modelling cutpoints in ordinal models. I’ve seen the suggestion before that the cutpoints can be built as a transformation of a simplex (theta), a scale (kappa) and a location (mu):
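```stan
data{
  int N;
  // ordinal outcome in 1..6 (5 cutpoints give 6 categories);
  // newer Stan releases spell this: array[N] int response;
  int response[N];
  vector[N] x;
}
parameters{
  simplex[5] theta;
  real<lower=0> kappa;
  real mu;    // no explicit prior, so implicitly flat
  real beta;  // no explicit prior, so implicitly flat
}
transformed parameters{
  ordered[5] cutoffs;
  cutoffs=cumulative_sum(theta)*kappa+mu;
}
model{
  theta~dirichlet(rep_vector(1, 5));
  kappa~exponential(.1);
  for (n in 1:N){
    response[n]~ordered_logistic(x[n]*beta , cutoffs );
  }
}
```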
…but this has the annoying feature that the prior center of the cutoffs is a hodgepodge function of theta, kappa and mu. So how about subtracting the median from the cumulative sum, so that the parameter mu is the center of the cutpoint vector? Something like this:
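```stan
functions{
  // median of a vector that is already sorted in increasing order
  real median(vector ordered_vector){
    int midpoint;
    midpoint = rows(ordered_vector)/2;  // integer division
    if(2*midpoint<rows(ordered_vector)){
      // odd length: the middle element
      return ordered_vector[midpoint+1];
    }
    else{
      // even length: average of the two middle elements
      return (ordered_vector[midpoint]+ordered_vector[midpoint+1])/2;
    }
  }
  // cumulative sum shifted so that its median is zero
  vector cumsum_de_median(vector par_vector){
    vector[rows(par_vector)] cs = cumulative_sum(par_vector);
    return cs-median(cs);
  }
}
data{
  int N;
  // ordinal outcome in 1..6 (5 cutpoints give 6 categories);
  // newer Stan releases spell this: array[N] int response;
  int response[N];
  vector[N] x;
}
parameters{
  simplex[5] theta;
  real<lower=0> kappa;
  real mu;
  real beta;
}
transformed parameters{
  ordered[5] cutoffs;
  cutoffs=cumsum_de_median(theta)*kappa+mu;
}
model{
  theta~dirichlet(rep_vector(1, 5));
  kappa~exponential(.1);
  for (n in 1:N){
    response[n]~ordered_logistic(x[n]*beta , cutoffs );
  }
}
```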
The interpretation of the parameters makes more sense here (to me, anyways…):
- mu is the center of the cutpoint vector
- theta is the normalized distance between cutpoints
- kappa is the spread of cutpoints
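
A quick check of these bullets for the five-cutpoint case used here. The notation $S_k$ and $c_k$ is my own shorthand, not something from the models above: $S_k = \sum_{j=1}^{k}\theta_j$ is the cumulative sum and $c_k$ is the $k$-th cutoff.

$$
\text{uncentered:}\quad c_k = \mu + \kappa S_k
\;\;\Rightarrow\;\;
\operatorname{median}(c) = c_3 = \mu + \kappa(\theta_1+\theta_2+\theta_3)
$$

$$
\text{median-centered:}\quad c_k = \mu + \kappa\,(S_k - S_3)
\;\;\Rightarrow\;\;
c_3 = \mu,
\qquad
c_{k+1} - c_k = \kappa\,\theta_{k+1}
$$

So with the centering, mu is exactly the middle cutpoint, and the gaps are kappa times theta[2] through theta[5]. One thing that falls out of this: theta[1] does not set any gap directly. It only enters through the simplex constraint, so the total spread is $c_5 - c_1 = \kappa(1-\theta_1)$.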
You could center on the mean instead of the median too, I suppose.
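
For what it's worth, here is a minimal sketch of what a mean-centered variant might look like. The name `cumsum_de_mean` is made up for illustration; `mean` and `cumulative_sum` are Stan built-ins. I haven't compared how the two versions sample, so treat it as a starting point only.

```stan
functions{
  // Hypothetical helper (name is an assumption, not an existing Stan function):
  // cumulative sum shifted so that its arithmetic mean is zero.
  vector cumsum_de_mean(vector par_vector){
    vector[rows(par_vector)] cs = cumulative_sum(par_vector);
    return cs-mean(cs);
  }
}
```

Swapping this in means writing `cutoffs=cumsum_de_mean(theta)*kappa+mu;` in transformed parameters, in which case mu is the mean of the cutpoints rather than the middle one.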