Induced Dirichlet and constraint transform functions

Just realised the induced_dirichlet() is now much simpler to implement by specifying a Dirichlet prior on the category probabilities themselves and then simply using the new simplex_unconstrain function, e.g.:

parameters {
  simplex[K] p;  // category probabilities
}
transformed parameters {
  vector[K - 1] tau = simplex_unconstrain(p);  // cutpoints
}
model {
  p ~ dirichlet(ones_vector(K));
}

EDIT: this doesn’t create tau as an ordered vector. But I believe going the other way around does:

parameters {
  ordered[K - 1] tau;  // cutpoints
}
transformed parameters {
  simplex[K] p = simplex_jacobian(tau);  // category probabilities
}

But I might ping @spinkney just in case.

2 Likes

I think you want to put a normal prior on the “raw” parameters. So it would be two calls to constraints.

data {
  int<lower=0> N;
}
parameters {
  vector[N - 1] x_raw;
}
transformed parameters {
  vector[N - 1] x = ordered_jacobian(x_raw);
  vector[N] p = simplex_jacobian(x);
}
model {
  x_raw ~ std_normal();
}

No that doesn’t work. Just like I did in How is the sum_to_zero_vector implemented under the hood? - #3 by spinkney you can overparameterize. But this results in N, not the N - 1 cutpoints :(

data {
  int<lower=0> N;
}
parameters {
  vector[N] x_raw;
}
transformed parameters {
  vector[N] z = ordered_jacobian(x_raw - mean(x_raw));
  vector[N] p = softmax(z);
}
model {
  x_raw ~ std_normal();
  target += -N * log_sum_exp(z);
}

You can get ordinal cutpoints with a prior that is marginally distributed symetrically by

data {
  int<lower=2> N; 
}
parameters {
  sum_to_zero_vector[N] z; 
}
transformed parameters {
  vector[N - 1] c;
  
  for (k in 1:N - 1) {
    // c[k] = log(Prob(y <= k) / Prob(y > k))
    c[k] = log_sum_exp(z[1:k]) - log_sum_exp(z[k + 1:N]);
  }
}
model {
   z ~ normal(0, sqrt(N / (N - 1.0)));
}

with 4 chains x 20k samples

  variable   mean median    sd   mad     q5   q95  rhat ess_bulk ess_tail
  <chr>     <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
1 c[1]     -1.79  -1.79   1.29  1.29 -3.92  0.315  1.00   87424.   65451.
2 c[2]     -0.507 -0.507  1.10  1.09 -2.32  1.31   1.00   77257.   66261.
3 c[3]      0.499  0.500  1.11  1.10 -1.32  2.31   1.00   75492.   65816.
4 c[4]      1.79   1.78   1.29  1.29 -0.320 3.91   1.00   85362.   63385.
1 Like

Thanks! Might be best to stick to the Betancourt function for now… although it’s probably worth getting something like this in Stan, because the simplex-to-ordered-cutpoints type is quite nice for ordinal models.