Finite mixture ordered vectors

bayes · April 18, 2021, 6:03am

I’ve a fundamental question regarding the finite mixture chapter of the Stan user guide:

In this section, we have an ordered ‘x’ vector as follows to allow identification of the group parameters:

parameters {
...
  ordered[K] x;             // locations of mixture components
 ...
}
model {
...
  x ~ normal(0, 10);
...
}

However, in another section of the Stan manual, it states that unconstrained vectors y are transformed into ordered sequences x by defining each element y_k as the log of the difference between successive elements of x.

My questions are:

Wouldn’t doing the above transform imply that x is no longer drawn from the normal distribution that we specified? Rather, it is the log differences that are drawn from the normal distribution? Or is there something that I’m missing here?
Would Stan be able to discard draws that are not ordered and only retain draws that are ordered, thereby circumventing the above transformations?

Thanks in advance!

martinmodrak · April 23, 2021, 10:47am

Hi, sorry for not getting to you earlier, your question is relevant and well written.

There are IMHO two misunderstandings here:

Stan doesn’t “draw” from distributions when sampling. A Stan program is (if we ignore the generated quantities block for now) fundamentally a function that takes data and unconstrained parameters as input and returns the posterior log density (up to a constant) as a result and computes all the derivatives of the density with respect to the parameters. Nothing more, nothing less. The Stan sampler (or optimize/ADVI) than calls this function and do their thing with the results. But there are no “draws”, just computing a function.
When transforming from unconstrained to constrained space, Stan automatically adds the Jacobian of the transform which ensures that the model behaves as if you actually computed the density starting with already constrained parameters.

EDIT: Previous version contained an error in the interpretation.

So in some sense, the code computes the density of a joint distribution over unconstrained vectors of size K that has the peculiar property that after transforming to ordered vectors, it gives the same distribution as if we drawn i.i.d vectors from normal(0, 10) and kept only those that are ordered. Alternatively, the same distribution could be in this case created by drawing i.i.d vectors from normal(0, 10) and sorting them.

A quick example showing the distributions are the same in cmdstanr:

sc = "
data {
  int<lower=1> K;
}

parameters {
  ordered[K] x;             // locations of mixture components
}
model {
  x ~ normal(0, 10);
}
"

m <- cmdstan_model(write_stan_file(sc))

K = 5
res <- m$sample(data = list(K = K), refresh = 0)

res$cmdstan_summary()

#                     Mean     MCSE  StdDev    5%       50%   95%    N_Eff  N_Eff/s    R_hat
# x[1]            -1.2e+01     0.15     6.4   -22  -1.1e+01  -1.6     1765     4193      1.0
# x[2]            -5.0e+00    0.097     5.5   -14  -5.0e+00   3.8     3253     7726      1.0
# x[3]            -6.2e-02    0.085     5.3  -9.0  -1.6e-03   8.5     3936     9348      1.0
# x[4]             4.9e+00    0.085     5.6  -4.3   4.8e+00    14     4405    10464     1.00
# x[5]             1.1e+01     0.10     6.8  0.84   1.1e+01    23     4462    10598     1.00

```r
test <- matrix(rnorm(4000*K, sd = 10), nrow = 4000, ncol = K)
test_ord <- matrix(apply(test, MARGIN = 1, FUN = sort), nrow = 4000, ncol = K, byrow = TRUE)
for(k in 1:K) {
  print(quantile(test_ord[,k], probs = c(0.05,0.5, 0.95)))
}
#         5%        50%        95% 
# -23.359041 -11.378643  -1.339266 
#         5%        50%        95% 
# -14.423358  -4.898484   3.855805 
#          5%         50%         95% 
# -8.99006986  0.03509366  8.96703966 
#        5%       50%       95% 
# -4.145248  4.783120 14.412439 
#        5%       50%       95% 
#  1.386047 11.293186 23.222455

test2 <- matrix(rnorm(1e6*K, sd = 10), nrow = 1e6, ncol = K)

ordered_indices <- !apply(test2, MARGIN = 1, FUN = is.unsorted)
if(sum(ordered_indices) < 1000) {
  message("Too few orderd left")
}

test_ord_reject <- test2[ordered_indices,]
for(k in 1:K) {
  print(quantile(test_ord_reject[,k], probs = c(0.05,0.5, 0.95)))
}
#        5%       50%       95% 
# -23.35957 -11.21584  -1.14037 
#         5%        50%        95% 
# -14.294742  -4.698611   4.024895 
#         5%        50%        95% 
# -8.8831785  0.1206282  8.8272902 
#        5%       50%       95% 
# -3.876497  4.842877 14.229799 
#        5%       50%       95% 
#  1.238684 11.276357 23.249097

Hope that clarifies more than confuses! :-)

Topic		Replies	Views
What is actually happening when sampling ordered parameters? Modeling	9	847	November 16, 2020
How can I enforce ordering on normally distributed parameters Modeling	8	541	March 29, 2021
Help with manually constructing ordered vectors Modeling	0	302	November 16, 2022
"ordered" vector in Stan not strictly ordered General	2	643	May 6, 2022
Implementation of the 'ordered' type Developers	2	395	June 4, 2019

Finite mixture ordered vectors

Related topics