Better classification through mixture model regression

I was presented with a “Machine Learning” problem where the “features” are used to “predict” a categorical outcome (possibly ordered, but that may be of lesser importance). However, the observations are actually continuous and include multidimensional (and often correlated) readouts, so a typical ML classification would require collapsing them into a set of labels and then applying some classification method.

As an alternative, I thought a mixture model could accommodate that without artificially thresholding the output. However, because this kind of model involves discrete variables and likely non-identifiability of the labels, I wonder whether it’s a good fit (no pun intended) for a Stan implementation. There are probably just 2-3 categories or so, so maybe there’s a way of marginalizing them out and sampling them separately; I haven’t given it much thought.

If not, what are efficient alternatives? I often think ML \perp Bayesian, but in this case I think it would be incredibly informative to get uncertainty around the inferred class label and its associated mean, so I thought I’d put in the extra effort.
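
To make the marginalization idea above concrete, here is a minimal NumPy sketch of what I have in mind. It is not a Stan program. It assumes Gaussian components with full covariance and class weights that come from a softmax over a linear predictor of the features. All names (`mvn_logpdf`, `log_softmax`, `marginal_loglik_and_responsibilities`) are hypothetical and only for illustration. The discrete label is summed out with a log-sum-exp, and the per-observation class probabilities are recovered afterwards from the same terms. As far as I understand, the same sum is what one would write in Stan with `log_sum_exp` or `log_mix`.

```python
import numpy as np


def mvn_logpdf(y, mean, cov):
    """Row-wise log density of a multivariate normal; y has shape (N, D)."""
    d = y.shape[1]
    chol = np.linalg.cholesky(cov)
    # Whiten the residuals: solve L z = (y - mean)^T, giving z of shape (D, N).
    z = np.linalg.solve(chol, (y - mean).T)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z**2, axis=0))


def log_softmax(eta):
    """Numerically stable log-softmax over the last axis."""
    shifted = eta - eta.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def marginal_loglik_and_responsibilities(y, log_weights, means, covs):
    """Sum the discrete class label out of a Gaussian mixture.

    y           : (N, D) continuous readouts
    log_weights : (K,) or (N, K) log class probabilities,
                  e.g. log_softmax(x @ beta) for feature-dependent weights
    means, covs : length-K sequences of (D,) means and (D, D) covariances
    Returns the total marginal log-likelihood and the (N, K) matrix of
    P(class = k | y_n, parameters).
    """
    # Component log densities, shape (N, K).
    comp = np.stack(
        [mvn_logpdf(y, m, c) for m, c in zip(means, covs)], axis=1
    )
    joint = log_weights + comp  # log p(y_n, z_n = k)
    # log-sum-exp over k gives log p(y_n) with the label marginalized out.
    mx = joint.max(axis=1, keepdims=True)
    log_marginal = mx[:, 0] + np.log(np.exp(joint - mx).sum(axis=1))
    # Class probabilities recovered after the fact (Bayes' rule).
    responsibilities = np.exp(joint - log_marginal[:, None])
    return log_marginal.sum(), responsibilities
```

Evaluating `responsibilities` for each posterior draw of the continuous parameters would then give the uncertainty around the inferred class label, without ever sampling the label itself. This sketch does nothing about label switching, so the non-identifiability concern remains.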