Finding the ordinal ratings with highest disagreement - is my model wrong?

Hi I am trying to find conditions in which participants disagree the most as I want to use them for a larger scale-experiment. As an example. Imagine 3 bottles of wine.

Bottle 1: Is always rated either 1 or 5 (Maximum disagreement)

Bottle 2: Is always rated exactly 3 (Minimum disagreement)

Bottle 3: Is always rated either 2 or 4 (Medium disagreement)

Now with just three bottles, it is easy to look at the data, and say that I should use bottle 1 but imagine that I had 50+ bottles and wanted the best handful or so.

My first idea was to model the bottles as follows:

brm(rating ~ bottle, family = cumulative(threshold = “flexible”)

And then make a prediction and pick the one(s) with the widest standard errors:

In this case, that indeed would have us select bottle 1.

However, I feel like I cannot trust this model too much as when I look at ordinal predictions, it puts a very low probability on ratings 1 and 5 for bottle 1, despite that ALL ratings are these two:

So all in all, my approach hunting for the biggest divergence correctly selects the condition where that is found. However, it is not good at actually describing the data. Can I do any better?

Thank you for reading trough it all.The full code is here