I combined a factor model with Bradley–Terry-type indicators, model3.stan (3.1 KB). Each indicator is an ordinal measure on a symmetric 5-point scale (“much more”, “somewhat more”, “equal”, “somewhat less”, and “much less”). How would this be used? Imagine that a bunch of people play a bunch of different board games like chess, checkers, Othello, etc. Various people win, lose, and tie matches. Using the attached model, each person gets a latent absolute ranking on each game (theta; chess ranking, checkers ranking, etc.), and then these rankings are fed into a factor model to estimate a latent “board game ability” (flow).
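To make the structure concrete, here is a minimal simulation sketch of that kind of model, assuming a cumulative-logit link and symmetric cutpoints; the names (`flow`, `theta`, `loading`, the cutpoint values) are my assumptions for illustration, not taken from model3.stan:

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_games = 60, 4

# Latent general ability ("flow") feeding a factor model:
# each game's theta loads on flow plus game-specific noise.
flow = rng.normal(size=n_people)
loading = rng.uniform(0.5, 1.5, size=n_games)
theta = loading * flow[:, None] + rng.normal(scale=0.5, size=(n_people, n_games))

# Symmetric cutpoints for the 5-point ordinal outcome
# ("much less" ... "much more").
cuts = np.array([-2.0, -0.7, 0.7, 2.0])

def match_outcome(i, j, g):
    """Ordinal Bradley-Terry draw: category 0-4 from the ability gap."""
    eta = theta[i, g] - theta[j, g]
    p_cum = 1 / (1 + np.exp(-(cuts - eta)))   # P(outcome <= k)
    u = rng.uniform()
    return int(np.searchsorted(p_cum, u))     # invert the CDF

y = match_outcome(0, 1, 0)
```

The outcome is symmetric by construction: swapping players i and j flips the sign of `eta`, which mirrors the 5-point scale.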
My question is whether bayesplot has posterior predictive checks applicable to this kind of model. Are the discrete-data PPCs (PPC-discrete) the only methods available?
Thanks for sharing. These are neat models. I’ve seen people use factor models in the latent part of this for things like ideal point models of voting.
Have you tried this the other way around, where everyone gets a latent board game ability according to a hierarchical model, and then everyone also gets a per-game difference from their generic ability, again modeled hierarchically (some games might vary more from the mean than others)?
Or with something like “fixed effects” where you just estimate a single hierarchical multivariate normal prior for ability with covariance?
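Here's a rough sketch of what those two parameterizations look like side by side; everything here (names, scales, the link between the two forms) is hypothetical, just to show how the hierarchical-offset version implies a multivariate normal with covariance:

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_games = 60, 4

# Parameterization 1: generic ability plus hierarchically shrunk
# per-game offsets, with a per-game scale so some games deviate more.
ability = rng.normal(size=n_people)
game_sd = np.array([0.2, 0.3, 0.4, 0.5])
offset = rng.normal(scale=game_sd, size=(n_people, n_games))
theta = ability[:, None] + offset

# Parameterization 2 ("fixed effects" view): rows of theta drawn
# directly from one multivariate normal whose implied covariance is
# 1 everywhere (shared ability) plus game_sd**2 on the diagonal.
cov = np.full((n_games, n_games), 1.0) + np.diag(game_sd**2)
theta_mvn = rng.multivariate_normal(np.zeros(n_games), cov, size=n_people)
```

The second form estimates the covariance freely instead of imposing the one-factor-plus-offsets structure.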
I see you’re also using soft identification of the effects here, which we’ve usually found to fit better (with the same results) than hard constraints for identification (like sum to zero). But I’d be careful to make sure the results aren’t sensitive to the prior.
There are probably other useful plots too, but just to make sure I understand what you’re interested in here: do you want to compare win/lose predictions to the actual win/lose outcomes? Do you care about predictions at the individual level, or more about predictions aggregated in different ways?
I suppose, but the problem is that there are about 60 people involved and the sample size for any given pair of people is pretty small. There are almost 1000 comparisons in the dataset, but only a few person pairs have more than 20 data points. So we could look at ppc_bars for 6 pairs of people (out of 320 pairs) but I don’t think that would provide much insight into overall model fit.
I just wanted to be proactive in case reviewers ask for some kind of PPC, and I was wondering whether I was missing something obvious. I don’t think it would make much sense to aggregate across people, but I hadn’t considered aggregating across games. That would address the sample-size problem; I have to think about whether it makes sense theoretically.
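For what it's worth, aggregating across games would amount to comparing observed category counts per game with the replicated counts, much as `ppc_bars_grouped()` does with a grouping variable. A toy sketch of that bookkeeping (fake data in place of a fitted model; all names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs, n_cat, n_games, n_draws = 1000, 5, 4, 200

game = rng.integers(0, n_games, size=n_obs)   # game index per comparison
y = rng.integers(0, n_cat, size=n_obs)        # observed ordinal outcome
y_rep = rng.integers(0, n_cat, size=(n_draws, n_obs))  # replicated outcomes

def counts_by_game(outcomes):
    """Category counts (n_games x n_cat) for one vector of outcomes."""
    return np.array([np.bincount(outcomes[game == g], minlength=n_cat)
                     for g in range(n_games)])

obs = counts_by_game(y)                              # (n_games, n_cat)
rep = np.stack([counts_by_game(d) for d in y_rep])   # (n_draws, n_games, n_cat)
lo, hi = np.percentile(rep, [5, 95], axis=0)         # predictive interval per cell
```

With ~1000 comparisons split over a handful of games, each cell of `obs` has a usable count, unlike the per-pair breakdown.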