Hi folks,

I’m trying to translate a field-specific theory (usually implemented in Excel, of all places…) into Stan to take advantage of the uncertainty quantification that comes with a Bayesian approach. I’m having some trouble wrapping my head around how best to go about it, so I would love some feedback or pointers towards a minimal working example, please.

The general setup is this:

I’m modelling experimentally obtained counts of binary data (e.g., did something happen on a trial or not: heads or tails). The theory specifies each observation as coming from one of two processes with partially overlapping predictors. One process is a binomial outcome (call it Option 1) with some set of predictors (say, A, B, and C); the other is a four-category categorical outcome with a partially overlapping set of predictors (say, B, D, and E).

However, we only observe some aspects of the four-category outcome: two of the categories look like heads and the other two look like tails, so there are effectively two observable outcomes (heads or tails) with four distinct sources. We know which datapoints were generated by the simpler binomial model and which were generated by the four-category model and then collapsed to a binary outcome. What we don’t know is which of the four categories occurred on any given trial: if we see heads on a trial that we know comes from the four-category distribution, it was one of the two ways of getting heads, and mutatis mutandis for tails.
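To make the collapsed part concrete, here is a minimal Python sketch of the per-trial likelihood as I understand it. It is not Stan and not a definitive implementation. The names (`trial_log_lik`, `HEADS_CATEGORIES`), the logit and softmax links, and the choice that categories 0 and 1 show up as heads are all assumptions made for illustration. The linear predictors `eta_binary` (built from A, B, C) and `etas_four` (built from B, D, E) are assumed to be computed elsewhere.

```python
import math


def inv_logit(x):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(etas):
    """Map four linear predictors to four category probabilities."""
    m = max(etas)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]


# Assumption for illustration: categories 0 and 1 are observed as heads,
# categories 2 and 3 are observed as tails.
HEADS_CATEGORIES = (0, 1)


def trial_log_lik(y, from_four_category, eta_binary, etas_four):
    """Log-likelihood of one observed trial, y = 1 (heads) or 0 (tails).

    from_four_category is known for every trial; the specific category
    is not, so its probability is summed over the categories that could
    have produced the observed outcome.
    """
    if from_four_category:
        probs = softmax(etas_four)
        p_heads = sum(probs[k] for k in HEADS_CATEGORIES)
    else:
        p_heads = inv_logit(eta_binary)
    return math.log(p_heads) if y == 1 else math.log1p(-p_heads)
```

The point of the sketch is that the unknown category never has to be sampled: a heads trial from the four-category process contributes the sum of the two heads probabilities, and the total log-likelihood is the sum of `trial_log_lik` over trials. Whether this marginalised form is the right way to write it in Stan is exactly what I’m unsure about.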

Would someone be willing / able to provide a small working example for this? I’m particularly hung up on how exactly to implement the part where you have four distinct outputs from the categorical model but only enough information is known to narrow down to two choices.

Thanks very much for the time / generosity of anyone who wants to help me out - please let me know if this isn’t clear or if I can provide more info.