I am trying to model the following process using a Bayesian model.

I would like to understand whether my formulation makes sense, and what the next steps are to implement it in Stan.

Suppose there are 3 populations of coins (indexed by k) distributed across a set of ordered urns (indexed by i).

Each urn contains a large number of coins, and the 3 populations are present in the same proportions in every urn.

Each population k has its own success rate in urn i, which is known in the form of a Beta(\alpha_{ik},\beta_{ik}) distribution.

Now the generative process is the following:

- For each urn i, n_i coins are drawn and tossed. The number of successes m_i and the number of tosses n_i for each urn are observed and together form a sample.

I would like to estimate the fraction of the 3 populations in a sample, given those two observed vectors, for a sufficient number of urns.

My intuition is that given some extreme probability values (selected a priori), the tosses will bring information about the population that generated them.

For example, if in urn 1 population A has a very peaked Beta distribution near 1, and B and C are peaked near 0, then observing 40/100 successes would suggest that something around 40% of the coins belong to population A (leaving uncertainty for the others, and that's why we need many urns, with different characteristics).
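To spell out the arithmetic behind this intuition, here is a small worked equation. It assumes the population fractions are written c_k (the same c I use further down) and that the draws are split in proportion to c; the Beta means 0.98, 0.02, 0.02 and the fractions (0.4, 0.3, 0.3) are made-up illustration values, not data.

```latex
\mathbb{E}\!\left[\frac{m_i}{n_i}\right]
  = \sum_{k=1}^{3} c_k \, \frac{\alpha_{ik}}{\alpha_{ik}+\beta_{ik}}
  % illustrative numbers only:
  = 0.4 \times 0.98 + 0.3 \times 0.02 + 0.3 \times 0.02
  = 0.404
```

A single urn only pins down this weighted sum, which is why urns with different Beta parameters are needed to separate B from C.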

I tried to model the process as follows (with 3 populations):

m_i = \sum_{k=1}^{3} m_{ik}, with m_{ik} \sim BetaBin(n_{ik},\alpha_{ik},\beta_{ik}), where n_{ik} is the k-th component of \mathbf{n_i}: total number of successes for an urn

\mathbf{n_i} \sim DirMult(n_{i},\mathbf{c}) the total draws are split based on c, which lives on a simplex

I set c to (1,1,1) as a non-informative prior.

I want to estimate c, the fractions.
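To check that I am describing the process I have in mind, here is a minimal simulation sketch of one urn in plain Python. It assumes that c holds the population fractions on the simplex, that the n_i draws are split among populations with those probabilities, and that each population has one success rate per urn drawn from its Beta distribution (which is what the Beta-binomial implies). The function name `simulate_urn` and the example parameter values are hypothetical.

```python
import random


def simulate_urn(n, c, alpha, beta, rng):
    """Simulate one urn; return (m, split) for n tossed coins.

    c     : population fractions (assumed to sum to 1)
    alpha : Beta alpha parameters for this urn, one per population
    beta  : Beta beta parameters for this urn, one per population
    """
    # Assign each drawn coin to a population with probabilities c.
    labels = rng.choices(range(len(c)), weights=c, k=n)
    split = [labels.count(k) for k in range(len(c))]

    m = 0
    for k, n_k in enumerate(split):
        # One success rate per population and urn, drawn from its Beta.
        p = rng.betavariate(alpha[k], beta[k])
        # Binomial(n_k, p) written as a sum of Bernoulli tosses.
        m += sum(rng.random() < p for _ in range(n_k))
    return m, split


# Hypothetical urn: population A peaked near 1, B and C peaked near 0.
rng = random.Random(0)
m, split = simulate_urn(100, [0.4, 0.3, 0.3], [50, 1, 1], [1, 50, 50], rng)
print(m, split)
```

Only m and n would be observed; `split` is the latent vector \mathbf{n_i} that causes the trouble in Stan.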

First question: does this formulation make sense? To me, it seems the most faithful to the process.

Secondly, I have been trying to implement this in RStan, but the latent draws \mathbf{n_i} are integers, which Stan does not accept as parameters, so they have to be marginalized out. Even though I understand marginalization in simpler cases, I really can't figure it out for my current formulation, given that \mathbf{n_i} is a vector of counts that can be combined in many ways rather than a single value.

Is there any similar problem in the literature, or any resource that can guide me through this marginalization?

From my understanding, I should marginalize over all possible combinations that make up \mathbf{n_i}, and to me this is not as straightforward as summing out a single latent index or count.
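To make that sum concrete, here is a brute-force sketch of what I believe the marginal probability of m_i looks like for a single urn, in plain Python rather than Stan. It assumes a multinomial split with fractions c (all strictly positive) and independent Beta-binomial counts per population; all function names are hypothetical helpers. It enumerates every split of n into 3 parts, so it is only meant for checking small cases, not as an efficient implementation.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """Beta-binomial probability of k successes in n tosses."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def multinomial_pmf(split, c):
    """Multinomial probability of a split, given fractions c (all > 0)."""
    logp = lgamma(sum(split) + 1)
    for n_k, c_k in zip(split, c):
        logp += n_k * log(c_k) - lgamma(n_k + 1)
    return exp(logp)


def convolve(p, q):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def marginal_pmf(m, n, c, alpha, beta):
    """P(m | n, c) for one urn, summing over every split (n_1, n_2, n_3)."""
    total = 0.0
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            split = (n1, n2, n - n1 - n2)
            # Distribution of the total successes for this particular split.
            pmf_sum = [1.0]
            for n_k, a, b in zip(split, alpha, beta):
                pmf_k = [betabinom_pmf(j, n_k, a, b) for j in range(n_k + 1)]
                pmf_sum = convolve(pmf_sum, pmf_k)
            total += multinomial_pmf(split, c) * pmf_sum[m]
    return total
```

If this is the right quantity, the Stan version would need the same double loop over splits inside the likelihood, accumulated on the log scale.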

I apologize if any of my notation is not clear, as I discovered Bayesian models only recently and I am still quite a beginner. I have a draft for the Stan code to achieve this, but I would like to be sure about the correctness of the model before diving into the implementation.

Thank you for your time!