(non-)centered Hierarchical Dirichlet-Multinomial

Hi,

My dataset consists of darts thrown at different parts of a dartboard.
For some parts there is only a small amount of data. An example is the “double 1” (D1) as its likely that a skilled darts player is able to check out “earlier” on a higher score.
However, there is a enormous amount of data available for D20 and D18, the segments adjacent to D1.

I assume that if a darts players aims at D1 the following outcomes are possible: {D1, OS1, D20, OS20, D18, OS18, miss} where OS means outer single.

My idea was to borrow information from data-rich regions and create a hierarchical model.
Initially, I planned to create a basic Dirichlet-multinomial with N observations over K categories, but I found the following comment:

“It’d be much more efficient to code up the compound Dirichlet-multinomial by marginalizing out > the theta. That’s what you get in the beta-binomial, but we don’t have Dirichlet-multinomial built in.”

Inspired by some topics on this forum I wrote the following code:

functions {
    real dirichlet_multinomial_lpmf(int[] y, vector alpha) {}
        real sum_alpha = sum(alpha);
            return lgamma(sum_alpha) - lgamma(sum(y) + sum_alpha)
            // + lgamma(sum(y)+1) - sum(lgamma(to_vector(y)+1) // constant, may omit
            + sum(lgamma(to_vector(y) + alpha)) - sum(lgamma(alpha));
    }
}

data {
    int<lower=1> N;                        // number of target regions =3 {D1, D20 and D18)
    int<lower=1> K;                        // possible outcomes per target region =7
    array[N, K] int<lower=1> y;
}


parameters {
    real<lower=0> mu;
    real<lower=0> sigma;
    vector<lower=0>[K] alpha;
}


model {
    // prior
    mu    ~ lognormal(0, 5);
    sigma ~ normal(0, 2.5);
    alpha ~ normal(mu, sigma);

    // likelihood
    for (n in 1:N) {
        for (k in 1:K) {
           y[n, k] ~ dirichlet_multinomial(alpha[k]);
        }
    }
}

I read that when there is insufficient data in a hierarchical model, the variables being inferred end up having correlation effects, thereby making it difficult to sample. One solution is to obtain more data, but when this isn’t possible we should resort to reparameterization by creating a non-centered model from the centered model.

Does it make sense to reparameterize alpha by adding z \sim std\_normal()
Finally, how would dirichlet_multinomial_rng? look like?

Thank you

@HJA24 flagged this post for deletion as they realized Dirichlet multinomial is already implemented in Stan. I think it’s good if the post stays, there’s IMHO non-zero probability somebody will find the framing useful. It might make sense to edit the post to start with a note that the functions exist, but I am slightly against deletion at this point (if @HJAM24 insists, I’ll delete the post anyway)

@martinmodrak I’ll ask a modified question so you can close this one!

1 Like

This is primarily in the cases where you use a centered parameterization—the non-centered parameterization will remove this problem.

I would be tempted to just build a hierarchical model for the single, double, and treble values that partially pooled each of these independently (i.e., it partially pools the single probabilities just among themselves, and same for double and treble). The bullseyes will just have to be modeled as two separate outcomes. I don’t think you want to go looking at adjacent positions and restricting throws to them—a dart can physically go anywhere and if it doesn’t, we want that to follow from the model, not from a constraint.

Then I assume you have information on where the person is aiming as a covariate. With that, you can also model not only success, but also error patterns. You’d need to model the error patterns to have a fully generative model to generate dart throws. If you miss, where are you likely to miss? Given all the symmetries, this could also be modeled (what is probability of landing in each of the 61 other values).