Combining input data and parameters of DirichletMultinomial model

Hi,

I have the following hierarchical Dirichlet multinomial model;

data {
    int<lower=1> N;
    int<lower=1> K;
    array[N, K] int<lower=0> x;
}


parameters {
    real<lower=0> mu;
    real<lower=0> sigma;
    vector<lower=0>[K] alpha;
}


model {
    // prior
    mu    ~ lognormal(0, 5);
    sigma ~ normal(0, 2.5);
    alpha ~ normal(mu, sigma);

    // likelihood
    for (n in 1:N) {
       x[n, ] ~ dirichlet_multinomial(alpha);
    }
}

I want to determine the so-called “pseudo-counts”; \alpha_k + x_{n, k}
How do I construct this in the transformed parameters block?

I tried unsuccessfully the following

transformed parameters {
    array[N, K] real<lower=0> c = x * 1.0 + rep_array(alpha, N);
}

Please advice

You can do this but I’m not sure why you would. And they are going to generate a lot of output if N * K is large. If you’re not going to use them in the model, it’s much more efficient to define them in generated quantities as follows.

generated quantities {
  matrix[N, K] pseudo_counts;
  for (n in 1:N) {
    pseudo_counts[n] = alpha + x[n];
  }
}

The bigger issue in this model is the centered hierarchical model for alpha. I would think the following would be more effective, because you probably want multiplicative errors in the alph:

parameters {
  real mu; // no lower bound!!!
  real<lower=0> sigma;
  vector<offset=mu, multiplier=sigma>[K] alpha_raw;
...
transformed parameters {
  vector[K] alpha = exp(alpha_raw);
...
model {
  mu ~ normal(...);
  sigma ~ lognormal(...);
  alpha_raw ~ normal(mu, sigma);

Even in your original model, there’s no reason to constrain mu > 0. It’s too bad the dirichlet_multinomial isn’t vectorized yet in Stan or you could make this faster that way.

Thank you for your suggestion!

The diagnostics weren’t good so I decided to rewrite the model using the suggestions of this post


data {
    int<lower=1> N;
    int<lower=1> K;
    array[N, K] int<lower=0> x;
}


transformed data {
    real a = 1.0 / K;
    real b = 1.0;
}

parameters {
    vector<lower=0>[K] alpha;
}


model {
    // prior
    alpha ~ gamma(a, b);


    // likelihood
    for (n in 1:N) {
       x[n, ] ~ dirichlet_multinomial(alpha);
    }
}


generated quantities {
    array[N, K] int<lower=0> x_hat;

    for (n in 1:N) {
        x_hat[n, ]  = dirichlet_multinomial_rng(alpha, sum(x[n, ]));
    }
}

That may still be a challenge because the gamma(1/K, 1) has a mean of 1/K, which means that the mass is concentrated into the corners (i.e., alpha < 1) and will generate near-zero values for the components with alpha[k] < 1. It also has a mode at alpha = 0, which can be problematic. What kind of posterior estimates are you getting for alpha in this model?

Also, you can write the last line more simply as

x_hat[n] = dirichlet_multinomial_rng(alpha, sum(x[n]));