Adding levels to a hierarchical Dirichlet-Multinomial model

Duncan_Stoddard1 · September 14, 2018, 4:46pm

Hello I’m building a hierarchical Dirichlet-Multinomial model to predict budget allocation splits across K alternatives.

The allocation split is a function of the total budget level, and this varies by several categories in a hierarchical structure (for example country, sector).

Currently the model runs ok with the betas and concentration parameter (kappa) varying by country (M parameter below), but I want to add more layers to the hierarchy (e.g. by country, by sector).

My questions are:

How can I optimise the current model model? Is this the best way to structure it for efficient sampling?
How can I extend the model hierarchy to include variation at additional multiple levels (e.g. by country and by sector)? It feels like a similar problem to the ‘seemingly unrelated regression’ in section 5.15 (Multivariate Outcomes) in the Stan manual. I’ve made a start but not sure how to handle the parameter correlations

All help and ideas much appreciated!

// ======================================================================
// MULTINOMIAL MODEL
// SPEND DATA
// ======================================================================

functions {

// for likelihood estimation
  real dirichlet_multinomial_lpmf(int[] y, vector alpha) {
    real alpha_plus = sum(alpha);
    return lgamma(alpha_plus) + sum(lgamma(alpha + to_vector(y)))
                - lgamma(alpha_plus+sum(y)) - sum(lgamma(alpha));
  }

}

data {
  int<lower = 1> K;                 // num alternatives
  int<lower = 1> N;                 // num observations
  int<lower = 1> M;                 // num countries
  int<lower = 0> y[N, K];           // allocation on each choice
  int<lower = 1> market[N];         // markets
  int<lower = 0> budget[N];         // total spend
  vector[N] log_budget;             // log spend
  int<lower = 1> N_new;             // new num observations
  int<lower = 1> budget_new[N_new]; // new budget for simulation 
  vector[N_new] log_budget_new;     // log new budget
}

parameters {
  matrix<lower = 0>[M,K] beta_0;
  matrix<lower = 0>[M,K] beta_1;
  vector<lower = 1>[M] kappa;

  vector<lower = 0>[K] mu_beta_0;
  vector<lower = 0>[K] mu_beta_1;
  vector<lower = 0>[M] mu_kappa;

  vector<lower = 0>[K] sigma_beta_0;
  vector<lower = 0>[K] sigma_beta_1;
  vector<lower = 0>[M] sigma_kappa;

}

transformed parameters {
  simplex[K] theta[N]; 
  vector[K] x_beta[N];

  // create XB index
  {
  for (n in 1:N) {
      for(k in 1:K){
        x_beta[n,k] = beta_0[market[n],k] + beta_1[market[n],k]*log_budget[n];
      }
      theta[n] = softmax(x_beta[n]);
    }
 }
}

model {

// priors
mu_beta_0 ~ lognormal(0,0.1);  
mu_beta_1 ~ lognormal(0,0.1);
mu_kappa ~ lognormal(0,1);
sigma_beta_0 ~ lognormal(0,0.1);
sigma_beta_1 ~ lognormal(0,0.1);
sigma_kappa ~ lognormal(0,0.1);

for (m in 1:M){
  to_vector(beta_0[m]) ~ gamma(mu_beta_0,sigma_beta_0);
  to_vector(beta_1[m]) ~ gamma(mu_beta_1,sigma_beta_1);
  to_vector(kappa) ~ lognormal(mu_kappa,sigma_kappa);
}

for (n in 1:N) {
      y[n] ~ dirichlet_multinomial(kappa[market[n]]*theta[n]);
}
}

generated quantities {
int<lower=0> channel_sim[M,N_new,K]; 
vector[K] theta_new[M,N_new];

{
  vector[K] alpha_new[N_new];
  int channel_sim_tmp[N_new,K];
  for (m in 1:M){
    for (n in 1:N_new){
      for(k in 1:K){
        alpha_new[n,k]   = beta_0[m,k] + beta_1[m,k]*log_budget_new[n];
      }
      theta_new[m,n]     = kappa[m]*softmax(to_vector(alpha_new[n]));
      channel_sim_tmp[n] = dirichlet_multinomial_rng(theta_new[m,n],budget_new[n]);
   }
channel_sim[m] = channel_sim_tmp;
 }
}
}

martinmodrak · September 17, 2018, 11:24am

Hi,
some (incomplete) suggestions, hope they will lead you in the right direction!

At first glance, there does not seem to be much to optimise. How fast is the model and big is the dataset? Some minor suggestions:

for (m in 1:M){
  ...
  to_vector(kappa) ~ lognormal(mu_kappa,sigma_kappa);
}

This is probably a mistake as you are adding the same kappa term M times - you probably should just put that statement out of the loop.

  to_vector(beta_0) ~ gamma(mu_beta_0,sigma_beta_0);
  to_vector(beta_1) ~ gamma(mu_beta_1,sigma_beta_1);
  to_vector(kappa) ~ lognormal(mu_kappa,sigma_kappa);

Also in transformed parameters

 for(k in 1:K){
        x_beta[n,k] = beta_0[market[n],k] + beta_1[market[n],k]*log_budget[n];
  }

could be replaced by

x_beta[n] = to_vector(beta_0[market[n],] + beta_1[market[n],]*log_budget[n]);

for a minor efficiency gain thanks to vectorization.

I am not sure what are you aiming for in here, but it does not look like multivariate outcomes to me. I can see two alternatives

you have per-sector-and-country data on budget totals - then it would make sense to simply predict splits per-sector-and-country with country as one of the predictors (possibly with a hierarchical prior) and the split by country is implied by summing all of the sectors
you want to model per-sector-and-country data as parameters depending on the per-country split. In this way you would just use the predicted splits per-country (including the observation model you have) and add a new split per-sector which just multiplies the per-country split by per-sector split. The per-country-and-sector prediction that would have its own observation model.

Hope that helps.

Duncan_Stoddard1 · September 21, 2018, 7:35am

Thank you Martin - really appreciate your comments. Will digest them see where I get to!

Duncan_Stoddard1 · October 4, 2018, 4:23pm

Thanks again for your response to this @martinmodrak .

On the last point - I think I may have worded my question poorly (or am not fully understanding your answer). All I really want to know is how should I structure my code to add another level onto the hierarchy.

To put it in the context of a more simple example, if we were modelling the impact of a continuous variable X on another continuous variable Y (e.g. sales), with panel data across subjects eg. states, we could could impose a hierarchical structure by putting a distribution on the betas for X.

What I want to know is, if we also had data on another categorical variable eg. product type, and also want our betas to vary at this level, what is the right way to incorporate that extra non-nested level? Do we put a further distribution on the means of the state distributions? Are there better ways?

Any further help much appreciated!

Thanks

Duncan

Topic		Replies	Views
Help expanding a Dirichlet-Categorical model Modeling rstan , specification , ecology , hierarchical-model , dirichlet-multinomial	0	405	September 10, 2023
(non-)centered Hierarchical Dirichlet-Multinomial Modeling	3	93	March 12, 2025
Hierarchical Dirichlet Toy Model Modeling dirichlet-multinomial	2	849	March 8, 2021
Hierarchical Multinomial Model Optimization/Specification Modeling performance , multinomial-response , hierarchical-model	15	1441	May 7, 2020
Combining input data and parameters of DirichletMultinomial model Modeling	3	44	March 12, 2025

Adding levels to a hierarchical Dirichlet-Multinomial model

Related topics