Constraining Output of Multinomial Distribution

twitched · March 6, 2020, 9:03pm

@mathDR Did you get this working? I have a similar problem and am hoping you will share your solution.

Thanks.

mathDR · March 6, 2020, 9:11pm

Hey @twitched Unfortunately, no I didn’t. The likelihood was returning 0 (and log likelihood returning -infinty) for most/all inputs.

I am planning on revisiting this soon though and will update if I make any progress…

twitched · March 6, 2020, 9:14pm

@MathDR Did you try to use the Agner Fog code or did you try to implement it using the integrate_1d?

mathDR · March 6, 2020, 9:17pm

I tried both. The former I didn’t really get to work (I tried adding it as a function to have Stan call it).

most of the work though was to get the integrate_1d to work. And I didn’t get that at all.

twitched · March 6, 2020, 9:19pm

Thanks. If you find anything please post it. Or if you want to share your integrate_1d attempt I can attempt to see if I can get it working.

mathDR · March 6, 2020, 9:21pm

I’ll look at it again this weekend and post what I have

mathDR · March 9, 2020, 5:36pm

This was my latest failed attempt:

functions {

  real wallenius_pmf_integrand(real t, real xc, real[] theta, real[] x_r, int[] x_i) 
  {

    int num_categories = size(theta);

    vector[num_categories] y = to_vector(x_i[1:num_categories]);  // number of balls of each category that were drawn
    vector[num_categories] m = to_vector(x_i[num_categories+1:]); // number of balls of each category in the urn
    vector[num_categories] omega = to_vector(theta);  // weight of each category

    real d = sum(omega .* (m-y));

    real r = 1.0/d + 10.0;

    real ret_val = 0.0;

    for (i in 1:num_categories)
    {
      ret_val += y[i]*log1m( exp(r*omega[i]*log(t)) ); # Could reparameterize this to use exp1m
    }
    ret_val += log(r*d) + (r*d-1.0)*log(t);

    return exp(ret_val);
  }

  real MWNCHG_lpmf(int[] y, vector omega, int[] m, real[] xr, int[] xi) {
    // Multivariate Wallenius Non-Central Hypergeometric pmf
    // x = items drawn in each category
    // m = total number of items in each category
    // omega = weights of each category

    int num_categories = size(y);

    real theta[num_categories] = to_array_1d(omega);
    int one_array[num_categories] = rep_array(1,num_categories);

    real first_part;
    real second_part;

    first_part = sum(
      lgamma(to_vector(m)+to_vector(one_array)
        ) - lgamma(to_vector(y)+to_vector(one_array)
        ) - lgamma(to_vector(m)-to_vector(y)+to_vector(one_array))
        );
    second_part = integrate_1d(wallenius_pmf_integrand, 0., 1., theta, xr, xi, 0.01);
    return  first_part + log(second_part);

  }
  
  
}

data{
  int<lower=1> num_obs; // number of observations
  int<lower=1> num_categories; // number of categories 
  int<lower=0> selected[num_obs,num_categories]; // number selected  data
  int<lower=0> reserve[num_categories]; // how many of each color
  int<lower=0> x_i[num_obs,2*num_categories]; //input to distribution
  
  int<lower=0, upper=1> run_estimation;
}

transformed data {
  real x_r[0];
}

parameters {
  simplex[num_categories] omega;  // the inferred weights of the Multivariate Wallenius Non-Central Hypergeometric Distribution
}


model{
  omega ~ dirichlet(rep_vector(1.0/num_categories,num_categories));
  
  if (run_estimation==1){
    for(n in 1:num_obs){
      selected[n] ~ MWNCHG(omega, reserve, x_r, x_i[n]);
    }
  }
}```

mathDR · March 9, 2020, 5:37pm

I generate some sampled data then try to learn omega i.e. the weights

mathDR · July 10, 2020, 2:34am

Okay I got a working solution to the Wallenius non-central hypergeometric likelihood. The trick was to encode the integral (for now) with a left hand rule. So because we want the log of the pmf, we can use log_sum_exp and have a nice compact, numerically stable call.

Right now, I hardcoded the number of points and, as I said, am using a Left hand rule (the integral has a singularity at 1). I want to post my current solution to get anyone’s feedback (I will post my current reduce_sum solution – it isn’t much more complicated). Going forward, I would like to have a trapezoidal method, and have a variable number of points (where we can reuse previous function calls like in the tanh-sinh method). Lastly, I will implement choosing the optimal r for transforming the integral (ala eqn 19 from Agner Fog’s paper).

Anyway, here is the code:
functions {

  real approximate_wallenus_integral(vector omega, int[] y, int[] m) 
  {
    int num_categories = size(y);
    int num_points = 100;
    vector[num_categories] vec_y = to_vector(y);  // number of balls of each category that were drawn
    vector[num_categories] vec_m = to_vector(m); // number of balls of each category in the urn

    real d = sum(omega .* (vec_m-vec_y));
    real r = 33.0; // TODO: need to do this optimally
    real dt = 1./num_points;

    vector[num_points-1] ret_val;

    for (i in 1:num_points-1) 
    {
      ret_val[i] = (r*d-1.0)*log(i*dt) + sum(vec_y .* log1m_exp((r*omega)*log(i*dt))); // should do trapz
    }
    return (log(r) + log(d) + log_sum_exp(ret_val));
  }

  real MWNCHG_lpmf(int[] y, vector omega, int[] m) {
    // Multivariate Wallenius Non-Central Hypergeometric pmf
    // y = items drawn in each category
    // m = total number of items in each category
    // omega = weights of each category

    int num_categories = size(y);

    real theta[num_categories] = to_array_1d(omega);
    int one_array[num_categories] = rep_array(1,num_categories);

    real first_part;
    real second_part;

    first_part = sum(
      lgamma(to_vector(m)+to_vector(one_array)
        ) - lgamma(to_vector(y)+to_vector(one_array)
        ) - lgamma(to_vector(m)-to_vector(y)+to_vector(one_array))
        );
    second_part = approximate_wallenus_integral(omega,y,m);
    return  first_part + second_part;

  }

  real partial_sum(
    int[ , ] selected_slice,
    int start, 
    int end,
    data int[ , ] incount,
    vector omega
    ) 
  {
    real ret_sum = 0;
    for (n in start:end)
    {
      ret_sum += MWNCHG_lpmf(selected_slice[n] | omega, incount[n]);
    }
    return ret_sum;
    
  }
  
}

data{
  int<lower=1> num_obs; // number of observations
  int<lower=1> num_categories; // number of categories 
  int<lower=0> selected[num_obs,num_categories]; // number selected  data
  int<lower=0> incount[num_obs,num_categories]; // how many of each color
  
  int<lower=0, upper=1> run_estimation;
}

parameters {
  simplex[num_categories] omega;  // the inferred weights of the Multivariate Wallenius Non-Central Hypergeometric Distribution
}


model {
  int grainsize = 1;
  omega ~ dirichlet(rep_vector(1.0, num_categories));

  if (run_estimation==1){
    target += reduce_sum(partial_sum, selected, grainsize, incount, omega);
  }
}

I would love any suggestions!

mathDR · December 24, 2021, 3:10am

linking a different discussion that has an analytical integral solution to the Multivariate Wallenius Noncentral Hypergeometric distribution

Topic		Replies	Views
Truncated multinomial and dirichlet-multinoial Modeling	1	622	December 11, 2020
Hierarchical multinomial model with sparse data Modeling	3	452	October 20, 2022
Hints for Modeling a censored multinomial distribution? Modeling techniques	11	1095	April 27, 2019
A different kind of non-central hypergeometric distribution? Modeling	7	1358	January 26, 2022
Trying to generalise the dirichlet-multinomial (non-analytical) framework (replacing Dirichlet with other distributions) Modeling techniques , bioinformatics , dirichlet-multinomial	23	2621	November 5, 2018

Constraining Output of Multinomial Distribution

Related topics