Multinomial with non-integer data

Suppose we wanted to implement this in Stan:

http://www.stat.columbia.edu/~gelman/research/published/misterp.pdf
deff

But multinomial_lpmf(int[] y, vector theta) doesn’t allow non-integer data. What do you recommend?

Thanks!!

vector[D] theta[N];
vector[D] y_real[N];
for(i in 1:N)
  target += sum(y_real[i] .* (theta[i] - log_sum_exp(theta[i])));

where N number of observations, D dimension of simplex, y_real …

But the paper talks about binomial.

Updated. Added sum(). Should work without too.
same: target += sum(y_real[i] .* log_softmax(theta[i])));

thanks so much!! the log-likelihood for the Multinomial is a bit different from your expression, I think? See e.g. p.272 of Agresti’s Categorical Data Analysis:

That’s just the J - 1 expression. Look at the PMF given in

https://en.wikipedia.org/wiki/Multinomial_distribution

image

p_1 + ... + p_k =1, the softmax.
log(p_i) = theta[i] - log\_sum\_exp(theta[i]))

The factor with n! / x1! … /x_k! can be omitted for calc. the likelihood, it
contains only constants.

For identifiable reasons your theta’s should sum-to-zero or contain one element being 0,
the reference element. Ref. Stan manual.

thanks so much!! got it.

(I didn’t catch at first that you switched theta from probabilities to logit scale)

Hi all – sorry to resurrect this thread, I was trying to implement this and I’m getting strange results so I thought I’d check in. I’m implementing as follows:

data {
    int<lower=1> N; // number of observations
    int<lower=2> J; // number of choices
    vector[J] y_real[N]; // number of supporters for each choice
}

parameters {
    vector[J] beta; // baseline rate of support
    real<lower = 0> beta_sd;
}

transformed parameters {
    matrix[N,J] mu_s; 
    vector[J] beta_s = beta * beta_sd;

    for(j in 1:J) mu_s[1:N,j] = beta_s[j] ;
}

model {
  beta ~ std_normal();
  beta_sd ~ std_normal();

  for (i in 1:N) target += sum(y_real[i] .* log_softmax(mu_s[i,1:J]'));
}

Does this look vaguely correct ? There are more parameters in the actual model but I’ve slimmed it down to the bare minimum.

Thank you !