Suppose we wanted to implement this in Stan:
http://www.stat.columbia.edu/~gelman/research/published/misterp.pdf
But multinomial_lpmf(int[] y, vector theta)
doesn’t allow non-integer data. What do you recommend?
Thanks!!
Suppose we wanted to implement this in Stan:
http://www.stat.columbia.edu/~gelman/research/published/misterp.pdf
But multinomial_lpmf(int[] y, vector theta)
doesn’t allow non-integer data. What do you recommend?
Thanks!!
vector[D] theta[N];
vector[D] y_real[N];
for(i in 1:N)
target += sum(y_real[i] .* (theta[i] - log_sum_exp(theta[i])));
where N number of observations, D dimension of simplex, y_real …
But the paper talks about binomial
.
Updated. Added sum(). Should work without too.
same: target += sum(y_real[i] .* log_softmax(theta[i])));
thanks so much!! the log-likelihood for the Multinomial is a bit different from your expression, I think? See e.g. p.272 of Agresti’s Categorical Data Analysis:
That’s just the J - 1 expression. Look at the PMF given in
https://en.wikipedia.org/wiki/Multinomial_distribution
p_1 + ... + p_k =1, the softmax.
log(p_i) = theta[i] - log\_sum\_exp(theta[i]))
The factor with n! / x1! … /x_k! can be omitted for calc. the likelihood, it
contains only constants.
For identifiable reasons your theta’s should sum-to-zero or contain one element being 0,
the reference element. Ref. Stan manual.
thanks so much!! got it.
(I didn’t catch at first that you switched theta
from probabilities to logit scale)
Hi all – sorry to resurrect this thread, I was trying to implement this and I’m getting strange results so I thought I’d check in. I’m implementing as follows:
data {
int<lower=1> N; // number of observations
int<lower=2> J; // number of choices
vector[J] y_real[N]; // number of supporters for each choice
}
parameters {
vector[J] beta; // baseline rate of support
real<lower = 0> beta_sd;
}
transformed parameters {
matrix[N,J] mu_s;
vector[J] beta_s = beta * beta_sd;
for(j in 1:J) mu_s[1:N,j] = beta_s[j] ;
}
model {
beta ~ std_normal();
beta_sd ~ std_normal();
for (i in 1:N) target += sum(y_real[i] .* log_softmax(mu_s[i,1:J]'));
}
Does this look vaguely correct ? There are more parameters in the actual model but I’ve slimmed it down to the bare minimum.
Thank you !