Dirichlet_log: probabilities is not a valid simplex

@Bob_Carpenter

get the weight of pi in each iteration for the expectation calculation

Did you mean that I am supposed to compute expectation on my parameters using pi in the generated quantities block? If so, don’t I need a weighted version of the formula for my parameters? I guess I am trying to figure out the best way to get/use the weight of pi.

Thanks @stijn, I learned a lot from your suggestion.

A quick question. How would you implement the calculation of the mean of each simulated group in Stan? Boolean indexing on a vector does not seem to be supported by Stan :(

I am starting to think that my suggestion might not really work for your question. I implicitly assumed that the model estimates a different pi for each observation, but that is not the case, I think. So your model is not going to be able to say whether one observation is more likely than another to be in group 1, 2, 3, or 4. In that case, each group's expected mean n is going to be equal to the mean n for the whole sample.

In fact, each observation will have a different pi, like this.

generated quantities {
    matrix[N, 4] pi_m;  // pi for each observation
    for (i in 1:N) {
        vector[4] pi_m_raw;
        for (j in 1:4) {
            pi_m_raw[j] = pi[j] * exp(beta_binomial_lpmf(x[i] | n[i], a[j], b[j]));
        }
        for (j in 1:4) {
            pi_m[i, j] = pi_m_raw[j] / sum(pi_m_raw);
        }
    }
}

So IMHO your idea still holds. I am having a hard time accessing a different subarray per group (boolean indexing like in R or numpy) after categorical_rng, so that I can compute group-wise means easily.

generated quantities {
    int C[N];
    vector[4] gmeans;
    ...
    for (i in 1:N)
        C[i] = categorical_rng(to_vector(pi_m[i]));
    for (j in 1:4)
        gmeans[j] = mean(n[C==j]); // something like this?
}

Ok, sorry I have no intuition for mixture models. Once you have C, you could do all of this in R or python. I would not spend too much time trying to get it done in Stan.
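As a concrete sketch of that post-processing step in Python (toy values for n and a single hypothetical draw of C; in practice you would loop over posterior draws), boolean indexing makes the group means a one-liner per group:

```python
import numpy as np

# Hypothetical toy data: the n values and one posterior draw of C (labels 1..4).
n = np.array([3.0, 5.0, 2.0, 8.0, 6.0, 4.0])
C = np.array([1, 2, 1, 3, 2, 4])

# Boolean indexing, which Stan lacks, selects each group's members directly.
# Guard against empty groups, which would otherwise produce a NaN warning.
gmeans = np.array([n[C == j].mean() if np.any(C == j) else np.nan
                   for j in range(1, 5)])
# gmeans -> [2.5, 5.5, 8.0, 4.0]
```

Repeating this per posterior draw of C and averaging gives posterior means of the group means.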

This might work but there are probably safer and more efficient implementations.


vector[4] mean_group;
...
{
  vector[4] sum_group;
  vector[4] length_group;
  sum_group = rep_vector(0, 4);
  length_group = rep_vector(0, 4);
  for (i in 1:N) {
    C[i] = categorical_rng(to_vector(pi_m[i]));
    sum_group[C[i]] += n[i];
    length_group[C[i]] += 1;
  }
  // Elementwise division; this divides by zero if a group has no members.
  mean_group = sum_group ./ length_group;
}

I am less sure of this, but an implementation using expectations, as per Bob's suggestion, would look something like the following:

vector[4] mean_group;
...
{
  vector[4] sum_group;
  vector[4] length_group;
  sum_group = pi_m' * to_vector(n);  // (4 x N) * (N x 1)
  for (j in 1:4) {
    length_group[j] = sum(col(pi_m, j));
  }
  mean_group = sum_group ./ length_group;
}
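The same expectation-based (soft) group means can be checked in numpy; this is a sketch with hypothetical toy values, where each n[i] is weighted by its membership probability pi_m[i, j] instead of a hard draw:

```python
import numpy as np

N = 6
rng = np.random.default_rng(0)
pi_m = rng.dirichlet(np.ones(4), size=N)  # (N x 4) membership probs, rows sum to 1
n = np.array([3.0, 5.0, 2.0, 8.0, 6.0, 4.0])

sum_group = pi_m.T @ n           # (4 x N) @ (N,) -> weighted sums per group
length_group = pi_m.sum(axis=0)  # expected group sizes; these sum to N
mean_group = sum_group / length_group
```

Unlike the hard-assignment version, no group can be empty here (as long as every pi_m[i, j] is positive), so the division is always safe.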

Duh, I think numpy/scipy spoiled me too much. I also keep forgetting this test is about means, not neg_binomial parameters. Thank you so much for your help.

You usually want to do all this on the log scale to avoid underflow issues.
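To illustrate the underflow point, here is a minimal Python sketch (hypothetical log weights standing in for log(pi[j]) + beta_binomial_lpmf(...)): normalizing on the log scale via log-sum-exp succeeds where naive exponentiation returns all zeros.

```python
import numpy as np
from scipy.special import logsumexp

# Hypothetical unnormalized log weights, e.g. log(pi[j]) + lpmf values.
log_unnorm = np.array([-800.0, -805.0, -810.0, -803.0])

# Naive approach underflows: exp(-800) is 0.0 in double precision,
# so pi_m_raw / sum(pi_m_raw) would be 0/0.
naive = np.exp(log_unnorm)  # all zeros

# Log-scale approach: subtract log-sum-exp, then exponentiate.
log_pi_m = log_unnorm - logsumexp(log_unnorm)
pi_m_row = np.exp(log_pi_m)  # a proper simplex, sums to 1
```

In Stan the same pattern uses the built-in `log_sum_exp` function, keeping everything in lpmf terms until the final exponentiation.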