Generalized Dirichlet Distribution

guanhuac · February 27, 2018, 10:49pm

Can stan/Rstan generate Generalized Dirichlet Distribution, which is a generalization of Dirichlet Distribution for simplex data?

bgoodri · February 28, 2018, 4:06pm

You can implement the Generalized Dirichlet distribution as a Stan function in your Stan program but it is not implemented in the Stan Math library.

Bob_Carpenter · March 4, 2018, 11:06pm

The Wikipedia page for this the generalized Dirichlet needs some serious work. But it’s a simple enough form that it’d be easy to implement as a function directly in Stan. See the manual chapter on adding functions. Or you can just add the log density straight to the target.

avehtari · March 5, 2018, 4:53pm

There is also a case study showing how to add user defined distributions http://mc-stan.org/users/documentation/case-studies/gpareto_functions.html

guanhuac · May 16, 2018, 3:27am

Thank you all for the reply. I try to code the generalized Dirichlet by simply defining transformed parameters as Wiki suggest the generalized Dirichlet can be constructed by stick-breaking process. The code is modified by other people’s LDA implementation.

In R, I call the following functions, which is (generalized) LDA with 4 topics (since for my application the word position does not matter, hence I do some collapse. ), alpha is hyper parameter for topics and gamma1 and gamma2 are for generalized Dirichlet.

The code run with error as follows, and I wonder what is wrong with my implementation.
Thank you very much for your help.

" Rejecting initial value:
** Log probability evaluates to log(0), i.e. negative infinity.**
** Stan can’t start sampling from this initial value.**

Initialization between (-2, 2) failed after 100 attempts. **
** Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
"

The following part is the code for R:

stan_data <- list(
K = 4,
V = ncol(x),
D = nrow(x),
n = x,
alpha = rep(1, 4),
gamma1 = rep(0.5, ncol(x)-1),
gamma2 = 0.5*(ncol(x) - seq(1:(ncol(x)-1)))
)

f3 <- stan_model(file = “lda_counts2.stan”)

stan_fit <- vb(
f3,
data = stan_data,
output_samples = 10,
eta = 1,
adapt_engaged = FALSE
)

This is in “lda_count2.stan”:

data {
  int<lower=1> K; // num topics
  int<lower=1> V; // num words
  int<lower=0> D; // num docs
  int<lower=0> n[D, V]; // word counts for each doc

  // hyperparameters
  vector<lower=0>[K] alpha;
  vector<lower=0>[V-1] gamma1;
  vector<lower=0>[V-1] gamma2;
}

parameters {
  simplex[K] theta[D]; // topic mixtures
  vector<lower=0,upper=1>[V-1] zeta[K]; // 
}


transformed parameters {
  simplex[V] beta[K];
  for (k in 1:K) {
	real tmpsum=0;
	beta[k,1] =  zeta[k,1];
    tmpsum =  zeta[k,1];
    for (m in 2:(V - 1)) {
		beta[k,m] = (1 - tmpsum)*zeta[k,m];
		tmpsum += beta[k,m];
    }
    beta[k,V] = 1 - tmpsum;
  }
}

model {
  for (d in 1:D) {
    theta[d] ~ dirichlet(alpha);
  }

  for (k in 1:K) {
    for (m in 1:(V - 1)) {
		zeta[k,m] ~ beta(gamma1[m],gamma2[m]);
    }
  }

  for (d in 1:D) {
    vector[V] eta;
    eta = beta[1] * theta[d, 1];

    for (k in 2:K) {
      eta = eta + beta[k] * theta[d, k];
    }
    n[d] ~ multinomial(eta);
  }

}

[edit: escape model code]

Bob_Carpenter · May 19, 2018, 11:03pm

Was there no other error message about what was failing? Are you using the latest version of Stan (we used to swallow some messages, and may still be swallowing some in RStan in 2.17).

Otherwise, you need to track through your program and figure out where the target density goes off the rails. Just drop in print statements like print("a ", target()); to print out the target density. It should always be finite.

Topic		Replies	Views
Generalized Dirichlet Distribution as a prior Modeling	16	2650	January 7, 2022
Two parameter distribution over the simplex Modeling	23	2808	April 13, 2018
Dirichlet distribution example not working Modeling	3	689	June 28, 2022
Generalized gamma distributions Developers specification	11	1725	December 7, 2023
Model Dirichlet proportions of a lognormal variable Modeling specification	0	298	December 7, 2022

Generalized Dirichlet Distribution

Related topics