Dirichlet distribution example not working

Dear stan-community,
this is my first time asking, any guidance on how to make my issue clearer is very welcome!

I’m currently working on a mixing model that is supposed to figure out the contributions P of different sources to a mixture. I have five different sources, so P will always be a vector of length 5, summing up to 1. Until now I use a workaround with an isometric log-ratio to construct the priors for P, which works fine.
I would now like to use a Dirichlet-prior directly so I tried to read up on it here: 25.1 Dirichlet distribution | Stan Functions Reference

copying the stan code

data {
   int<lower=1> K;
   real<lower=0> alpha;
 }
 generated quantities {
   vector[K] theta = dirichlet_rng(rep_vector(alpha, K));
 }

and using this R-script

library(rstan)
K <- 5
alpha <- 1
setwd("C:/mypath")
options(mc.cores = parallel::detectCores())
mystanfile <- "dirichlet_test.stan" # this is how I called the stan file
mydata <- list(K = K, alpha = alpha)
output <- stan(file = mystanfile, data = mydata, iter = 4000, chains = 3,
               warmup = 2000, algorithm = "NUTS", 
               sample_file = "sample_dirichlet_test", 
               diagnostic_file = "dignostics_dirichlet_test",
               control = list(adapt_delta = 0.9, max_treedepth = 10),
               ) 
save(output,file="dirichlet_test.RData")

resulted in this message: “Stan model ‘dirichlet_test’ does not contain samples.”
As you can probably see I am a complete beginner and unfortunately the problem is not obvious to me.
Once this is running I would like to try to get the complete model to work myself. But I fear I will have to come back to you with issues I encounter along the way.
Anyway, I’m really looking forward to any help!
Thanks a lot in advance.

This error is because you don’t have any parameters in your model - you’re essentially just calling the RNG function.

I think you probably want something more like this:

data {
   int<lower=1> K;
   real<lower=0> alpha;
 }

transformed data {
  // Specifying this in the transformed data block allows the
  // vector to be created only once, and then re-used
  vector[K] dirichlet_arg = rep_vector(alpha, K);
}

parameters {
  vector[K] theta;
}

model {
  theta ~ dirichlet(dirichlet_arg);
}
1 Like

Hi,
thanks a lot for taking the time to answer my question. Also, the comment on the transformed data block is much appreciated and something I didn’t know. Using the stan code you proposed I now get a different error.

SAMPLING FOR MODEL ‘dirichlet_test’ NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: dirichlet_lpmf: probabilities is not a valid simplex. sum(probabilities) = 2.582725846, but should be 1 (in ‘model1ccc14261d40_dirichlet_test’ at line 17)

How does it come that the Dirichlet function does not produce a valid simplex?

Hi,
changing the

parameters {
  vector[K] theta;
}

to

parameters {
  simplex[K] theta;
}

helped.
Thanks again!

1 Like