Chain finishing unexpectedly

Hello. I’m having problems when trying to fit the following model


Essentially I’m trying to fit a Multinomial-Dirichlet mixture to estimate the transitions probabilities of a Galton-Watson model

data {
  int<lower = 0> N; //Number of generations
  array[5,N] int<lower=0> Z; // Observed counts per generation per type
}

parameters {
  simplex[5] p1;
  simplex[5] p2;
  simplex[5] p3;
  simplex[5] p4;
  simplex[5] p5;  //Transition probabilities from type i to type j
}

model {
  //Priors
  p1 ~ dirichlet([1,1,1,1,1]);
  p2 ~ dirichlet([1,1,1,1,1]);
  p3 ~ dirichlet([1,1,1,1,1]);
  p4 ~ dirichlet([1,1,1,1,1]);
  p5 ~ dirichlet([1,1,1,1,1]);
  
  //Likelihood
  for (n in 1:N-1) {
    vector[5] probabilities_for_next_gen;
    array[5] int counts_for_next_gen;
    
    //Calculate probs for the next generation
    for (i in 1:5) {
      probabilities_for_next_gen[i] = p1[i] * Z[1,n] + p2[i] * Z[2,n] +
                                      p3[i] * Z[3,n] + p4[i] * Z[4,n] +
                                      p5[i] * Z[5,n];
      
    } 
    
    //Normalize to make them probablities
    probabilities_for_next_gen /= sum(probabilities_for_next_gen);
    
    //Setting up the counts for the multinomial distribution
    for (i in 1:5) {
      counts_for_next_gen[i] = Z[i, n+1];
    }
    
    //Multinomial distribution with observed counts and calculated probs
    counts_for_next_gen ~ multinomial(probabilities_for_next_gen);
    
  }
}

generated quantities {
  matrix[5,5] P;  //Transition probability matrix
  P[,1] = p1;
  P[,2] = p2;
  P[,3] = p3;
  P[,4] = p4;
  P[,5] = p5;
}

Then I’m fitting the model using CmdStanR, the code is the following, I’ll also include the synthetic data I’m using to test the model

#Probs de transición reales
transition_probs <- matrix(c(
  0.2, 0.2, 0.2, 0.2, 0.2,  
  0.2, 0.2, 0.2, 0.2, 0.2,  
  0.2, 0.2, 0.2, 0.2, 0.2,  
  0.2, 0.2, 0.2, 0.2, 0.2,  
  0.2, 0.2, 0.2, 0.2, 0.2  
), byrow = TRUE, nrow = 5)

#Poblaciones iniciales para cada tipo

initial_counts <- c(100,100,100,100,100)

#Generaciones
N <- 10

#Matriz de poblacion
Z <- matrix(nrow = 5, ncol = N)
Z[,1] <- initial_counts

#Generamos los datos
set.seed(182120) #Datos reproducibles
for (n in 1:(N-1)) {
  for (i in 1:5) {
    #Calculamos el total de incividuos del tipo i en la n-ésima generación
    total_individuals_type_i <- Z[i,n]
    
    #El count de cada tipo en la sig. generación es muestreado de una miltinomial
    Z[,n+1] <- rmultinom(1, size = total_individuals_type_i,
                         prob = transition_probs[i,])
  }
}
Z <- t(Z)

## Fit del modelo --------------------------------------------------------------
fit_5tipos <- mod_5types$sample(data = list(N = N, Z = t(Z)),
                                seed = 182120, chains = 1,
                                parallel_chains = 1,
                                iter_warmup = 5000,
                                iter_sampling = 10000)

The error message I’m getting is: Warning: Chain 1 finished unexpectedly!
I’m also getting the same message: Chain 1 Exception: multinomial_lpmf: Probabilities parameter is not a valid simplex

The second message makes me think it has to do something related to the numerical stability of the posterior parameters, what I mean is that the estimation process rounds the posterior transition probabilities to numerical ceros and that’s why I’m getting those error messages, but I’m not really sure.

Is there a way to fix the code so that the chain converges to a posterior distribution?

Yes, what’s going to happen is that if you keep throwing exceptions during initialization, the chain won’t be able to initialize and it won’t be able to finish. You should’ve gotten a line number with that error message, but I see there’s only one multinomial sampling line.

I would suggest trapping and printing the simplexes:

print("probs[", i, "] = ", probabilities_for_next_gen, ";  sum = ", sum(probabilities_for_next_gen);
print("counts = ", counts_for_next_gen);

Then you can see how bad things are. The way I can see this being a problem is if you have zero values in the simplex and non-zero value in the counts.

If the probability-for-next-gen is a mixture, it might help to write it that way. Otherwise, I don’t see why you’d expect these to add up to 1.

P.S. Whenever you have something like this:

I’d suggest converting to an array:

parameters {
   array[5] simplex[5] p;
}

You could write the Dirichlet priors as:

transformed data {
  vector[5] alpha = rep_vector(1, 5);
...
model {
  p ~ dirichlet(alpha);

But if you have a uniform Dirichlet like this, you can just get rid of it. It doesn’t add anything to the posterior. You can also simplify a lot of the intermediate code with vectorization. For instance, in generated quantities, you can just use

matrix[5, 5] P = [p1', p2', p3', p4', p5']';

but this’d be simpler by redefining the types.

1 Like