Parameter Conditionally Existent on Another Parameter

Hello everyone,

In essence, I am trying to make a massive model comparison job tractable.

I have a model, myModel, that I want to use to describe some data y. I want the model to behave in a piecewise fashion, changing its behaviour between breakpoints defined in u. However, I do not know in advance how many breakpoints there are (this is part of what I am interested in, along with their positions).

Hence, I have an integer parameter n (the number of breakpoints) that can take any value between 1 and max_n. Since n is the number of breakpoints, only the first n values of u are relevant.

As n is an integer, it has to be marginalized out, hence the sum over the log likelihoods.

My question is whether this is a valid way of using the values of u conditionally (based on n), or whether it is going to cause havoc with sampling and convergence, because the values of u at indexes greater than n have no impact on the likelihood even though they are still sampled.

data {
  int<lower=1> max_n;   // Maximum possible value of n
  int<lower=1> n_data;  // Number of data points
  vector[n_data] y;     // Observed data
}

parameters {
  vector[max_n] u;                // Independent uniform samples for the maximum n
  real<lower=0, upper=10> sigma;  // Observation noise
}

transformed parameters {
  vector[max_n] log_lik;  // Log-likelihood for each possible n

  for (n in 1:max_n) {
    vector[max_n] full_vector;

    for (i in 1:max_n) {
      if (i <= n) {
        full_vector[i] = u[i];
      } else {
        full_vector[i] = 301;  // Placeholder for infinity
      }
    }

    // Compute the likelihood given the data and full_vector
    // Normal likelihood with mean from myModel, which takes the padded breakpoint vector
    log_lik[n] = 0;
    for (i in 1:n_data) {
      log_lik[n] += normal_lpdf(y[i] | myModel(full_vector), sigma);  // Example likelihood
    }
  }
}

model {
  // Uniform prior on the parameters u
  u ~ uniform(0, 1);

  sigma ~ uniform(0, 10);

  // Marginalize over all possible n values using the likelihood
  target += log_sum_exp(log_lik);
}

Hi @DASface42 and welcome to the Stan forums. Sorry it’s taken so long to answer your first post.

I’m afraid Stan doesn’t let you do that directly by literally changing the number of parameters.

I think your best bet, both computationally and in terms of being easy to explain, will be to fit a separate model for each number of change points and then compare them with leave-one-out cross-validation (LOO).
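If you go that route, each fixed-size model just needs to expose pointwise log likelihoods so the loo package can consume them. A minimal sketch of the generated quantities block, assuming the same myModel, u, and sigma as in your program, with the number of breakpoints fixed for that model:

generated quantities {
  vector[n_data] log_lik;  // pointwise log-likelihood, one entry per observation
  for (i in 1:n_data) {
    log_lik[i] = normal_lpdf(y[i] | myModel(u), sigma);
  }
}

You fit one such model per candidate number of breakpoints, extract the log_lik draws, and compare the fits with loo() and loo_compare() in R.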

You can do it with one model if you’re willing to define a mixture of mixtures. That is, if you think 5 is the maximum number of change points, you define a mixture model where the mixture components are the 5 change point model, the 4 change point model, the 3 change point model, etc. This is just one big mixture at the top, not a mixture of each data item. This requires a lot of compute as you need to fit all of the change point models simultaneously. That will also force you to come to grips with writing a prior that perhaps penalizes a large number of components. With this approach, the mixture proportion parameter (a simplex) will have a posterior that tells you how much each model is supported by the data.
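To sketch what that top-level mixture looks like, assuming log_lik[n] is the joint data log likelihood under the n-breakpoint model (as in your transformed parameters block) and theta is a new simplex of mixture weights:

parameters {
  simplex[max_n] theta;  // weight on each candidate number of change points
  // ... u, sigma, etc. as before ...
}
model {
  vector[max_n] lp;
  for (n in 1:max_n) {
    lp[n] = log(theta[n]) + log_lik[n];  // one mixture component per change point model
  }
  target += log_sum_exp(lp);
}

The posterior on theta then tells you how much support each number of change points gets, and the prior on theta is where you could encode a preference against large numbers of components.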

Are the breakpoint values continuous or discrete? If they’re discrete as in the example in the Stan User’s Guide for coal mining disasters, then high numbers of break points become intractable to marginalize.

If you use too many breakpoints, they can collapse into each other if you don’t use a zero-avoiding prior on the distance between breakpoints.
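One common way to get a zero-avoiding prior is to parameterize the gaps between breakpoints rather than the breakpoints themselves. A minimal sketch; the lognormal hyperparameters are placeholders, and mapping the result onto the unit interval would need an extra constraint or transform:

parameters {
  vector<lower=0>[max_n] gap;  // gaps between consecutive breakpoints
}
transformed parameters {
  vector[max_n] u = cumulative_sum(gap);  // breakpoints, automatically ordered
}
model {
  gap ~ lognormal(-2, 1);  // zero-avoiding: density vanishes as a gap goes to 0
}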

In your example, that's just target += sum(log_lik); if the log_lik entries are data log likelihoods, as the doc comment says. Also, if you give a parameter like u a uniform(0, 1) distribution, then you need to declare it with matching constraints: vector<lower=0, upper=1>[max_n] u;.