Rootfinding of inverse-CDF of Gaussian mixture with Newton solver

I have a series of sample quantiles with known probabilities, and I know the quantiles were calculated from a sample from a Gaussian mixture distribution. I assume the quantiles follow a normal distribution where the inverse-CDF of the Gaussian mixture is the mean and the variance covariance matrix is also a function of the inverse-CDF. I’m trying to use a Newton solver to calculate the inverse-CDF. I get the follow error, however.

Chain 1 Unrecoverable error evaluating the log probability at the initial value. 
Chain 1 Exception: The linear solver’s setup function failed in an unrecoverable manner. (in 'C:/Users/SPENCE~1/AppData/Local/Temp/RtmpiWuE4p/model-5c68512547fb.stan', line 62, column 4 to column 69) 

I think there could be a number of issues, including not using the correct datatypes, but I’m unable to find useful examples of how to do it properly. My STAN program is below, as well as some R code that can generate the data I’m trying to fit.

  vector GMInv_CDF(vector Q, real p, vector pi, vector mu, 
                   vector sigma) {
    vector[1] z;
    real cdf;
    // this encodes the cdf from which we want to sample
    cdf = pi[1]*normal_cdf(Q[1] | mu[1], sigma[1]) +
          pi[2]*normal_cdf(Q[1] | mu[2], sigma[2]);
    z[1] = cdf - p;
    return z;
  real GM_PDF(vector mu, vector sigma, vector pi, vector y) {
    real density;
    density = pi[1]*exp(normal_lpdf(y[1] | mu[1], sigma[1])) + 
              pi[2]*exp(normal_lpdf(y[1] | mu[2], sigma[2]));
    return density;

data {
  int<lower=0> N;
  row_vector[N] Q;
  vector<lower=0, upper=1>[N] p;
  int<lower=1> n_components;
  real m;
  real<lower=0> c; 
  real<lower=0> sv; // sd prior sd
  real<lower=0> nv; // n prior sd
  vector<lower=0>[n_components] alpha;

parameters {
  vector[n_components] mus;
  vector<lower=0>[n_components] sigmas;
  real<lower=0> n;
  vector<lower=0,upper=1>[n_components] pi;

transformed parameters {
  vector[1] Qi[N];
  matrix<lower=0>[N, N] q_var;
  vector[n_components] pit;

  pit = softmax(pi);
  for (i in 1:N) {
    vector[1] Q_guess = [Q[i]]';
    Qi[i] = solve_newton(GMInv_CDF, Q_guess, p[i], pit, mus, sigmas);
  for (i in 1:N) {
    for (j in 1:i) {
      q_var[i, j] = (fmin(p[i],p[j])*(1 - fmax(p[i],p[j]))) / 
                    (n*GM_PDF(mus, sigmas, pit, Qi[i])*
                     GM_PDF(mus, sigmas, pit, Qi[j]));
      q_var[j, i] = q_var[i, j];


model {
  pi ~ normal(4,5);
  mus ~ normal(5,4);
  sigmas ~ normal(0,4);
  n ~ normal(0, nv);

  Q ~ multi_normal(Qi, q_var);

#sample size
n <- 400
#make mixture distribution
mdist <- UnivarMixingDistribution(Norm(1, 1.1), 
                                  Norm(10.5, 1.1),
                                  mixCoeff = c(.4, .6))

#sample from mixture distribution
y <- r(mdist)(n)

#known quantile probabilities
probs <- c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)

#calculate quantiles from sample for given probabilities. this will be the "data" to model
quantiles <- quantile(y, probs = probs)

#data list for STAN program
dat <- list(
  N = length(quantiles),
  Q = quantiles,
  p = probs,
  n_components = 2,
  m = 2,
  c = 3,
  sv = 3,
  tv = 1,
  nv = 3000,
  alpha = rep(.1,2)

I’m not sure what’s going on here, but the error’s definitely from the solver and it’s causing things not to initialize. I’m pinging @charlesm93, who might be able to help.

Were you able to get any kind of root finding working? I would always recommend working up from simpler cases if possible.

Softmax overall can be problematic in terms of identifiability (the input is only identified up to an additive constant) if you don’t do something like pin one of the values to 0 or enforce a sum-to-zero constraint on all of the entries.

P.S. Is there something on Discourse inserting vertical space into people’s program? If not, I don’t. understand why everyone using so much space. Also, there’s a lot of redundancy here, where you can just do this:

vector GMInv_CDF(vector Q, real p, vector pi, vector mu, 
                   vector sigma) {
  return [pi[1] * normal_cdf(Q[1] | mu[1], sigma[1])
          + pi[2] * normal_cdf(Q[1] | mu[2], sigma[2])
          - p]';
real GM_PDF(vector mu, vector sigma, vector pi, vector y) {
  return pi[1] * exp(normal_lpdf(y[1] | mu[1], sigma[1]))
      + pi[2] * exp(normal_lpdf(y[1] | mu[2], sigma[2]));

If you want to add intermediate assignments, make them one-liners that declare and define on one line.

You can also help scale parameters to be standard normal on the unconstrained scale by using the location and scale of priors as offset and multiplier:

vector<offset=5, multiplier=4>[n_components] mus;

to match

mus ~ normal(5, 4);

That way, if mus really does have something like a normal(5, 4) distribution in the posterior, it’ll be scaled back to standard normal with this offset and multiplier.

1 Like

Looks like an error message from KINSOL, which implements the Newton solver Stan uses. There are a few things that might be going on, so let’s try to rule some of them out.

  1. One of the function’s input is not properly initialized. Here, I recommend running the Markov chain for a single iteration and adding a print statement inside of GMInv_CDF to see if the function gets evaluated or returns NaN.

  2. The algebraic equation is singular and cannot be solved. It could be good to see if you reproduce the error using another solver, for example, the powell solver, which you can call using solve_powell. This solver uses a different algorithm and doesn’t rely on KINSOL. You can also write the equation in R and solve it with a rootfinder, for some fixed parameter values. If the solver crashes across all three solver, then the problem may be ill-posed.

I’d start here, and based on what you find out, we can try other things.

1 Like