Error initializing random slopes, but only for binomial model

Hi, I am using rstan and running into an Initialization failure that I cannot understand. I am trying to fit a model to some grouped count data that depend on age.

The data in R is the following

meanAge <- c(4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 90.0,
             38.5, 64.5, 74.5, 85.0, 10.0, 24.5, 34.5, 44.5, 54.5, 64.5,
             74.5, 85.0 , 4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5,
             84.5, 90.0 , 4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5,
             85.0, 2.0, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5)

studyNum <- c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 5, 5, 5, 5, 5, 5, 5,
              5, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6,
              1, 1, 1, 1, 1, 1, 1, 1, 1)

Deaths <- c(0, 0, 4, 8, 22, 53, 84, 145, 170, 67, 134, 241, 572,
            1280, 5, 24, 89, 208, 632, 1542, 2844, 4857, 0, 0,
            0, 0, 0, 149, 504, 1656, 2698, 1162, 2, 6, 33, 73,
            277, 922, 2515, 6436, 17474, 0, 0, 1, 5, 12, 24, 67, 225, 514)

Hosp <- c(26, 8, 97, 211, 352, 515 , 533, 451, 313, 128, 2896, 1621,
          2158, 3346, 789, 2149, 4660, 6303, 9724, 12236, 13543, 15370, 54, 39,
          180, 351, 862, 2072, 2803, 4037, 3816, 1314, 313, 339, 1911, 4722,
          10353, 16553, 19712, 23243, 29517, 24, 18, 77, 198, 274, 451, 499,
          587, 746)

letalityData <- list(N=length(studyNum), K=length(unique(studyNum)),
                          group=studyNum, ageVec=meanAge, successes=Deaths,
                          totalCount=Hosp, propVec=Deaths/Hosp))


I want to model the number of deaths among hospitalized people as a binomial process, using a generalized linear model where age is the predictor. I also want to allow the intercept and slope to vary among studies.

data {
  int<lower=0> N;                     // number of observations
  int<lower=0> K;                     // number of groups
  int<lower=1, upper=K> group[N];
  vector[N] ageVec;                   // predictor
  int successes[N];
  int totalCount[N];
}
parameters {
  vector[K] groupSlope; // vector with the slopes of each study
  vector[K] groupIntercept; // vector with intercept of each study
  real ageSlope; // mean slope across studies
  real intercept; // mean intercept across studies
  real<lower=0> ageSlopeSigma;
  real<lower=0> interceptSigma;
}
transformed parameters{
  vector[N] successProb; // define the probability of death
  successProb = Phi(groupIntercept[group] + groupSlope[group] .* ageVec);
}
model {
  // Priors
  intercept ~ normal(0, 2);
  interceptSigma ~ exponential(1);
  ageSlope ~ normal(0, 1);
  ageSlopeSigma ~ exponential(1);
  // Individual studies parameters
  groupIntercept ~ normal(intercept, interceptSigma);
  groupSlope ~ normal(ageSlope, ageSlopeSigma);
  // binomial model
  successes ~ binomial(totalCount, successProb);
}

When I try to fit the above model to the data, I get the following error

[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
error occurred during calling the sampler; sampling not done

I’ve tried a lot of things to no avail. I think the most informative one is that if I try to fit directly to the proportion of deaths with a normal distribution everything works. For that I just add the propVec variable to the data, add a residualSigma to the parameters, and substitute the last Stan line of the model for:

  residualSigma ~ exponential(1);
  propVec ~ normal(successProb, residualSigma);

Also, the model works fine when I only allow the intercepts to vary among groups, but it does not work if only the slopes vary among groups. I would greatly appreciate some help, since I’m new to Stan.

Thanks

Hi, when I run your example above using cmdstan it never starts sampling since it can’t find a good initial value:

Chain 1 Initialization between (-2, 2) failed after 100 attempts. 
Chain 1  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Chain 1 Initialization failed.

Thanks. I am using rstan and did not get that more detailed message. I started tinkering with the model, setting initial values, but I found that the error disappeared if I substitute the Phi link function for an inv_logit. This works for my purposes, but would be good to get a better understanding of how to debug these problems.

1 Like

An update on the problem: It got fixed when I standardized the predictor ageVec outside of Stan. The problem seems to be that the initial values were way off for the slope, due to the scale of the ageVec vector.