Model fails to initialize if I use more than a few observations - possible underflow issue?

I am developing a meta-analytic model that implements a new inference procedure for count-based culpability crash studies. However, I have encountered an issue that I suspect is related to numerical underflow, and I hope someone can point me in the right direction.

The research question

Briefly put, a culpability study involves a sample of crash-involved drivers split into four groups. The split is along two dimensions: those who were and were not measured as positive on some factor (e.g. a positive blood alcohol concentration), and those who were and were not judged "culpable" for the accident (scored by researchers who are given information on the crash, but not on the driver's substance use). The assumption made is that the non-culpable drivers are a random sample of drivers on the road.

If the substance use had no impact on the risk of culpable accidents, the probability distribution across the four cells would be easy: define a culpability share (culp_share) and a "substance-positive" share of drivers on the road (road_share), and the probabilities would be culp_share * road_share, (1-culp_share) * road_share, and so on. Now assume that the average accident risk in the substance-positive group (culpable and non-culpable combined) increases by a factor alpha that multiplies the baseline accident risk. This increases the share of culpable, substance-positive counts and reduces the probabilities in the other three cells: those cells have their "original" shares multiplied by a factor 1/(alpha * road_share + (1-road_share)), and the substance-positive culpable share becomes 1 - sum(remaining three cells), as spelled out below.
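In symbols (writing $r$ for road_share, $c$ for culp_share, and using the column order of the data below), this construction gives the cell probabilities

$$
D = \alpha r + (1 - r), \qquad
p_1 = \frac{(1-c)(1-r)}{D}, \quad
p_2 = \frac{(1-c)\,r}{D}, \quad
p_3 = \frac{c\,(1-r)}{D}, \quad
p_4 = 1 - p_1 - p_2 - p_3 = \frac{r\,(c + \alpha - 1)}{D}.
$$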

The model draws parameter values alpha (called crash_risk in the code below), road_share and culp_share, calculates the cell probabilities, and models the counts from actual studies as a multinomial draw with these probabilities. With multiple studies, I use parameter vectors with the same prior distribution (I'm working toward a hierarchical model, but reverted to this simpler model to first figure out the error I'm encountering).

The Stan model error

To test the model I use simulated data with multiple random "studies" drawn from the same multinomial probabilities. If I give it a single study, the model estimates nicely, though each chain usually prints a couple of "Rejecting initial value: Error evaluating the log probability at the initial value." messages before it gets started. On my work computer (a Windows PC) it seems to manage three or four studies at a time, but anything more and initialization fails. On my home laptop (a MacBook Pro) it manages more (once in a while it is able to take 20 studies). When it fails, I get the error info:

```
Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
```

I think the problem may be that the share of substance-positive drivers on the road is very low (for cannabis, the crude rates for "non-culpable" counts across 6 studies range from 1.3% to 4.2%). My hunch is that low probabilities multiplied by low probabilities cause an underflow that gives some cells zero probability, messing up the multinomial likelihood, and that the chance of this happening in at least one study is high enough to stop the model from working.

Is there some standard “trick” for working around or avoiding this issue? I’ve thought about using some inv_logit expression for the probabilities - but I’ve never used a Jacobian adjustment for the transform and don’t know how I would then define the prior.

Model code

```
data {
  int<lower=0> study_count_culp;
  int<lower=0> culp_studies[study_count_culp, 4]; // Study counts. Column order: nonculp and thc-neg, nonculp and thc-pos, culp and thc-neg, culp and thc-pos
}
parameters {
  vector<lower=0, upper=1>[study_count_culp] road_share;
  vector<lower=0>[study_count_culp] crash_risk;
  vector<lower=0, upper=1>[study_count_culp] culp_share;
}
transformed parameters {
  real temp_denom; // denominator for adjusting cell probabilities
  simplex[4] culp_study_shares[study_count_culp];

  // Calculate cell probabilities for culpability crash studies.
  for (i in 1:study_count_culp) {
    temp_denom = 1 - road_share[i] * (1 - crash_risk[i]);

    culp_study_shares[i, 1] = (1 - culp_share[i]) * (1 - road_share[i]) / temp_denom;
    culp_study_shares[i, 2] = (1 - culp_share[i]) * road_share[i] / temp_denom;
    culp_study_shares[i, 3] = culp_share[i] * (1 - road_share[i]) / temp_denom;
    culp_study_shares[i, 4] = 1 - sum(culp_study_shares[i, 1:3]);
  }
}
model {
  // Set the priors
  road_share ~ normal(0.03, 0.02);
  crash_risk ~ normal(1.4, 0.5);
  culp_share ~ normal(0.6, 0.2);

  for (i in 1:study_count_culp) {
    culp_studies[i, ] ~ multinomial(culp_study_shares[i]);
  }
}
```

Simulated data

Repeated samplings from the same multinomial distribution, with probabilities based on alpha = 1.4, road_share = 0.03 and culp_share = 0.6. For now, these are the same values as the means of the priors. I've included 50 "studies" (each study is one line with four counts). The matrix is attached as a csv file - just choose the number of lines to read into matrix form and feed into the model as data.
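For reference, plugging these values into the cell formulas above gives $D = 1.4 \cdot 0.03 + 0.97 = 1.012$ and cell probabilities of roughly $(p_1, p_2, p_3, p_4) \approx (0.383, 0.012, 0.575, 0.030)$.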

Set-up
Using the latest versions of Stan and R - both updated a couple of days ago. The issue appears on both Mac and Windows (but the Mac handles more studies before failing).

sim_data.csv (691 Bytes)

If the problem is underflow, you could try moving things to the log scale.

for instance, instead of

```
temp_denom = (1-road_share[i]*(1-crash_risk[i]));
...
culp_study_shares[i,1] = (1-culp_share[i])*(1-road_share[i])/temp_denom;
...
```
you could write
``` 
log_temp_denom = log1m_exp( log_road_share[i] + log1m_exp(log_crash_risk[i]) );
...
log_culp_study_shares[i,1] = log1m_exp( log_culp_share[i]) + log1m_exp( log_road_share[i]) -  log_temp_denom;
...
```

The priors for log_road_share, log_crash_risk, and log_culp_share would then of course need an upper bound of zero, and in the model section it would need to be
```
culp_studies[i,] ~ multinomial(exp(log_culp_study_shares[i]));
```
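(If you would rather keep the priors on the natural scale, you could instead add a Jacobian adjustment for the exp transform - a rough sketch for road_share alone, assuming road_share = exp(log_road_share):)
```
parameters {
  vector<upper=0>[study_count_culp] log_road_share;
  // ... other log-scale parameters ...
}
model {
  // prior stated on the natural scale, road_share = exp(log_road_share)
  target += normal_lpdf(exp(log_road_share) | 0.03, 0.02);
  // Jacobian of the exp transform: log|d/dx exp(x)| = x
  target += sum(log_road_share);
  // ...
}
```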

Also, my understanding is that in meta-analytic models the error variances are set to the variance of each study's (effect-size) estimate, to ensure that larger studies have a stronger influence.

Guido

Thanks for the response!

By following this advice I got rid of the "Rejecting initial value" problem. Note, though, that the denominator can't be "logged" the way suggested: (1 - crash_risk[i]) will be negative whenever the substance increases crash risk, so log1m_exp(log_crash_risk[i]) is undefined there. However, the point of the denominator was just to normalize the cell probabilities to sum to 1, and this is easily achieved by computing the numerators on the log scale and passing the softmax of the vector of log numerators into the multinomial. This also means I needed to calculate the numerator of the last cell directly; to make that work with log values I imposed the constraint crash_risk > 1, ensuring that (crash_risk[i] - 1) is positive.

To avoid having to do Jacobian adjustments and figure out priors for probabilities on the log scale, I continue to define the priors as before and compute the logged values in the transformed parameters block as follows:

```
transformed parameters {
  vector<upper=0>[study_count_culp] log_road_share;
  vector<upper=0>[study_count_culp] log_culp_share;

  vector[4] culp_study_shares_log[study_count_culp];
  simplex[4] culp_study_shares[study_count_culp];

  log_road_share = log(road_share);
  log_culp_share = log(culp_share);

  // Calculate cell-probabilities for culpability crash studies.
  for (i in 1:study_count_culp) {
    culp_study_shares_log[i, 1] = log1m_exp(log_culp_share[i]) + log1m_exp(log_road_share[i]);
    culp_study_shares_log[i, 2] = log1m_exp(log_culp_share[i]) + log_road_share[i];
    culp_study_shares_log[i, 3] = log_culp_share[i] + log1m_exp(log_road_share[i]);
    culp_study_shares_log[i, 4] = log_sum_exp(log_culp_share[i] + log_road_share[i],
                                              log_road_share[i] + log(crash_risk[i] - 1));

    culp_study_shares[i] = softmax(culp_study_shares_log[i]);
  }
}
```
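For completeness, a sketch of the matching parameters and model blocks, assuming everything else stays as in the original post; the only change is the lower bound of 1 on crash_risk mentioned above:

```
parameters {
  vector<lower=0, upper=1>[study_count_culp] road_share;
  vector<lower=1>[study_count_culp] crash_risk; // lower=1 keeps crash_risk - 1 positive
  vector<lower=0, upper=1>[study_count_culp] culp_share;
}
model {
  road_share ~ normal(0.03, 0.02);
  crash_risk ~ normal(1.4, 0.5); // now effectively truncated below at 1
  culp_share ~ normal(0.6, 0.2);

  for (i in 1:study_count_culp) {
    // softmax already returned a simplex, so this is a valid multinomial
    culp_studies[i, ] ~ multinomial(culp_study_shares[i]);
  }
}
```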

Ole