Errors thrown with von Mises distribution

Hi all,

I’m trying to fit a slightly fiddly regression model and am coming up against a couple of error messages, described below. Briefly, the data I’m trying to model are the angular distances between the true orientation of an arrow and the orientation recalled by subjects from memory. This error should be roughly von Mises distributed, centered on 0, and I’m interested in whether the precision changes across the different experimental conditions; so I’ve put a linear model on kappa, containing an average effect of each condition (alpha) plus varying effects for each subject in each condition (beta). The link is a softplus function, log(1 + exp(x)).
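
In symbols, the model I have in mind is roughly

$$
y_n \sim \text{vonMises}\!\left(\mu_{s[n]},\ \kappa_{s[n],t[n]}\right), \qquad \kappa_{s,t} = \log\!\left(1 + e^{\alpha_t + \beta_{s,t}}\right),
$$

where $s[n]$ and $t[n]$ index the subject and condition on trial $n$ (full Stan code below).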

When I try to fit this I get two error messages, each printed lots of times:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Exception: Error in function boost::math::cyl_bessel_i<double>(double,double): numeric overflow (in 'C:/Users/hfleming/AppData/Local/Temp/RtmpCij6ml/model-6f804b2043c7.stan', line 81, column 4 to column 64)

and

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: kappa_regression_model_namespace::log_prob: kappa[sym1__, sym2__] is nan, but must be greater than or equal to 0 (in 'C:/Users/hfleming/AppData/Local/Temp/RtmpCij6ml/model-6f804b2043c7.stan', line 54, column 2 to column 39)

I don’t really understand why I’m getting these messages - there’s obviously something wrong with the way it’s sampling kappa, but as far as I can tell all the constraints that should be there are in place, and all the priors are proper. Fiddling with the priors doesn’t seem to help, nor does specifying initial values in cmdstan. If I leave the model running for long enough it eventually starts sampling with no errors from that point on, but I still feel very uneasy about trusting the fit if I don’t understand the errors!

Would anyone be able to explain what’s causing these messages, and what I can do to fix this? Happy to add some sample data if it helps.

Thanks very much in advance,
Hugo

functions {
 
  /* scale_r_cor (from brms)
   * compute correlated group-level effects
   * Args:
   *   z: matrix of unscaled group-level effects
   *   SD: vector of standard deviation parameters
   *   L: cholesky factor of the correlation matrix
   * Returns:
   *   matrix of scaled group-level effects
   */
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // returns effects in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }

}
  
data {
  int<lower=1> N; //Number of trials
  int<lower=1> Nsubs; //Number of subjects
  int<lower=1, upper=6> Ntreats; //Number of treatments to model
  int<lower=1, upper=Nsubs> subjn[N]; //Index of subject numbers on each trial
  int<lower=1, upper=Ntreats> treatment[N]; //Index of treatment on each trial
  vector<lower=-pi(), upper=pi()>[N] y; //Outcome to model (response errors)
}


parameters {
  real<lower=-pi(),upper=pi()> I_bar; //mean of the population distribution from which subject mean errors are drawn
  real<lower=0> I_kappa; //concentration of the population distribution of subject mean errors

  vector<lower=-pi(),upper=pi()>[Nsubs] I; //each subject's mean error
  
  vector[Ntreats] alpha; //average treatment effect (intercept)

  matrix[Ntreats,Nsubs] z_sub; //standardised group-level effects
  vector<lower=0>[Ntreats] sigma_sub; //sd of each treatment effect across subjects
  cholesky_factor_corr[Ntreats] L; //cholesky factor of correlation matrix
}

transformed parameters {
  matrix[Nsubs,Ntreats] beta; //actual group level effects (deviation from average effect)
  matrix<lower=0>[Nsubs,Ntreats] kappa; //concentration for each subject in each treatment

  //recentre the matrix of correlated effects
  beta = scale_r_cor(z_sub, sigma_sub, L); 

  for (sub in 1:Nsubs){
    for (treat in 1:Ntreats){
      kappa[sub,treat] = log1p_exp(alpha[treat] + beta[sub,treat]);
    }
  }

}

model {
  //Mean error part
  I_bar ~ von_mises(0,0.5);
  I_kappa ~ exponential(1);
  I ~ von_mises(I_bar,I_kappa);
  
  // Precision part
  alpha ~ normal(7,5);
  to_vector(z_sub) ~ std_normal();
  sigma_sub ~ exponential(1);
  L ~ lkj_corr_cholesky(2);
  
  // Data
  for (n in 1:N){
    y[n] ~ von_mises(I[subjn[n]], kappa[subjn[n],treatment[n]]);
  }
}

generated quantities {
  
  vector[N] log_lik;

  for (n in 1:N){
    log_lik[n] = von_mises_lpdf(y[n] | I[subjn[n]], kappa[subjn[n],treatment[n]]);
  }
  
}


Yeah, that kappa parameter blows up boost when its value is too high; when I use the von Mises I switch to a normal with sd = sqrt(1/kappa) whenever I encounter kappa > 10.
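
Something like this sketch (the function name von_mises_safe is just illustrative, and the 10 cutoff is a rule of thumb rather than anything principled):

functions {
  //von Mises log density that falls back to a normal approximation
  //for large kappa, where boost's modified Bessel function overflows
  real von_mises_safe_lpdf(real y, real mu, real kappa) {
    if (kappa > 10) {
      return normal_lpdf(y | mu, sqrt(1 / kappa));
    }
    return von_mises_lpdf(y | mu, kappa);
  }
}

and then the likelihood line becomes y[n] ~ von_mises_safe(I[subjn[n]], kappa[subjn[n],treatment[n]]);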

Hi Mike, thanks - yes, that’s fixed the first message. I’m still getting the message about kappa being NaN though - any thoughts on what might cause that? Possibly over/underflow through the log1p_exp() link function?

Very possibly; what’s the motivation for that link? I’ve always used just a log-link for kappa in the past.

Oops, I mean that I’ve always done kappa = exp(...)

Yeah, I had a look at that one but wasn’t particularly happy with the long right tail it implies for the prior on kappa - possibly it’ll just get swamped by the data anyway, but I saw that the softplus was an option offered in brms so thought I’d give it a go.

But I think you have things backwards. You want kappa to never be a negative value, but log1p_exp() is computing a log at the end, which can yield negative values.

Sorry, I can’t math apparently. The 1+ part of log1p_exp() does indeed ensure positivity.

(well, combined with the exp part)

Was just double-checking myself! But yes, the attraction was that it’s basically linear for x > 5 or so, though it does look a bit funky when x gets closer to 0.
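
For example, log(1 + exp(5)) ≈ 5.007, log(1 + exp(0)) ≈ 0.69, and log(1 + exp(-5)) ≈ 0.007 - so it tracks x almost exactly above 5 or so, but flattens smoothly towards 0 below that.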

What I think I may do is try with the log link and see if that fixes the remaining error message - will post back if that does indeed help.
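
i.e. swapping the link line in transformed parameters to something like:

kappa[sub,treat] = exp(alpha[treat] + beta[sub,treat]); //log link instead of softplus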

Hi @mike-lawrence (or indeed anyone else who might be able to help!) - thanks very much for your advice yesterday. I’ve now tried running the model with the log link instead of the softplus, and I’ve also tried really narrowing the implied prior over kappa so that essentially all of the probability density is < 10, but unfortunately still no luck.

Interestingly, I do seem to have narrowed down the problem - as recommended, I added the option to switch to a normal distribution when kappa > 10, and I then started getting messages about the scale parameter for the normal being 0 when it should be positive, which suggests the underlying value of kappa was enormous (sd = sqrt(1/kappa) can only be exactly 0 if kappa has overflowed to infinity). Similarly, I’ve tried putting an arbitrary upper bound on kappa, and I then get error messages saying the proposal for kappa was around 200 or even more.

The thing is, I just don’t understand why it’s trying to sample such large values of kappa: although (without the upper bound) such large values of kappa are theoretically possible, they should be vanishingly unlikely under my priors - in prior predictive simulations I can do 30000 draws and the largest value of kappa I get is 35.
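
For concreteness, the prior simulation I’m doing is essentially this standalone generated quantities program (a simplified sketch using the original priors and ignoring the correlation structure), run for 30000 draws:

generated quantities {
  real alpha_sim = normal_rng(7, 5); //average treatment effect
  real sigma_sim = exponential_rng(1); //between-subject sd
  real beta_sim = normal_rng(0, sigma_sim); //subject-level deviation
  real kappa_sim = log1p_exp(alpha_sim + beta_sim); //implied prior draw of kappa
}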

Is there something I’m missing? Although the above model does eventually seem to sample normally once it gets past these initial errors, a slightly expanded model I’m trying seems to get stuck and never gets past the errors at the beginning.


This looks like the prior conflicts with the data - if you have enough data, it will overwhelm even the narrowest of priors. So the thing to investigate is why the data “wants” to be there so badly. I unfortunately don’t understand the model very well, but here are some strategies to look at:

  • What happens with a subset of the data? What if you simplify the model further (e.g. just one subject, or just one treatment)?
  • von Mises has issues if the mean is near the boundary, where the mapping to the unconstrained space fails (i.e. in your case -pi() should be a neighbour of pi() - epsilon, but on the unconstrained scale they are infinitely far apart). Are you sure the model is not trying to predict across the boundary? (One possible workaround is sketched below.)
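
If the boundary does turn out to be the problem, one common workaround is to parameterise the circular mean as a point on the unit circle rather than as a constrained interval. A rough sketch for your I parameter (untested, and ignoring the hierarchical prior):

parameters {
  unit_vector[2] I_vec[Nsubs]; //each subject's mean error as a point on the circle
}
transformed parameters {
  vector[Nsubs] I;
  for (s in 1:Nsubs) {
    I[s] = atan2(I_vec[s][2], I_vec[s][1]); //recover the angle in (-pi, pi]
  }
}

This makes -pi() and pi() neighbours on the unconstrained scale, so the sampler can move freely across the boundary.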

Best of luck with the model!
