Parameters during warmup far outside prior range


#1

Dear all,

I’ve been trying to track down the following error message:
“Exception: bernoulli_logit_lpmf: Logit transformed probability parameter[2815] is nan, but must not be nan!” The model code is shown at the end of the post (since it may not be relevant to the question).

To do this, I put a print statement in the model that executes once per Stan iteration. For the parameter I’m monitoring (rewMagDist, with 5 entries), I’ve set a normal(1, 0.7) prior and a lower=0 constraint. From the print command I get the following for the first few steps leading up to output-1.csv: Iteration: 1 / 20 [ 5%] (Warmup):
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [48.5049,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [34.2568,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.978113,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.662151,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.439123,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.366358,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.383564,0,inf,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.388283,0,8.15364e+162,0,0]
output-1.csv: rewMagDist [0.391818,6.87112,1.86618,4.96533,5.44635]
output-1.csv: rewMagDist [0.390574,0,8.43455e+40,0,0]

I’ve noticed that some of the monitored values are inf or very large (8.43455e+40). Is this normal given that I’ve put a prior of normal(1,0.7)? Or does it suggest that something is wrong with my model?
[When I changed the model to print only when the nan occurs (i.e. when invTxUtil, see the model code, is nan), it always happened when the parameters were very large.]

Many thanks
Jacquie

In case this is useful, here is the model code: a prospect theory model fit to ~70 human participants with ~1000 trials per participant. I did not manage to get this to finish sampling (I gave up after 10 h with sampling still at 20%). One way I found to make it run faster is to impose hard boundaries of <lower=0,upper=2> on rewMagDist and lossMagDist; I think this helps because these parameters appear in an exponent.

~~~~
data{
  int ntr;          // number of total data points
  int nsub;         // total number of participants
  int subInd[ntr];  // participant index for each trial
  int rewM[ntr,2];  // reward magnitude
  int lossM[ntr,2]; // loss magnitude
  real prob[ntr,2]; // reward probability
  int choice[ntr];  // choice on each trial (1 for left, 0 for right)
}

parameters{
  // individual subject level
  real<lower=0> invTemp[nsub];     // inverse temperature
  real<lower=0> probDist[nsub];    // probability distortion
  real<lower=0> rewMagDist[nsub];  // reward magnitude distortion
  real<lower=0> lossMagDist[nsub]; // loss magnitude distortion
}

model{
  // transformed reward and loss probabilities and magnitudes
  real rewProbTr[ntr,2];
  real lossProbTr[ntr,2];
  real rewMagTr[ntr,2];
  real lossMagTr[ntr,2];
  real utilities[ntr,2];
  // some short-cuts
  real g;
  real invTxUtil[ntr];

  // Priors
  invTemp     ~ normal(0,1); // it's a very small number
  // if there is no bias: value=1, therefore centre prior on 1.
  probDist    ~ normal(1,0.7);
  rewMagDist  ~ normal(1,0.7); // values get crazy quickly, so only allow small range
  lossMagDist ~ normal(1,0.7);

  // Going through each trial (because Stan can only do exponentiation on single numbers)
  for (it in 1:ntr){
    for (io in 1:2){
      // Transform probabilities
      g = probDist[subInd[it]];
      rewProbTr[it,io]  = prob[it,io]^g/((prob[it,io]^g+(1-prob[it,io])^g)^(1/g));
      lossProbTr[it,io] = 1- rewProbTr[it,io];
      // Transform reward magnitudes
      rewMagTr[it,io]  = rewM[it,io]^rewMagDist[subInd[it]];
      lossMagTr[it,io] = lossM[it,io]^lossMagDist[subInd[it]];
      // Compute utility
      utilities[it,io] = rewProbTr[it,io]*rewMagTr[it,io] - lossProbTr[it,io]*lossMagTr[it,io];
    }
    // combine utilities and choice stochasticity
    invTxUtil[it] = invTemp[subInd[it]]*(utilities[it,1]-utilities[it,2]); 
  }
  print("rewMagDist ", rewMagDist);
  // link utilities to actual choices
  choice ~ bernoulli_logit(invTxUtil);
}
~~~~
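One way the nan can arise even without inf is the probability-weighting line, w(p) = p^g / (p^g + (1-p)^g)^(1/g): for a very large probDist g, both p^g and (1-p)^g underflow to exactly zero, and in C++ (Stan’s backend) 0.0/0.0 evaluates to nan. A minimal Python sketch (hypothetical values; g = 2000 is exaggerated purely to trigger the underflow, though the printout shows parameters do reach extreme values during warmup):

```python
p, g = 0.4, 2000.0

numerator = p ** g                                # underflows to exactly 0.0
denominator = (p ** g + (1 - p) ** g) ** (1 / g)  # also exactly 0.0
print(numerator, denominator)                     # 0.0 0.0

# Stan's C++ backend would compute 0.0 / 0.0 = nan here;
# Python raises ZeroDivisionError instead, so we only show the operands.
```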

#2

So by default the parameters are initialized uniformly between -2 and 2 on the unconstrained space. The initialization doesn’t take the priors into account, so the very first numbers are essentially random (priors aren’t special in any way in Stan – just more terms for the log density). Since you’ve got the lower=0 constraint in there, the initial parameters should land somewhere between exp(-2) and exp(2) (not uniformly – but somewhere in that range).
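That range matches your first printout. A rough Python sketch mimicking this init scheme (not Stan’s actual code):

```python
import math, random

# Sketch of the default init for a <lower=0> parameter:
# uniform on (-2, 2) on the unconstrained scale, then exp() to constrain.
random.seed(1)
unconstrained = [random.uniform(-2.0, 2.0) for _ in range(5)]
constrained = [math.exp(u) for u in unconstrained]

lo, hi = math.exp(-2), math.exp(2)  # ~0.135 and ~7.39
print([round(c, 3) for c in constrained])
print(all(lo < c < hi for c in constrained))  # True
```

Note that the initial draw in the printout, [0.391818, 6.87112, 1.86618, 4.96533, 5.44635], lies entirely inside (0.135, 7.39).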

Yeah, there’s probably something wrong. What the printout you included seems to indicate is that every time the sampler tries to take a step, it moves from its initial condition ([0.391818, 6.87112, 1.86618, 4.96533, 5.44635]) to something totally crazy ([0.390574, 0, 8.43455e+40, 0, 0]).

I don’t quite follow the model, but you’ve got the right idea for debugging. Just do more printing. Make sure to verify that the data is what you think it is – for instance, that prob actually contains probabilities (no values less than zero or greater than one, no missing-data codes, and so on).
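A quick check like this before passing the data to Stan can rule that out (prob here stands in for the ntr × 2 array in the data block; the values are made up):

```python
# Hypothetical ntr x 2 array of reward probabilities
prob = [[0.25, 0.75], [0.5, 0.5], [1.0, 0.0]]

# Collect every entry outside [0, 1] with its (trial, option) index
bad = [(i, j, p)
       for i, row in enumerate(prob)
       for j, p in enumerate(row)
       if not 0.0 <= p <= 1.0]

assert not bad, f"out-of-range probabilities at (trial, option): {bad}"
print("prob looks valid")
```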