Why is one parameter not sampling at all?

badmax · February 7, 2020, 3:45am

I have the model below, which I run with 4 chains. It’s a logit-normal model with normal priors on the mean coefficients and a uniform prior on the scale, y_sigma. Even after 4000 iterations with 1000 warmup iterations, the chains don’t converge at all. First I get an error message saying

Log probability evaluates to log(0), i.e. negative infinity.

Then I noticed in the diagnostic plots that while the beta_z parameters somewhat move around, y_sigma remains fixed in every sample in every chain. It’s constant at 5998. Finally I get the error that

The largest R-hat is Inf, indicating chains have not mixed.

and that every transition is divergent.

What is wrong?

data {
  int<lower = 1> N;
  int<lower = 1> M;

  vector[N] y;
  matrix[N, M] X;

  cholesky_factor_cov[M, M] prior_L;
  vector[M] prior_betas;

  int<lower = 1> test_N;
  matrix[test_N, M] test_X;
}

parameters {
  vector[M] beta_z;
  real<lower = 0> y_sigma;
}

model {
  vector[M] beta = prior_betas + prior_L * beta_z;

  beta_z  ~ normal(0, 1);
  y_sigma ~ uniform(0.5, 2.5);

  for (n in 1:N) {
    target += -pow((logit(y[n]) - (row(X, n) * beta)) / (sqrt2() * pi()), 2.0);
    target += -log(y[n] * (1 - y[n]));
    target += -log(y_sigma * sqrt(2 * pi()));
  }
}

generated quantities {
  vector[N] pred;
  vector[test_N] test_pred;

  vector[M] beta = prior_betas + prior_L * beta_z;

  for (n in 1:N) {
    pred[n] = inv_logit(normal_rng(row(X, n) * beta, y_sigma));
  }

  for (n in 1:test_N) {
    test_pred[n] = inv_logit(normal_rng(row(test_X, n) * beta, y_sigma));
  }
}

FJCC · February 7, 2020, 4:22am

I am just a novice user, certainly not an expert, so I may not be entirely correct. One problem I see is that y_sigma is declared with a lower bound of 0 and no upper bound, but the prior you put on it has support only on [0.5, 2.5]. Because the variable has a lower bound, it is log-transformed (section 10.2 of the reference manual, version 2.21) so that it is unconstrained during fitting. On that unconstrained scale it is initialized on [-2, 2], which means y_sigma itself starts somewhere in [e^{-2}, e^2], roughly [0.14, 7.39]. The initial values can therefore fall where the prior has zero probability, and bad things happen; the sampler can also wander into a zero-probability region later on. Try putting a prior on y_sigma that goes down to zero and does not have a hard upper limit. There are published recommendations for priors on scale parameters, though I cannot remember just where at the moment.
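To put a rough number on the initialization argument above, here is a short worked calculation. It assumes Stan's default initialization, which draws the unconstrained value u = log(y_sigma) uniformly from [-2, 2]; the symbol u is introduced here only for illustration.

```latex
\begin{align*}
u &\sim \mathrm{Uniform}(-2, 2), \qquad
  \texttt{y\_sigma} = e^{u} \in [e^{-2}, e^{2}] \approx [0.135,\ 7.39] \\
\Pr\left(0.5 \le e^{u} \le 2.5\right)
  &= \frac{\ln 2.5 - \ln 0.5}{2 - (-2)}
   = \frac{\ln 5}{4} \approx 0.40
\end{align*}
```

Under that assumption, about 60% of random initialization attempts would land where uniform(0.5, 2.5) has zero density, which is consistent with the log(0) message quoted in the question.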

Edit: One source of recommendations on priors
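As a minimal sketch of the suggestion above, the model could be rewritten so that the prior on y_sigma covers the whole declared range (0, infinity). Several things here are assumptions rather than anything confirmed in the thread: the half-normal prior and its scale of 1 are purely illustrative, the bounds on y are added for safety, and the sketch assumes that the pi() inside the squared term of the original likelihood was meant to be y_sigma, since the logit-normal density divides the squared residual by 2 * sigma^2. The generated quantities block is omitted for brevity.

```stan
data {
  int<lower = 1> N;
  int<lower = 1> M;
  vector<lower = 0, upper = 1>[N] y;   // assumed to lie strictly inside (0, 1)
  matrix[N, M] X;
  cholesky_factor_cov[M, M] prior_L;
  vector[M] prior_betas;
}

transformed data {
  vector[N] logit_y = logit(y);        // computed once, since y is data
}

parameters {
  vector[M] beta_z;
  real<lower = 0> y_sigma;             // declared support: (0, infinity)
}

model {
  vector[M] beta = prior_betas + prior_L * beta_z;

  beta_z ~ normal(0, 1);

  // Half-normal prior (because of lower = 0): its support matches the
  // declared constraint. The scale of 1 is an illustrative assumption.
  y_sigma ~ normal(0, 1);

  // Logit-normal log density: normal on the logit scale, with y_sigma
  // as the scale (assumed intent of the original pi() term) ...
  target += normal_lpdf(logit_y | X * beta, y_sigma);
  // ... plus the change-of-variables term -log(y * (1 - y)), which is
  // constant in the parameters and kept only for completeness.
  target += -sum(log(y) + log1m(y));
}
```

If the hard limits of 0.5 and 2.5 are really wanted, an alternative is to keep the uniform prior and declare the parameter as real<lower = 0.5, upper = 2.5> y_sigma; so that every initial value and every proposal stays inside the prior's support.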
