Beta regression - help with error evaluating the log probability at the initial value

JimBob · May 1, 2020, 2:47pm

Hi everyone,
I’ve had some trouble fitting a beta regression model, with an error relating to the initial value. Notably, using these same priors in similar data I haven’t had any issues, though those models involved fewer predictors. This model did eventually compute, but starts out by throwing an initialisation error. Several times it just failed to initialise at all and just didn’t sample. The error/warning message was:

Chain 4: Rejecting initial value:
Chain 4: Error evaluating the log probability at the initial value.
Chain 4: Exception: beta_lpdf: Second shape parameter[17] is 0, but must be > 0! (in ‘model20a4383764c7_3cc7b493a3f2d2c6fdea015d730f6238’ at line 58)

This offending line would appear to be:

target += beta_lpdf(Y | mu * phi, (1 - mu) * phi)

Is there are way to solve this initialisation issue? Clearly from the message, the issue is something to do with stopping the model from trying to initialise at 0, but I am wondering what to change in order that it avoids that. I am using brms rather than stan directly, so if there is a way this can be solved within brms then that would also be great!

As my first thought, I wonder if it is to do with this line regarding phi, which seems to put a lower bound of 0 on it, when phi should be positive and likely is also the shape parameter referred to in the warning:

real<lower=0> phi

However, as I usually use brms, I don’t really know exactly how to specify things in stan itself.

// generated with brms 2.12.0
functions {
}
data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector[K] b;  // population-level effects
  real<lower=0> phi;  // precision parameter
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  vector[N_1] z_1[M_1];  // standardized group-level effects
}
transformed parameters {
  vector[N_1] r_1_1;  // actual group-level effects
  r_1_1 = (sd_1[1] * (z_1[1]));
}
model {
  // initialize linear predictor term
  vector[N] mu = X * b;
  for (n in 1:N) {
    // add more terms to the linear predictor
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
  }
  for (n in 1:N) {
    // apply the inverse link function
    mu[n] = inv_logit(mu[n]);
  }
  // priors including all constants
  target += normal_lpdf(b[1] | 0 , 1);
  target += normal_lpdf(b[2] | 0 , 1);
  target += normal_lpdf(b[3] | 0 , 1);
  target += normal_lpdf(b[4] | 0 , .125);
  target += normal_lpdf(b[5] | 0 , .125);
  target += normal_lpdf(b[6] | 0 , .05);
  target += normal_lpdf(b[7] | 0 , .05);
  target += normal_lpdf(b[8] | 0 , .05);
  target += normal_lpdf(b[9] | 0 , .05);
  target += normal_lpdf(b[10] | 0 , .05);
  target += normal_lpdf(b[11] | 0 , .05);
  target += gamma_lpdf(phi | 0.01, 0.01);
  target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
  target += normal_lpdf(z_1[1] | 0, 1);
  // likelihood including all constants
  if (!prior_only) {
    target += beta_lpdf(Y | mu * phi, (1 - mu) * phi);
  }
}
generated quantities {
}

Any help would be much appreciated!

As one further note, I saw some messages in other people’s posts about similar issues indicating that their priors were not very good or ‘non-differentiable’ - is there a way to know if your priors are non-differentiable?!

FJCC · May 1, 2020, 7:31pm

Since the error mentions the second shape parameter, my guess is that (1 - mu) is zero. It is not clear whether X has been scaled to have a mean of zero and sd = 1. From the few times I have used BRMS, I thought that was the standard behavior but I could be wrong. The initial values of b will be in the range [-2, 2]. If some values in X are large enough, inv_logit(mu[n]) could be 1 and 1 - mu would then be zero.

I am just a novice with Stan, so treat my suggestions with caution.

andre.pfeuffer · May 2, 2020, 9:14am

You can try using the beta proportion distribution, 19.2 Beta Proportion Distribution | Stan Functions Reference

or as a quick fix use:

target += beta_lpdf(Y | mu * phi + 1.0e-9, (1 - mu) * phi + 1.0e-9)

You may also specify initial values to phi, thanks to @FJCC.

JimBob · May 3, 2020, 1:18pm

Thanks for your responses both of you - I will try these out tomorrow and see how it goes! Regarding your quick fix suggestion, I see the point being to just make sure these don’t evaluate to 0. Could you actually interpret the output having used this fix - as in, would the results be reliable because all you’ve done is add a tiny amount to get the iterations going, or is it just a quick way to get the estimation running that you couldn’t reliably interpret?

JimBob · May 4, 2020, 5:49pm

@andre.pfeuffer and @FJCC thanks for your suggestions. I actually think I may have managed to avoid the issues another way. I realised that for several of the predictors, the values stan would be trying to estimate would always be very low and need to be assessed at a level beyond 2 decimal places to be interpretable. This may be completely wrong but it seemed that maybe this might cause some difficulties in estimation if the parameters had to be very small and precise. In any case, I just divided the predictors by a constant so that their estimated effects would be larger. I then found that running the regression, this issue simply does not come up and all the chains run very fast, and I also get a larger nEff.

GRH15 · April 22, 2025, 11:12pm

Hi, I hope this message finds you well. A follow-up question after such a loooooong time. I have also encountered this kind of error message when trying to fit a Bayesian Beta Mixed Model using brms. I have checked that the model seems to converge: all R hats values equal to 1 and the ESS looks reasonable. But I’m not sure if this kind of error message (in my case: Chain 1 Rejecting initial value:
Chain 1 Error evaluating the log probability at the initial value.
Chain 1 Exception: beta_lpdf: Second shape parameter[2375] is 0, but must be positive finite! (in ‘/var/folders/w5/pw96qqsx30n74l9nyqc48m080000gn/T/RtmpSwZA6H/model-629d8300ba3.stan’, line 155, column 4 to column 56)) still make the model estimate unreliable. I’m still naive to Bayesian analysis so would really appreciate any thoughts and suggestions.

Bob_Carpenter · April 24, 2025, 8:06pm

That’s not uncommon.

I would suggest declaring this as

vector<lower=0, upper=1> Y;

and it will catch any values that are outside of [0, 1]. You can run into problems if you have a value on the boundary—I’m pretty sure Stan will reject those.

@andrewgelman wrote a paper on variance priors that explains in section 4.3 how this convention from BUGS can lead to unexpected behavior because it winds up exerting a pretty strong pull away from zero rather than being “uninformative” like people think it will be. In the first hierarchical model in the fifth chapter of Bayesian Data Analysis, Gelman et al. recommend a Pareto prior (though he doesn’t call it that or note that it needs a lower bound > 0 to be proper—he just derives it through a change of variables) for the total concentration phi for a beta distribution and has some example plots showing why.

Some code tips

This can be a one-liner and so can the next few things.

 vector[N_1] r_1_1 = sd_1[1] * z_1[1];

vector[N] mu = inv_logit(X * b + r_1_1[J_1] .* Z_1_1);  // .* is elementwise multiply

If you define

transformed data {
  vector[11] sigma_b = [1, 1, 1, .125, .125, 0.05, ..., 0.05];  // fill in ... with values
}

then you can just write a one-liner.

b ~ normal(0, sigma_b);

Or if for some reason you want the normalizing constants that aren’t needed for sampling,

target += normal_lpdf(b | 0, sigma_b);

Topic		Replies	Views
Rejecting initial value brms	12	5757	February 1, 2019
Beta binomial regression rejecting initial values brms fitting-issues , brms	5	126	April 16, 2025
Beta regression,Rejecting initial value,Error evaluating the log probability at the initial value Modeling fitting-issues	7	1435	April 16, 2019
Initial value problem Modeling	2	1776	January 27, 2018
Error evaluating the log probability at the initial value brms	2	1439	November 15, 2019

Beta regression - help with error evaluating the log probability at the initial value

Some code tips

Related topics