Hi! I am new to cmdstanr and I am trying to use this package to fit my age-stratified seroprevalence data to a simple catalytic model exploiting the beta_binomial_lpmf function. However, when I run the code, the following error appears multiple times, as the beta_binomial_lpmf sometimes gives negative value, which should not be happening given my priors. I am not sure what is causing the problem then.
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Chain 3 Exception: beta_binomial_lpmf: First prior sample size parameter is -0.482536, but must be positive finite! (line 49, column 3 to column 60) Chain 3 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
The stan code I am currently using is the following one.
// sero-catalytic model
data {
int<lower=0> N_AgeG; // number of age groups
real age[N_AgeG]; // age mid-point
int N[N_AgeG]; // N individuals per age group
int NPos[N_AgeG]; // N positive per age group
}
parameters {
real <lower=0> lambda; //FOI
real <lower=0> phi; // overdispersion
}
transformed parameters {
vector[N_AgeG] z; //seroprevalence
vector[N_AgeG] a; // standard argument a of the beta function
vector[N_AgeG] b; // standard argument b of the beta function
for(k in 1:N_AgeG) {
z[k] = (1-exp(-lambda*age[k])); //seroprevalence
a[k] = z[k] * ((1/phi)- 1); //prob success
b[k] = (1 - z[k]) * ((1/phi) - 1); //prob failure
}
}
model {
lambda ~ normal(0.1, 0.1); //prior
phi ~ exponential(10);// overdispersion
// model
for(k in 1:N_AgeG){
target += beta_binomial_lpmf(NPos[k] | N[k], a[k], b[k]);
}
}
I’m not too familiar with the domain of the model you are creating, but it looks like in your transformed parameter block you are defining \alpha in such a way that allows negative values.
I wrote this short bit of R code that generates some samples from your priors and shows how negative values for \alpha are possible. I’m not sure what kind on data is in the age object, but I just used 50 as a placeholder.
lam <- rnorm(1e4, 0.1, 0.1); phi <- rexp(1e4, 10)
age <- 50
z <- 1 - exp(-lam*age)
a <- z * ((1/phi) - 1)
mean(a < 0)
Thank you for your reply! In the domain of the parameters I have specified that lamba should be positive (lower bound of zero), therefore both lambda and alpha should not assume negative values.
Thank you!
You’re right, but unfortunately the same error appears (Second prior sample size parameter is 0, but must be positive finite!) even if I use phi instead of (1/phi - 1) and give phi as a fixed value of 8. I have also increased the lower bound of lambda to 0.1 and specified it as an exponential to be sure that it could not assume any negative value or zero, but still the error appears and I am not sure what it could be causing it.
The new stan code is as follows:
data {
int<lower=0> N_AgeG; // number of age groups
real age[N_AgeG]; // age mid-point
int N[N_AgeG]; // N individuals per age group
int NPos[N_AgeG]; // N positive per age group
real phi; //where phi equal to 8
}
parameters {
real <lower=0.1> lambda; //FOI
}
transformed parameters {
vector[N_AgeG] z;
vector[N_AgeG] a; // standard argument a of the beta function
vector[N_AgeG] b; // standard argument b of the beta function
I completely missed that lower bound specification!
But it looks like now that the issue is switching over to the \beta parameter (“second prior sample size parameter”). My guess is that exp(-lambda*age[k]) is causing an underflow issue :
exp(-rexp(1e4,1)*25) %>% min()
# 2.080897e-129
And therefore (1-z[k]) becomes 0.
Maybe you’d have more luck if you change the prior back to normal(0.1,0.1)?
Thanks! Unfortunately even with normal(0.1,0.1) it gives me the same error. The lambda estimates that I obtain are quite accurate most of the time, but I cannot rely on these estimates until the error continues to appear. Moreover, the estimates are a bit underestimated when the FOI (or lambda) is supposed to be higher (i.e. high seroprevalence data) and I think it is related to the fact that when the seroprevalence z is close to 1 (because the lambda is high), then the beta goes to zero and the model does not sample any lambda value for that range.