QUESTION: Should we continue to throw exceptions for multiplier constraints when the mulitplier is equal to zero? This can happen during warmup for models that are actually fine due to overflow to 0 on the constrained scale.
@andrewgelman would like to remove them. I bring it here for broader discussion.
@andrewgelman sent me the following example models. The first model throws exceptions during warmup when sigma_a underflows to 0, whereas the second model doesn’t throw warnings.
// model 1
data {
int N;
int J;
int<lower=1, upper=J> group[N];
vector[N] y;
real prior_mean;
real<lower=0> prior_sd;
real prior_factor;
}
parameters {
real<lower=0> sigma_y;
real<lower=0> sigma_a;
vector<multiplier=sigma_a>[J] a;
}
model {
y ~ normal(a[group], sigma_y);
a ~ normal(0, sigma_a);
sigma_a ~ lognormal(prior_mean, prior_sd);
target += prior_factor*0.5*log(sigma_a^2/(sigma_y^2*J/N + sigma_a^2));
}
The warnings are due to sigma_a
underflowing to zero during warmup:
Sampling hier_1:
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: offset_multiplier_constrain: multiplier is 0, but must be > 0! (in '/var/folders/j8/kg_4ryhs4d78nts87zqw73wh0000gn/T/RtmplZdpzz/model-2b943e635ccc.stan', line 13, column 2 to column 34)
The normal distribution would throw a warning, but Andrew’s recoded it directly and the basic C++ math functions don’t throw exceptions because the standard library versions don’t (we have been considering throwing in Stan even where the standard library doesn’t).
The following model defines the same log density but does not throw warnings.
// model 2
data {
int N;
int J;
int<lower=1, upper=J> group[N];
vector[N] y;
real prior_mean;
real<lower=0> prior_sd;
real prior_factor;
}
parameters {
real<lower=0> sigma_y;
real<lower=0> sigma_a;
vector[J] e_a;
}
transformed parameters {
vector[J] a = sigma_a*e_a;
}
model {
y ~ normal(a[group], sigma_y);
e_a ~ normal(0, 1);
sigma_a ~ lognormal(prior_mean, prior_sd);
target += prior_factor*0.5*log(sigma_a^2/(sigma_y^2*J/N + sigma_a^2));
}