functions {
  // user-defined replacement for the built-in gamma_lcdf
  real my_gamma_lcdf(real x, real a, real b) {
    return -lgamma(a) + log(gamma_p(a, b * x));
  }
}
data {
  int<lower=0> Nseen;
  array[Nseen] real<lower=11> y; // observed values, all above the censoring point
  int<lower=0> Nmiss;            // number of censored values
}
parameters {
  real<lower=0> mu;
  real<lower=0> sigma;
}
transformed parameters {
  // mean/sd parameterization mapped to gamma shape/rate
  real<lower=0> a = square(mu / sigma);
  real<lower=0> b = mu / square(sigma);
}
model {
  mu ~ gamma(10, 0.6);
  sigma ~ exponential(0.3);
  y ~ gamma(a, b);
  // target += gamma_lcdf(11 | a, b) * Nmiss;
  target += my_gamma_lcdf(11 | a, b) * Nmiss;
}
I could be off-base here, but I think that where you have the user-defined `my_gamma_lcdf` in place of the built-in `gamma_lcdf`, you instead need to retain the normalizing constant. Otherwise you're arbitrarily re-weighting the observed versus censored components of the likelihood, right?
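If I'm reading the UDF correctly, relative to the built-in `gamma_lcdf` the censored term is shifted by

$$
\Delta = -N_\text{miss} \, \log \Gamma(a),
$$

which depends on $a$, so it's not an ignorable additive constant.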
As I understand the manual, we're treating the censored data like missing values and integrating over them. So for $y > 11$ (in my example 11 is the censoring value) we use $\Pr(y) = \text{gamma}(y \mid a, b)$, and for $y < 11$ we use the probability of having missed that many values of $y$ given $a$ and $b$. That probability is $\text{gamma\_CDF}(11 \mid a, b)^n$ if you have $n$ missing values.
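Written out, I believe the log-likelihood the model targets is

$$
\log L(a, b) = \sum_{i=1}^{N_\text{seen}} \log \text{gamma}(y_i \mid a, b) + N_\text{miss} \, \log \text{gamma\_CDF}(11 \mid a, b).
$$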
Is that the same as what you are describing? I’m not sure what you mean by “retain” in this context.
The model is written with the censoring integrated out, since it’s known how many points are censored and where the censoring point is. This seems like a straight copy of the manual’s censoring model, replacing normal with gamma.
@spinkney that was indeed the intention, just to take the manual’s censoring example and change the distribution. @hhau, you’re correct that censored data would have every $y$ less than 11 measured as exactly 11. But then, if I understand correctly, we would count all the 11s and then apply this same model with Nseen values of $y > 11$ and Nmiss values of $y \leq 11$.
Yes, and I see that the lcdf function calls gamma_p, so I think it’s this function which is problematic, given that I can write the lcdf in a UDF that calls gamma_p explicitly and the problem persists.
For completeness, the truncated version of this model is considerably slower than the censoring version:
data {
  int<lower=0> Nseen;
  array[Nseen] real<lower=11> y; // truncated observations, all above 11
}
parameters {
  real<lower=0> a;
  real<lower=0> b;
}
model {
  a ~ normal(20, 10);
  b ~ std_normal();
  for (ii in 1:Nseen) {
    y[ii] ~ gamma(a, b) T[11, ];
  }
}
presumably because the `T[11, ]` truncation gets translated into an LCCDF evaluation and sits inside the loop.
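A minimal sketch (my own rearrangement, assuming all observations share the single truncation point 11) that pulls the correction out of the loop: each truncated density divides by $1 - F(11 \mid a, b)$, so the adjustment can be applied once for all Nseen observations:

model {
  a ~ normal(20, 10);
  b ~ std_normal();
  // vectorized likelihood; one truncation correction for all Nseen points
  target += gamma_lpdf(y | a, b) - Nseen * gamma_lccdf(11 | a, b);
}

This should match the looped `T[11, ]` version exactly, but with one lccdf evaluation per leapfrog step instead of Nseen.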
It is also interesting to look at the log_prob values directly:
param_grid <- expand.grid(
  mu = seq(from = 10, to = 60, length.out = 2e3),
  sigma = seq(from = 0.01, to = 25, length.out = 2e3)
)
# look at the log-posterior surface
log_probs <- sapply(1:nrow(param_grid), function(ii) {
  rstan::log_prob(
    cens_res,
    upars = rstan::unconstrain_pars(
      cens_res,
      list(
        mu = param_grid$mu[ii],
        sigma = param_grid$sigma[ii]
      )
    )
  )
})
yields:
Error in object@.MISC$stan_fit_instance$log_prob(upars, adjust_transform, :
Exception: grad_reg_inc_gamma: k (internal counter) exceeded 100000 iterations, gamma function gradient did not converge. (in 'anon_model', line 21, column 2 to column 42)
Quick update, all: the gamma_p function is regularized, which I missed originally. The issue is resolved if you use
real my_gamma_lcdf(real x, real a, real b) {
  return log(gamma_p(a, b * x));
}
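For reference, the regularized lower incomplete gamma function already carries the $\Gamma(a)$ normalization and is exactly the gamma CDF:

$$
P(a, b x) = \frac{\gamma(a, b x)}{\Gamma(a)} = F_{\text{Gamma}}(x \mid a, b),
$$

so subtracting a further $\log \Gamma(a)$ double-counts the normalizing constant (and produces the $a$-dependent offset noted above).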
For example,
functions {
  real my_gamma_lcdf(real x, real a, real b) {
    return log(gamma_p(a, b * x));
  }
}
data {
  int<lower=0> Nseen;
  int<lower=0> Nmiss;
  real<lower=0> L;              // censoring point
  array[Nseen] real<lower=L> y; // observed values, all above L
}
parameters {
  real<lower=0> a;
  real<lower=0> b;
}
model {
  y ~ gamma(a, b);
  target += my_gamma_lcdf(L | a, b) * Nmiss;
}
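In case it's useful, a minimal R sketch (with made-up true values) for simulating left-censored gamma data in the shape this model expects:

set.seed(1)
N <- 500
a_true <- 20  # shape, hypothetical
b_true <- 0.8 # rate, hypothetical
L <- 11       # censoring point

y_all <- rgamma(N, shape = a_true, rate = b_true)

# values at or below L are censored: keep only their count
stan_data <- list(
  Nseen = sum(y_all > L),
  y = y_all[y_all > L],
  Nmiss = sum(y_all <= L),
  L = L
)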