Dear all,
I am new in Bayesian statistics and I have a very simple, basic question. I’m modeling count data and I’m trying to use Bayesian statistics through brms package. My response variable has underdispersion and so “family=negbinomial()” is not adequate. In the frequentist approach, I always use the fantastic “Conway-Maxwell-Poisson” family. Can you please tell me if it is possible to use this family in the brm function and, in the affirmative case, how can I implement it?
The Conway-Maxwell-Poisson distribution is currently not supported by brms. The problem is that I haven’t found any efficient Stan implementation of this distribution yet. Once such an implementation is available, I am more than happy to implement it in brms.
Thank you very much for your attention. And Thank you very much for helping the scientific community with your fantastic package! It would be great to have the “compois” family in brms because this family is also very, very useful. I’ll be watching…
Can you please tell me what family do you recommend to use when dealing with count data with underdispersion (and not zero-inflated)?
Paul Bürkner via The Stan Forums mc_stan@discoursemail.com escreveu no dia terça, 22/01/2019 à(s) 08:35:
I have no family in brms yet which I would specifically recommend for underdispersed count data hence the need for the Conway-Maxwell-Poisson distribution.
Ok, Thank you very much for your precious help!
Paul Bürkner via The Stan Forums mc_stan@discoursemail.com escreveu no dia terça, 22/01/2019 à(s) 10:22:
The gamma-count distribution handles under-dispersion as well and can be constructed more easily. I did one time in Stan (not BRMS).
Thank you very much. Is the R code available somewhere?
Sure, see at the end. According to r - What is the appropriate model for underdispersed count data? - Cross Validated
the Generalized Poisson Distribution
also handles under-dispersion.
library(rmutil)
dgammacount
function (y, m, s, log = FALSE)
{
if (any(y < 0))
stop(“y must contain non-negative values”)
if (any(m <= 0))
stop(“m must be positive”)
if (any(s <= 0))
stop(“s must be positive”)
tmp ← ifelse(y == 0, pgamma(m * s, (y + 1) * s, 1, log.p = TRUE,
lower.tail = FALSE), log(pgamma(m * s, y * s + (y ==
0), 1) - pgamma(m * s, (y + 1) * s, 1)))
if (!log)
tmp ← exp(tmp)
tmp
}
<bytecode: 0x3269178>
<environment: namespace:rmutil>
library(rmutil)
library(rstan)
mc <- '
functions {
real gammainc(real a, real z) {
return gamma_p(a, z);
}
real gammacount(int y , real la, real a) {
if(y>0)
return gamma_p(y*a, a*la) - gamma_p((y+1)*a, a*la);
return gamma_q((y+1)*a, a*la);
}
}
model {}
'
cppcode <- stanc(model_code = mc, model_name = "gamma_inc")
## Not run:
expose_stan_functions(cppcode)
gammacount(1,1.5,0.9)
dgammacount(1,1.5,0.9)
gammacount(0,1.5,0.9)
dgammacount(0,1.5,0.9)
p<-0
for(k in 0:20) p <- p + gammacount(k,1.5,1.9)
Thank you very much for your precious help!
Are you sure that the generalized Poisson and the Gamma count distribution are the same?
Based on a quick online search I tend to think those are two different distributions.
I realize I misread your comment @andre.pfeuffer. Please disregard my former question.
I should have posted both. Note, the lambda
in poisson sense is now theta
.
The implementation follows the literature. Second, the GPD requires a constraint.
real generalized_poisson_log(int y, real lambda, real log_theta) {
real theta;
theta <- exp(log_theta);
return log_theta
+ (y - 1) * log(theta + y * lambda)
- lgamma(y + 1)
- y * lambda
- theta;
}
I’m getting a different formula for the y=0 case of the gammacount pdf:
return 1 - gamma_p(a, a*la)
which seems to be a gamma(a)
away from your version:
return gamma_q(a, a*la)
There is a lower and an upper incomplete gamma function. Notice the suffixes _p
and _q
See the integral definition in here:
https://en.wikipedia.org/wiki/Incomplete_gamma_function
Sure if we integrate from the other end, we have to use 1 - \dots
“My” version returns the same result as the R library rmutil
,
and also sums up to 1.
Sorry about that. Serves me right for letting Mathematica do my thinking for me.
Dear Paul,
Just to confirm: currently, the implementation is not available, is it?
You mean the COM-Poisson distribution? It is available, but not openly exposed ;-)
You can access it via family = brmsfamily("com_poisson")
Great!!! I will try to use it and I will let you know in case of finding troubles. Thank you very much!