# Implementing quasi-poisson error model

I’m interested in testing various models of error for some discrete count data. While i can implement the `poisson` and `negbinomial` in `brms`, I don’t see an easy way to fit an overdispersed poisson (i.e. a quasipoisson) using `brms`.

My understanding is that `x ~ qpoisson(mu, theta)` then `E(x) = mu` and `var(x) = theta mu`, but if `x ~ negbinomial(mu, shape)`, then `E(x) = mu` and `var(x) =mu + mu^2/shape` . Thus, the way to parameterize `x ~ qpoisson` is with `x ~ negbinomial(mu, shape = mu/(theta - 1))`, but now `theta` is the parameter of interest, not `shape`.

Is there an easy way of implementing this without having to define my own `family`? If not, is there a reason why this family isn’t included in `brms`?

1 Like

This is because the quasi-poisson is not a distribution in itself, rather an alternative method of parameter estimation. See this post for more information: overdispersion - Quasi-likelihood/Quasi Poisson - Cross Validated

2 Likes

Ahh… I see the problem. Thanks for the clarification.

Since I’m working with discrete data where the expected value `mu` varies with the predictor `x` and variance is proportional to `k mu(x)`, not `mu(x) + mu(x)^2/shape` as in standard binomial regression. Making the shape parameter a function of `k` and `mu(x)` allows me to get this behavior I need `shape` to vary with `mu(x)`, i.e. `shape = mu/(k-1)`

1. Is there a name for such a model? It seems like parametric version of what the quasipoisson is designed to achieve.
2. Does it already exist in `brms`?

@martinmodrak is this something you’ve worked with before?

1 Like

I had some time to work on this over the weekend and this seems to work. I’m calling it a `qqpoisson` (a quasi quasi-poisson). I plan on including some additional helper functions and then submitting it here (GitHub - paul-buerkner/custom-brms-families: R scripts for custom brms families).

``````
library(tidyr)
library(dplyr)
library(brms)

## set up function
# define custom family
custom_family(
name = "qqpoisson",
dpars = c("mu", "disp"), #expectd value and variance scale (>1)
lb = c(0, 1),
ub = c(NA, NA),
type = "int" #category of response variable
)

## Define functions used in stan
## Using `theta` as a variable gives me problems when trying to fit model, so using `disp` instead
##
stan_funs <- "
real qqpoisson_lpmf(int y, real mu, real disp) {
return neg_binomial_2_lpmf(y | mu, mu/(disp-1));
}
int qqpoisson_rng(real mu, real disp) {
return neg_binomial_2_rng(mu, mu/(disp-1));
}
"
## export stan_funs to stan
stanvars <- stanvar(scode = stan_funs, block = "functions")

``````

## Simulate data

### Define variables

theta ← 3.5 # variance scale
t_vec ← 0:5
n_vec ← 1:5; # samples/t

### Create data for linear model

data ← crossing(t = t_vec, n_vec) %>%
mutate(m = 10 + 10*t) %>%
rowwise() %>%
mutate(s = rnbinom(n = 1, mu = m, size = m/(3.5 - 1)))

## Fit linear model

fit ← brm(s ~ 1 + t,
family = qqpoisson(),
data = data,
stanvars = stanvars
)

1 Like

Just to complement: there are many ways you can derive (and parametrize) the negative binomial distribution, the one used in `brms` is just one of the more common approaches. Most related to your approach, in some papers, you see negative binomial parametrized in almost the same way as you implemented: `E(x) = mu`, `Var(X) = mu ( 1 + theta)` for `theta > 0`. This is IMHO more convenient than requiring `disp > 1` as you do in your code as with `theta > 0` we can just use the standard link functions like `log` if we want to also predict `theta`.

The nice properties of this parametrization is that:
a) `1 + theta` is the Fano factor
b) If `X1 ~ NB(mu1, theta)`, `X2 ~ NB(mu2, theta)` then `X1 + X2 ~ NB(mu1 + mu2, theta)` (in general sum of two neg. bin variables is not negative binomial itself, except when their Fano factor is identical)
c) it has connections with the Dirichlet multinomial (see conditional on the total, what is the distribution of negative binomials - Cross Validated for some of the math)

You might want to submit the distribution also to GitHub - sims1253/bayesfam

Best of luck with your model!

1 Like

Thanks for the input. I agree it would be better to use `theta` rather than my dispersal parameter of `disp = theta-1` given the properties you highlight. However, I have run into a problem with using `theta` for the dispersal term. It seems like’s a reserved term in `brms/stan` since when I tried to using `theta` in my model I got the error

Error: Currently ‘dirichlet’ is the only valid prior for simplex parameters. See help(set_prior) for more details.

So right now I’m stuck using `disp = theta` rather than `theta` explicitly.

More important, the link to your post was helpful, I especially appreciate being able to find some documentation of the NB parameterization you present (e.g. from Guimaraes and Lindrooth (2007), “… known in the econometric literature adn NEGBIN type 1 (Cameron and Trivedi, 1998).”

1 Like

Oh, you are right, `theta` is AFAIK reserved for simplices in `brms` (not in Stan).