Modeling count data with known measurement error

Hi there,

I have median, upper, and lower estimates for count data, and I want to incorporate the known error in the measured outcome (which I can derive from the upper and lower estimates) into my model. Conceptually, it looks similar to what I would do for a meta-analysis (https://mc-stan.org/docs/2_18/stan-users-guide/meta-analysis.html). However, I don’t know how this works with a non-normal outcome. In my case, the distribution of the outcome looks negative binomial.

Any ideas how I could specify the model to take the known error in the count data into account?

Thanks so much in advance!

If you can formulate your problem in terms of a mean and a variance, as you would with a normal distribution, but have (objective or subjective) reasons to assume a negative binomial likelihood instead, you can use a parameterization of that distribution in terms of mean and variance rather than probability of success/failure and number of failures/successes (i.e. dispersion).

Stan doesn’t seem to have that parameterization, but you can use the “alternative parameterization” (neg_binomial_2), which uses mean and dispersion, and use the transformed parameters block to write the variance in terms of the dispersion (or vice versa). The expressions for the mean and variance in terms of the other parameters are written out in the Stan documentation for the distribution.
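
For reference, here is a sketch of that conversion, assuming Stan’s `neg_binomial_2` with mean $\mu$ and dispersion $\phi$ (the symbol $\sigma^2$ for the target variance is just my notation, not something from the Stan docs):

$$
\operatorname{Var}(y) = \mu + \frac{\mu^2}{\phi}
\quad\Longrightarrow\quad
\phi = \frac{\mu^2}{\sigma^2 - \mu}, \qquad \sigma^2 > \mu .
$$

Note the constraint: a negative binomial always has variance greater than its mean, so a target variance at or below the mean has no valid $\phi$.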

Is that what you are looking for?


Thanks for your help! Yes, this is similar to what I ended up doing. I modelled the log of the outcome using a normal distribution with measurement error, and recovered the outcome on the original scale in generated quantities by exponentiating the estimated log outcome. Your suggestion would also work and is probably even better: use the mean and standard deviation of the measurement error to compute the dispersion parameter, and model the outcome with a negative binomial.
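
In case it helps anyone else, here is a rough Python sketch of the two preprocessing steps mentioned here, outside of Stan. It rests on assumptions that are not stated in this thread: that the lower and upper estimates are the bounds of a central 95% interval, and that the median can stand in for the mean. The function names `log_scale_sd` and `neg_binomial_phi` and the example numbers are made up for illustration.

```python
import math

# Assumption: the lower/upper estimates bound a central 95% interval.
Z_95 = 1.959964


def log_scale_sd(lower, upper, z=Z_95):
    """SD of log(outcome), assuming the interval is symmetric on the log scale."""
    if not 0 < lower < upper:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z)


def neg_binomial_phi(mean, sd):
    """Dispersion phi matching a mean and SD, using Var = mean + mean^2 / phi."""
    var = sd ** 2
    if var <= mean:
        raise ValueError("negative binomial needs variance > mean")
    return mean ** 2 / (var - mean)


# Made-up numbers, purely for illustration.
median, lower, upper = 50.0, 30.0, 80.0

# Measurement SD for the "normal on the log scale" approach.
sd_log = log_scale_sd(lower, upper)

# Rough SD on the count scale, then the matching dispersion.
# Using the median in place of the mean is an approximation.
sd_nat = (upper - lower) / (2 * Z_95)
phi = neg_binomial_phi(median, sd_nat)

print(f"log-scale SD: {sd_log:.3f}")
print(f"phi: {phi:.2f}")
```

The resulting `sd_log` would be passed to Stan as data for the measurement-error model on the log scale, while `phi` (or the SD it was derived from) would feed the negative binomial version. If the interval is something other than 95%, the `z` value needs to change accordingly.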
