Sum of negative binomials

Hi!

I’m using PyStan to model upvote data produced by a social media site. Specifically, the observed data is total_score, where total_score = num_upvotes - num_downvotes.

num_upvotes and num_downvotes are long-tailed, so I’m modeling them as

num_upvotes ~ neg_binomial_2(mu_up, phi_up)
num_downvotes ~ neg_binomial_2(mu_down, phi_down)
total_score = num_upvotes - num_downvotes

Because num_upvotes and num_downvotes are unobserved, I have to marginalize (sum) over the number of downvotes, truncating the sum at some reasonable value.
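Concretely, the marginalization looks roughly like this. This is a stripped-down sketch rather than my exact program: the priors are placeholders, and the data names total_score and D_max are just illustrative.

import pystan

model_code = """
data {
  int<lower=1> N;
  int total_score[N];  // observed scores; can be negative
  int<lower=0> D_max;  // truncation point; needs to be >= -min(total_score)
}
parameters {
  real<lower=0> mu_up;
  real<lower=0> mu_down;
  real<lower=0> phi_up;
  real<lower=0> phi_down;
}
model {
  // placeholder priors
  mu_up ~ lognormal(0, 1);
  mu_down ~ lognormal(0, 1);
  phi_up ~ exponential(1);
  phi_down ~ exponential(1);

  for (n in 1:N) {
    vector[D_max + 1] lp;
    for (d in 0:D_max) {
      int u;
      u = total_score[n] + d;  // upvote count implied by d downvotes
      if (u >= 0)
        lp[d + 1] = neg_binomial_2_lpmf(u | mu_up, phi_up)
                  + neg_binomial_2_lpmf(d | mu_down, phi_down);
      else
        lp[d + 1] = negative_infinity();  // impossible combination
    }
    target += log_sum_exp(lp);  // marginalize out num_downvotes
  }
}
"""

sm = pystan.StanModel(model_code=model_code)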

This works pretty well – the histogram of counts simulated from the fitted model looks close to the histogram of observed counts, parameter values seem sane, and Rhat is close to 1.

But I’m getting some warnings. First, early in warmup, I get a lot of

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: neg_binomial_2_lpmf: Precision parameter is inf, but must be finite! (in 'unknown file name' at line 79)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

I’m not exactly sure why this is happening. Also, Rhat is sometimes NaN, even though the parameter values themselves don’t seem to be NaN. I’ve tried messing around with the priors, but that doesn’t fix the problem. I’m worried that the means of the upvote and downvote negative binomials are correlated, and that that’s making the model unidentifiable, but I’m not sure how to fix it.

Sorry for the slow response.

Sounds like you’re doing everything right. Stan’s very conservative—it likes to spit out warnings rather than letting bad behavior slip by unnoticed.

That’s not surprising for hard-to-fit models like negative binomials. If the rejections really do occur only during warmup, you don’t need to worry about them. You can try setting the initial stepsize lower, which can help in the first few iterations; a sketch is below.
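In PyStan 2 that would look something like this (assuming sm and data are your compiled model and data dictionary; the particular values are just starting points to experiment with):

fit = sm.sampling(
    data=data,
    chains=4,
    control={'stepsize': 0.01,     # smaller initial leapfrog stepsize
             'adapt_delta': 0.9},  # optionally adapt more cautiously
)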

The NaN Rhat will happen if all the sampled values of a parameter are identical, i.e., the parameter doesn’t move. You get this automatically for things like the diagonal of a correlation matrix, which is fixed at 1 by construction.
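You can see why mechanically: with constant draws, the within-chain and between-chain variances are both zero, so the Rhat formula divides zero by zero. For example, with ArviZ (just one way to compute it; it accepts a plain (chain, draw) array):

import numpy as np
import arviz as az

draws = np.ones((4, 1000))  # 4 'chains' of 1,000 identical draws
print(az.rhat(draws))       # nan: zero variance within and between chains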

To test that, look at the pairs plot for the parameters involved (not all the parameters in the model). It’s a grid of scatterplots of pairs of parameters in the posterior, and it can show you whether they’re correlated. Here, mu_up and mu_down are almost certainly positively correlated given the nature of your data. A prior will weakly identify the scales of the two parameters. There’s a chapter in the Stan manual on problematic posteriors that discusses similar identifiability issues. A sketch of how to draw the plot from Python follows.
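ArviZ is one convenient route from a PyStan 2 fit (assuming fit is your fit object; a seaborn or pandas scatter matrix over fit.extract() would work just as well):

import arviz as az
import matplotlib.pyplot as plt

idata = az.from_pystan(posterior=fit)  # convert the PyStan fit
az.plot_pair(idata, var_names=['mu_up', 'mu_down', 'phi_up', 'phi_down'])
plt.show()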

Thanks so much!! All this is super-helpful. I made the pairs plots – curiously, on my data, all four of the main parameters (mu_up, mu_down, phi_up, and phi_down) are only weakly correlated (r < .1 in all cases). So I guess strong correlations aren’t the problem after all, and I won’t worry too much about it. Thanks again for all your help!