Firstly, if the counts are reasonably large, then a (variance-stabilizing) square-root transformation may be an option. Then everything that works for normal data should just work, and you can round the back-transformed imputations to integers afterwards if you need whole counts.
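To make that concrete, here's a minimal numpy sketch (the data, the missingness pattern, and the normal-draw imputation step are all made up for illustration; any normal-data imputation method could stand in for the draw):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical count data with missing values (np.nan marks missingness).
y = rng.poisson(lam=50, size=200).astype(float)
y[rng.choice(200, size=20, replace=False)] = np.nan

# Variance-stabilizing square-root transform: for Poisson(lambda) counts,
# sqrt(y) has approximately constant variance (about 1/4) when lambda is large.
z = np.sqrt(y)

# Stand-in for any normal-data imputation method: draw from a normal fitted
# to the observed transformed values.
obs = z[~np.isnan(z)]
z_imp = z.copy()
n_miss = int(np.isnan(z).sum())
z_imp[np.isnan(z)] = rng.normal(obs.mean(), obs.std(ddof=1), size=n_miss)

# Back-transform and round to get integer-valued imputed counts.
y_imp = np.round(z_imp ** 2).astype(int)
```

In a real analysis you'd plug the transformed data into a proper multiple-imputation routine rather than a single normal draw, but the transform-impute-back-transform-round pipeline is the same.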
Secondly, there are some reasonably good implementations of a latent normal model (e.g. the Amelia R package), but for some reason count data is not that commonly covered (e.g. Amelia does not cover it, though it covers ordinal, categorical etc. really nicely). I’m not really sure there’s any real reason why a Poisson model with a latent normal random effect is particularly hard to deal with; I would have thought not.
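For clarity, this is the data-generating model I mean (a Poisson-lognormal, i.e. a Poisson outcome with a latent normal random effect on the log scale); all names and parameter values below are just illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

n_units, n_obs = 500, 4
mu = 1.0                                  # fixed intercept on the log scale
b = rng.normal(0.0, 0.5, size=n_units)    # latent normal random effect per unit

# Each unit contributes n_obs counts that share its random effect.
lam = np.exp(mu + b)[:, None]             # shape (n_units, 1), broadcasts below
y = rng.poisson(lam, size=(n_units, n_obs))

# The latent normal effect induces overdispersion relative to plain Poisson:
# for a Poisson, variance == mean; here the marginal variance exceeds the mean.
print(y.mean(), y.var())
```

Nothing about this structure looks harder to handle than the ordinal or categorical latent-normal cases that are already supported.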
Thirdly, a random effects Poisson model might be a pretty decent choice. Depending on what you mean by “large”, rstanarm can fit such a model pretty fast (as these things go with MCMC samplers, anyway). The negative binomial model (or rather, Poisson with a Gamma random effect) is usually computationally a bit trickier, but that has been worked on a good bit, e.g. Keene, O.N., Roger, J.H., Hartley, B.F. and Kenward, M.G., 2014. Missing data sensitivity analysis for recurrent event data using controlled imputation. Pharmaceutical Statistics, 13(4), pp. 258-264, or Roger, J.H., Bratton, D.J., Mayer, B., Abellan, J.J. and Keene, O.N., 2019. Treatment policy estimands for recurrent event data using data collected after cessation of randomised treatment. Pharmaceutical Statistics, 18(1), pp. 85-95; I’m also involved in another planned paper on this topic.
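The Poisson-Gamma construction behind the negative binomial can be checked with a quick simulation; the parameter values here are arbitrary, and this is just a sanity check of the mixture identity, not a fitting recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

# If lambda ~ Gamma(shape=r, scale=p/(1-p)) and y | lambda ~ Poisson(lambda),
# then marginally y is negative binomial with r "successes" and success
# probability 1-p (in numpy's parameterization).
r, p = 3.0, 0.4
n = 100_000

lam = rng.gamma(shape=r, scale=p / (1 - p), size=n)
y_mix = rng.poisson(lam)                          # Poisson-Gamma mixture draws
y_nb = rng.negative_binomial(r, 1 - p, size=n)    # direct negative binomial draws

# The two samples should agree in mean and variance up to Monte Carlo error:
# mean = r*p/(1-p), variance = r*p/(1-p)^2.
print(y_mix.mean(), y_nb.mean())
print(y_mix.var(), y_nb.var())
```

The computational trickiness in the cited papers comes from fitting this model to incomplete recurrent-event data, not from the mixture itself, which is straightforward.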
I’m a bit confused as to why a beta-binomial would be appropriate. That’s more for a binary outcome when there’s a beta-distributed random effect on the probability scale across units, is it not?
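That construction is easy to illustrate (shape parameters and trial count below are made up): each unit gets its own beta-distributed success probability, and the binomial counts drawn with those probabilities are marginally beta-binomial, i.e. overdispersed relative to a plain binomial.

```python
import numpy as np

rng = np.random.default_rng(2)

a, b, m = 2.0, 5.0, 10        # beta shape parameters; binomial trials per unit
n = 100_000

p = rng.beta(a, b, size=n)    # unit-level success probabilities
y = rng.binomial(m, p)        # binomial counts given each unit's own p

# Marginal mean is m*a/(a+b); the variance exceeds the plain binomial's
# m*pbar*(1-pbar), which is the overdispersion the beta random effect buys.
p_bar = a / (a + b)
print(y.mean(), m * p_bar)
print(y.var(), m * p_bar * (1 - p_bar))
```

So it models extra-binomial variation in a bounded number of trials, which is a different situation from an unbounded count outcome.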