Large vector size for posterior quantities in a truncated model in brms

Dear community,

I am trying to generate posterior residuals, R2, LOO R2, etc. from a truncated Poisson model. In every case the computation takes a long time, and the most disappointing moment comes when R reports that it “cannot allocate vector of size 500 Mb” (or 700 Mb) for the residuals, R2, and LOO R2.

Is there any way to compute the residuals and the other quantities from a small sample of posterior draws, to speed up the computation and reduce the vector size? For example, a small sample is used in pp_check(fit1, nsamples = 100).
My code is as follows:

library(brms)
library(dplyr)

# Transform predictors and subset to the species and status of interest
transform <- data %>% mutate(ba_sqrt = sqrt(ba_m2.p), dbh_log = log(dbh_cm.p))
d <- subset(transform, species.p %in% c("Acer platanoides", "Acer pseudoplatanus", "Fagus sylvatica", "Ulmus glabra"))
d1 <- subset(d, species.m %in% c("Acer platanoides", "Acer pseudoplatanus", "Fagus sylvatica"))
alive <- subset(d1, status.p %in% c("alive"))
alive$y <- as.factor(alive$year)

bprior1 <- c(prior_string("normal(0,2)", class = "Intercept"),
             prior(normal(1,1), class = b, coef = species.pAcerpseudoplatanus),
             prior(normal(3,1), class = b, coef = species.pFagussylvatica),
             prior(normal(1,1), class = b, coef = species.pUlmusglabra),
             prior(normal(3,1), class = b, coef = species.mAcerpseudoplatanus),
             prior(normal(2,1), class = b, coef = species.mFagussylvatica),
             prior(normal(3,1), class = b, coef = position.madjacent_gap),
             prior(normal(3,1), class = b, coef = position.mgap),
             prior(normal(2,1), class = b, coef = position.mcanopy),
             prior(normal(1,2), class = b, coef = ba_sqrt),
             prior(normal(1,2), class = b, coef = dbh_log),
             prior_(~exponential(4), class = ~sd))

fit1 <- brm(
  count | trunc(lb = 1) ~ ba_sqrt + dbh_log + species.p + species.m + position.m +
    offset(log(population.d)) + offset(log(crown_area_m2.m)) + (y | subplot_id),
  data = alive, family = poisson(link = "log"), prior = bprior1,
  save_pars = save_pars(all = TRUE),
  cores = 4, iter = 1000 + 5000, warmup = 1000, chains = 4, seed = 123,
  sample_prior = "yes", silent = TRUE, open_progress = FALSE,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)

res <- resid(fit1)[, "Estimate"]    # posterior mean residuals; renamed to avoid shadowing stats::resid()
sres <- res / sd(res)               # standardised residuals
fit <- fitted(fit1)[, "Estimate"]   # posterior mean fitted values
bayes_R2(fit1)
loo_R2(fit1, cores = 4)

Data attached: poisson.csv (749.2 KB)

  • Operating System: Windows 10
  • brms Version: 2.14.0

Sorry that your question took so long to get an answer.

Wouldn’t bayes_R2 also support the nsamples parameter? If not, you can definitely use the shredder package (https://github.com/yonicd/shredder) to create fit objects with fewer samples. However, I guess (I have very little direct experience) that loo, at least, usually requires a lot of samples to reduce uncertainty…
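A rough sketch of both routes (untested on your model; that bayes_R2 honours nsamples here, and that a brmsfit with a pruned stanfit stays fully consistent, are assumptions worth verifying):

# Route 1: many brms post-processing methods take 'nsamples'
# to limit the number of posterior draws they use.
bayes_R2(fit1, nsamples = 100)
resid(fit1, nsamples = 100)[, "Estimate"]

# Route 2: prune the underlying stanfit with shredder;
# stan_sample_n() keeps a random subset of posterior draws.
library(shredder)
fit_small <- fit1
fit_small$fit <- stan_sample_n(fit1$fit, 100)
bayes_R2(fit_small)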

Best of luck with your model!

Thank you, Martin!
When I use bayes_R2(model, nsamples = 100), the R session crashes.
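As a fallback I am considering computing R2 by hand from a small matrix of draws (a sketch following the usual per-draw Bayes R2 definition, assuming fitted() with summary = FALSE and nsamples works as documented):

mu <- fitted(fit1, summary = FALSE, nsamples = 100)  # 100 draws x N observations
y  <- alive$count
var_fit <- apply(mu, 1, var)                # variance of the fitted values, per draw
var_res <- apply(sweep(mu, 2, y), 1, var)   # variance of the residuals, per draw
r2 <- var_fit / (var_fit + var_res)
quantile(r2, c(0.025, 0.5, 0.975))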

However, I have another question. The Poisson model was overdispersed, so I tried to fit a zero-truncated negative binomial model instead. Unfortunately, the sampler failed with the error below:

Error in sampler$call_sampler(args_list[[i]]) :
  Exception: grad_2F1: k (internal counter) exceeded 100000 iterations,
  hypergeometric function gradient did not converge.
  (in 'model3b4c1d283b2b_6fff72c76beefea5038bada19420ddd0' at line 106)
[1] "error occurred during calling the sampler; sampling not done"

I ran a negative binomial model without truncation to rule out the influence of the priors; it converged without problems. Does truncation work with the negative binomial family in brms? What syntax should I use to run the model?

The model is as follows:
fit2 <- brm(
  count | trunc(lb = 1) ~ ba_sqrt + dbh_log + species.p + species.m + position.m +
    offset(log(population.d)) + offset(log(crown_area_m2.m)) +
    (1 + y | subplot_id) + (1 | dbh_class:id.m),
  data = alive, family = negbinomial(), prior = bprior1,
  save_pars = save_pars(all = TRUE),
  cores = 4, iter = 1000 + 1000, warmup = 1000, chains = 4, seed = 123,
  sample_prior = "yes", silent = TRUE, open_progress = FALSE,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)

I attached the data in the previous post. I would appreciate your help very much!

Sorry, I don’t have the time/energy to actually run the model, but some quick thoughts:

That looks like a bug worth reporting at Issues · paul-buerkner/brms · GitHub.

Truncation should, AFAIK, work with negative binomial models. This looks like Stan being unable to evaluate the gradient of neg_binomial_2_lcdf for your model. That is definitely also a bug (but in the Stan math library); it has been discussed before (`neg_binomial_2_lcdf()` "grad_2F1: k (internal counter) exceeded 100000 iterations" - #4 by sakrejda), but I guess it is not yet fixed.
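To see why the lcdf shows up at all: with count | trunc(lb = 1) the likelihood is renormalised by the probability of exceeding the bound, so every gradient evaluation touches the negative binomial CDF. A minimal illustration in R (mean/shape parameterisation as in Stan's neg_binomial_2; a sketch of the idea, not brms's exact generated code):

# log p(y | Y >= lb) = log p(y) - log P(Y >= lb);
# for lb = 1 that is log p(y) - log(1 - P(Y <= 0)),
# the term whose gradient fails inside Stan.
log_lik_trunc_nb <- function(y, mu, phi, lb = 1) {
  dnbinom(y, mu = mu, size = phi, log = TRUE) -
    pnbinom(lb - 1, mu = mu, size = phi, lower.tail = FALSE, log.p = TRUE)
}
log_lik_trunc_nb(3, mu = 2, phi = 0.5)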

Still, I think the fact that you’ve triggered a bug few people have run into might mean you are exploring some weird parts of the parameter space. Also, because you set adapt_delta = 0.99, max_treedepth = 15, I assume you had divergences without that, so there might be something wrong with your model (hard to say what; I encourage you to start a new thread to get more attention on the issue). Possibly, if you resolve the original issues, you will no longer hit the bug…
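For example, a quick check of whether the sampler actually diverges at default settings (a sketch; nuts_params() is the bayesplot generic that brms implements):

np <- nuts_params(fit1)                            # per-iteration NUTS diagnostics
sum(subset(np, Parameter == "divergent__")$Value)  # number of divergent transitions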

Best of luck with your model!

I made an issue, but yes, this is near the boundaries of the parameter space. The gradients there are around 1e7, so that could cause problems even if the calculation is expanded. (It is easy to fix by editing one of the C++ files, as described in the issue; the harder question is how to cover more of the parameter space with this calculation.)