gkreil
June 18, 2021, 1:09pm
1
Hi,
I’ve come across this large difference in sampling behavior when fitting:
g_size <- c(7,6,30)
r_nb <- c(1.8, 2, 2.2) - .2
p_nb <- c(.4, .4, .4) + 0.1
obs <- unlist(mapply(FUN = rnbinom, g_size, r_nb, p_nb))
g_id = unlist(mapply(rep, 1:3, g_size))
df <- data.frame(y = obs, id = g_id)
barplot(with(df, table(id, y)))
brm(bf(y ~ id, shape ~ 1, family = negbinomial),
df, chains = 1)
and
brm(bf(y ~ id, shape ~ 0, family = negbinomial),
df, chains = 1)
The former uses the neg_binomial_2 likelihood, the latter neg_binomial_2_glm. The latter converges without problems, while the former requires a large max_treedepth and adapt_delta and still returns hundreds of divergent transitions. I'd eventually like to fit a location-scale model, so I need to get the neg_binomial_2 model working. Any suggestions on where to start looking for solutions?
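For reference, the max_treedepth and adapt_delta settings mentioned above are passed through brm()'s control argument (this is the standard brms interface; as the thread later shows, raising them does not actually resolve the divergences here):

```r
library(brms)

# raise the sampler's target acceptance rate and tree depth limit
fit <- brm(
  bf(y ~ id, shape ~ 1, family = negbinomial),
  data = df, chains = 1,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)
```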
Unfortunately I'm not able to replicate the behaviour with the example you posted; see the reprex below:
g_size <- c(7,6,30)
r_nb <- c(1.8, 2, 2.2) - .2
p_nb <- c(.4, .4, .4) + 0.1
obs <- unlist(mapply(FUN = rnbinom, g_size, r_nb, p_nb))
g_id = unlist(mapply(rep, 1:3, g_size))
df <- data.frame(y = obs, id = g_id)
barplot(with(df, table(id, y)))
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.15.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#>
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#>
#> ar
former = brm(bf(y ~ id, shape ~ 1, family = negbinomial),
df, chains = 1)
#> Compiling Stan program...
#> Start sampling
#>
#> SAMPLING FOR MODEL '627031144159c16cf193fb304e147739' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 3.6e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.076959 seconds (Warm-up)
#> Chain 1: 0.090228 seconds (Sampling)
#> Chain 1: 0.167187 seconds (Total)
#> Chain 1:
latter = brm(bf(y ~ id, shape ~ 0, family = negbinomial),
df, chains = 1)
#> Compiling Stan program...
#> Start sampling
#>
#> SAMPLING FOR MODEL 'd123f231289692866b1610709a7e6d87' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 2.5e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.057027 seconds (Warm-up)
#> Chain 1: 0.053828 seconds (Sampling)
#> Chain 1: 0.110855 seconds (Total)
#> Chain 1:
summary(former)
#> Family: negbinomial
#> Links: mu = log; shape = log
#> Formula: y ~ id
#> shape ~ 1
#> Data: df (Number of observations: 43)
#> Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup samples = 1000
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 0.61 0.59 -0.60 1.72 1.00 736 714
#> shape_Intercept 0.72 0.47 -0.09 1.77 1.00 640 418
#> id 0.09 0.22 -0.33 0.53 1.00 847 719
#>
#> Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
summary(latter)
#> Family: negbinomial
#> Links: mu = log; shape = log
#> Formula: y ~ id
#> shape ~ 0
#> Data: df (Number of observations: 43)
#> Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup samples = 1000
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 0.64 0.64 -0.55 1.92 1.00 1059 727
#> id 0.08 0.24 -0.38 0.54 1.00 1102 517
#>
#> Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
Created on 2021-06-18 by the reprex package (v2.0.0)
gkreil
June 22, 2021, 6:12am
3
gkreil:
brm(bf(y ~ id, shape ~ 0, family = negbinomial),
df, chains = 1)
Hi Andri,
I should have kept the seed I used for the simulation, but I couldn't reproduce the problem by repeating it either.
I did manage to reproduce it with a different parameterization, though:
set.seed(21)
g_size <- c(7,6,30)
r_nb <- c(5.4, 8, 10.4)
p_nb <- c(.7, .7, .7)
obs <- unlist(mapply(FUN = rnbinom, g_size, r_nb, p_nb))
g_id = unlist(mapply(rep, 1:3, g_size))
df <- data.frame(y = obs, id = g_id)
# barplot(with(df, table(id, y)))
brm(bf(y ~ id, shape ~ 1, family = negbinomial),
df, chains = 1)
brm(bf(y ~ id, shape ~ 0, family = negbinomial),
df, chains = 1)
This gives post-warmup samples that either hit max_treedepth or are divergent transitions. Increasing max_treedepth eventually leads to a very large number of divergent transitions and large Rhat values.
gkreil
September 13, 2021, 8:25am
4
@andrjohns would you mind taking another look at this one? I’m stuck here otherwise.
This behaviour appears to be due to a bug in the older version of the negbinomial family in rstan 2.21. When running with the cmdstanr backend there are no sampling issues.
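A quick way to see the version gap between the two backends (this assumes the cmdstanr package and a CmdStan installation are present):

```r
# Stan version bundled with the CRAN rstan package
# (rstan was pinned to Stan 2.21 for a long time)
packageVersion("rstan")

# Stan version used by the cmdstanr backend,
# typically a much newer release
cmdstanr::cmdstan_version()
```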
gkreil
September 23, 2021, 8:44am
6
So for anybody reading, options(brms.backend = "cmdstanr") does the trick. Thanks @andrjohns
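To spell out the fix: brms can be switched to the cmdstanr backend either globally or per call (both require the cmdstanr package and a CmdStan installation):

```r
library(brms)

# globally, for all subsequent brm() calls
options(brms.backend = "cmdstanr")

# or per call, via brm()'s backend argument
fit <- brm(bf(y ~ id, shape ~ 1, family = negbinomial),
           data = df, chains = 1, backend = "cmdstanr")
```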