McElreath Statistical Rethinking v2 - problem with 'ulam' function for HMC (tree depth, divergent transitions)

On the advice of some people from this forum, I watched Richard McElreath’s ‘Statistical Rethinking’ lectures, and have since been going through version 2 of his Rethinking book (https://xcelab.net/rm/sr2/). It’s a great book and I’ve been learning a ton so far, but I’ve hit the first point where the code just doesn’t seem to work for me the way it does in the book’s examples.

McElreath uses a function called ulam in his “Rethinking” package, which is an interface for doing HMC in Stan. So far all the code in the book has worked perfectly, but on the first example that uses ulam / Stan (rather than quadratic approximation) I receive multiple warnings, the sampling takes ages, and the output indicates the sampling has not worked properly. This is clearly supposed to be quite a simple model, as it is the first one introduced for Stan. Does anyone know of effective solutions to the issues below? I am also wondering whether there are some settings in Stan that I should change, which might have been missed by McElreath, in order to make a model like this work. Any help would be much appreciated.

The code is as follows (from http://xcelab.net/rmpubs/sr2/code.txt):
## R code 9.9 - loads the relevant data from the rethinking package and standardises variables of interest

library(rethinking)
data(rugged)
d <- rugged
d$log_gdp <- log(d$rgdppc_2000)
dd <- d[ complete.cases(d$rgdppc_2000) , ]
dd$log_gdp_std <- dd$log_gdp / mean(dd$log_gdp)
dd$rugged_std <- dd$rugged / max(dd$rugged)
dd$cid <- ifelse( dd$cont_africa==1 , 1 , 2 )

# code 9.10 runs the same model with quadratic approximation

## R code 9.11 - creates a more slimmed down list of variables of interest
dat_slim <- list(
    log_gpd_std = dd$log_gdp_std,
    rugged_std = dd$rugged_std,
    cid = as.integer( dd$cid )
)
str(dat_slim)

## R code 9.12
m9.1 <- ulam(
    alist(
        log_gdp_std ~ dnorm( mu , sigma ) ,
        mu <- a[cid] + b[cid]*( rugged_std - 0.215 ) ,
        a[cid] ~ dnorm( 1 , 0.1 ) ,
        b[cid] ~ dnorm( 0 , 0.3 ) ,
        sigma ~ dexp( 1 )
    ) ,
    data=dat_slim , chains=1 )

## R code 9.13

precis( m9.1 , depth=2 ) # summarises the model

When I run the ulam function, I receive the following warnings:
Warning messages:
1: There were 184 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 15. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
2: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
3: Examine the pairs() plot to diagnose sampling problems

I’ve tried both specifying a larger max_treedepth and raising adapt_delta when a warning about it came up. Neither change (nor both in combination) got the model to sample effectively; I end up with effective sample sizes as low as 5 or so (example code below):

m9.1a <- ulam(
    alist(
        log_gdp_std ~ dnorm( mu , sigma ) ,
        mu <- a[cid] + b[cid]*( rugged_std - 0.215 ) ,
        a[cid] ~ dnorm( 1 , 0.1 ) ,
        b[cid] ~ dnorm( 0 , 0.3 ) ,
        sigma ~ dexp( 1 )
    ) ,
    data=dat_slim , chains=1 , control = list(max_treedepth = 15, adapt_delta = 0.95) )

I’ve also attached a pairs plot, in case it helps:
[pairs plot image]
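For reference, a minimal sketch of how the summary and diagnostic plots can be generated (precis and traceplot are the rethinking package’s methods for ulam fits; the underlying stanfit object lives in the @stanfit slot, so rstan’s pairs method works on that):

precis( m9.1a , depth=2 )      # per-parameter summaries, including n_eff and Rhat
traceplot( m9.1a )             # trace plots for each chain
pairs( m9.1a@stanfit )         # pairwise posterior plot via rstan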

Any help would be greatly appreciated!

macOS Sierra 10.12.6
RStudio Version 1.2.1335
R 3.5.3

I think this example was primarily intended to illustrate the concept of these varying-intercept, varying-slope models. But in this case, standardizing the variables this way makes it difficult for Stan to draw from the posterior distribution efficiently. @richard_mcelreath might have some suggestions.

Also, the package has its own GitHub repo with the usual issue trackers, etc. You might want to report there if you think it’s a bug.

Thanks for your responses @Bob_Carpenter and @bgoodri - perhaps I will post a message on the GitHub repo, but I don’t feel it is a bug, as Richard proceeds to use the same function throughout the rest of the book without issue. Are there any packages, or ‘masking’ from other packages, that you know of which might interfere with the proper functioning of Stan? Or are there some checks that can be done to confirm that what has been installed will allow Stan to run as intended? I followed the installation instructions on the Stan site exactly and carefully.
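(For anyone with the same question: RStan’s Getting Started guide suggests verifying the toolchain by compiling and sampling the example model bundled with the package; if this runs cleanly, the installation is working.)

example(stan_model, package = "rstan", run.dontrun = TRUE)   # compiles and samples the bundled example model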

@bgoodri yes, Richard does go on to introduce multilevel models, but I don’t think that is what he is trying to do at this point, and he very explicitly and purposely presents the standardisation of the variables as something that should be done before running the Stan model.

I don’t think it is a bug in the software, but dividing by the mean or maximum can create difficult geometry and in this case it does.
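For comparison, conventional z-scoring centres each variable at zero with unit scale. A minimal sketch of that alternative (hypothetical here, not what the book uses; the priors in m9.1 are calibrated to the book’s scaling and would need adjusting):

# hypothetical alternative scaling: mean 0, standard deviation 1
dd$log_gdp_z <- as.numeric( scale(dd$log_gdp) )
dd$rugged_z  <- as.numeric( scale(dd$rugged) )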

Thanks @bgoodri. I will try running the code without the standardisation, but I don’t think this should be at the root of the problem, because it apparently works without any issue when performed this way in the book.

Just a typo. In the ‘dat_slim’ definition:
log_gpd_std = dd$log_gdp_std,
should be:
log_gdp_std = dd$log_gdp_std,
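With that fix, the list name matches the one used in the model formula:

dat_slim <- list(
    log_gdp_std = dd$log_gdp_std,
    rugged_std = dd$rugged_std,
    cid = as.integer( dd$cid )
)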

Regards.

@Miguel_Prol oh my goodness! Thank you! I was very confused, because I continued with the book and subsequent models were working perfectly. I could not understand why the most basic one using that function would not work.

The book was incredibly helpful for building the underlying understanding, but in any case I am now using brms rather than the functions presented in the book.
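For anyone making the same switch, here is a minimal brms sketch of m9.1. The coefficient names (cid1, cid2, cid1:rugged_c, etc.) are my assumption about how R names the design-matrix columns for this formula; check get_prior() on the formula before fitting:

library(brms)

dd$cid <- factor(dd$cid)
dd$rugged_c <- dd$rugged_std - 0.215   # centre ruggedness as in the book

m9.1_brms <- brm(
    log_gdp_std ~ 0 + cid + cid:rugged_c,   # separate intercept and slope per continent index
    data = dd , family = gaussian(),
    prior = c(
        set_prior( "normal(1, 0.1)" , class="b" , coef="cid1" ),
        set_prior( "normal(1, 0.1)" , class="b" , coef="cid2" ),
        set_prior( "normal(0, 0.3)" , class="b" , coef="cid1:rugged_c" ),
        set_prior( "normal(0, 0.3)" , class="b" , coef="cid2:rugged_c" ),
        set_prior( "exponential(1)" , class="sigma" )
    ),
    chains = 4 )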