Computational time of a (fairly complex) GAM with ARMA structure in brms

Dear all,

I am fitting a model for time-series analysis of Wikipedia views with Stan, through the 'brms' package.
I came up with a pretty good distributional model, which nevertheless requires a first-order ARMA adjustment, as the observations exhibit a slight first-order autocorrelation in time.
The model is not the easiest one: it contains two splines modeling the month and year effects on the response variable, and it also accounts for a linear change over time in the zero-inflation and shape parameters. The code below is the one I am using on a Dell M4800, running 6 MCMC chains on 6 different cores.

mod <- brm(bf(views ~ s(month) + s(year) + period, zi ~ year, shape ~ year),
           family = zero_inflated_negbinomial(link = "log", link_shape = "log",
                                              link_zi = "logit"),
           autocor = cor_ar(form = ~ time, p = 1, cov = TRUE),
           chains = 6, cores = 6, iter = 5000, warmup = 1000,
           thin = 10, refresh = 0, control = list(adapt_delta = 0.999, max_treedepth = 15),
           data = wpd)

The model without the autocorrelation structure works well, although fitting is a bit time-consuming. However, fitting the model with the ARMA structure takes far too long: I ran it on a workstation with approximately 24 cores (after adjusting the number of chains accordingly) and after about a week it still hadn't finished.
I would like to ask you:

  • if such a long timespan for model fitting is normal

  • if there is some gross misspecification that I did not notice in the code

  • if there is something I could do to improve the speed of model fitting. I imagine that some improvement might be obtained by specifying priors, but I have no idea how this can be done in a nonlinear setting (any documentation would be appreciated).

Any help would be greatly appreciated. Find attached a reproducible dataset: DatasetBRMSGitHub.csv (5.1 KB).

Thanks,

Jacopo Cerri


Hi, when you talk about your cores, do you always run the same number of chains? More than four are rarely needed, and it might be better (someone else would have to comment on your model) to use map_rect in Stan to split each chain across several cores. Maybe @wds15 could answer your questions then?
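For what it is worth, I believe more recent brms versions expose Stan's within-chain parallelization directly, so you would not have to write map_rect code yourself. A minimal sketch, assuming a recent brms with the cmdstanr backend installed (I have not checked whether it plays well with the ARMA term):

# Sketch only: within-chain parallelization in recent brms,
# assuming the cmdstanr backend is installed and configured.
# 4 chains x 2 threads each = 8 cores in total.
library(brms)

mod <- brm(
  bf(views ~ s(month) + s(year) + period, zi ~ year, shape ~ year),
  family  = zero_inflated_negbinomial(),
  data    = wpd,
  backend = "cmdstanr",
  chains  = 4, cores = 4,
  threads = threading(2)   # 2 threads per chain
)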

Hi,
yes, I usually run 6 chains on 6 cores, or, more often, 4 chains on 4 cores. I do not know whether brms actually allows splitting each chain across more than one core; I think this is not possible, as I read in the manual that the number of cores should equal the number of chains. On the workstation I run about 20 chains on 20 cores.
On my laptop, 4 chains on 4 cores were also desperately slow at fitting the model (after 4 days it still hadn't finished)…

Jacopo

I think cor_ar will build the whole correlation/covariance matrix by default. This can be very, very costly. (How large is time?) If you are willing to code your model in Stan directly, there are some tricks that will make the model more efficient.
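If you do decide to go down that road, one possible starting point (just a sketch, reusing the formula from your post) is to let brms generate the Stan code and data for you with make_stancode() / make_standata(), and then hand-edit the autocorrelation part:

# Sketch: extract the Stan code and data brms would use, as a starting
# point for hand-coding a more efficient AR(1) formulation.
library(brms)

bform <- bf(views ~ s(month) + s(year) + period, zi ~ year, shape ~ year)

scode <- make_stancode(bform, data = wpd,
                       family  = zero_inflated_negbinomial(),
                       autocor = cor_ar(form = ~ time, p = 1, cov = TRUE))
sdata <- make_standata(bform, data = wpd,
                       family  = zero_inflated_negbinomial(),
                       autocor = cor_ar(form = ~ time, p = 1, cov = TRUE))

# scode can then be edited and fit with, e.g.,
# rstan::stan(model_code = scode, data = sdata)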

Hi, there are around 142 different time steps. I would rather keep working in brms, as I am more comfortable with its easier syntax (although it is less flexible).

Jacopo

Hey! That’s okay. ;)

Hm, would it be admissible to approximate the AR(1) process as another smooth function? Basically adding s(time)? Other than that, I have no idea what else could be done in brms (but, I’m no expert regarding brms!).
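Concretely, I am thinking of something along these lines (a sketch only; you would have to check with pp_check() and residual plots whether a smooth of time actually absorbs the autocorrelation):

# Sketch: drop cor_ar() and let a smooth of time soak up the temporal
# autocorrelation instead.
library(brms)

mod_smooth <- brm(
  bf(views ~ s(month) + s(year) + s(time) + period, zi ~ year, shape ~ year),
  family  = zero_inflated_negbinomial(),
  data    = wpd,
  chains  = 4, cores = 4, iter = 5000, warmup = 1000,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)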

Also, from my experience, the negative binomial distribution is often time-consuming to fit… :/