Speeding up the pre-Stan basis computations in brms?

mike-lawrence · January 11, 2021, 8:24pm

I have a GAM of the form:

power ~ mgcv::t2(
	lat , long , time , freq , accuracy_rating
	, d = c(2,1,1,1)
	, bs = c("sos","tp","tp","tp")
	, k = rep(10,times=4)
)

The data has about 2 million rows, so running brms is taking a long time (and lots of RAM; 30% of my 380GB system) even before sampling starts, presumably as it constructs all the bases. Anyone (@ucfagls? @paul.buerkner?) have any tricks? I only see one core in use; might there be any parallelism opportunities?

ucfagls · January 12, 2021, 12:32am

You’re at the limits of what mgcv can do so I’m not expecting this to be quick at all, but…

Don’t use the thin plate spline basis for anything this large; mgcv has to form the full TPRS basis (actually on a much smaller set of observations capped by max.knots, but that is still large) and then eigendecompose it to get the 10 largest-variance basis functions you asked for. This is always going to be slow.

Instead, try the cubic regression spline basis by using bs = c("sos", rep("cr", 3)).
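A rough way to check the difference before committing to the full data is to time the basis construction on a small subsample. The sketch below is only illustrative: the simulated data, the subsample size n, the reduced k = 5, and the helper build_basis() are assumptions made for the example. It calls mgcv::smoothCon() directly, which is not necessarily identical to what brms does internally.

```r
library(mgcv)

set.seed(1)
n <- 1000  # small stand-in subsample; the real data has about 2 million rows
dat <- data.frame(
  lat = runif(n, -60, 60),       # degrees, as the "sos" basis expects
  long = runif(n, -180, 180),
  time = runif(n),
  freq = runif(n),
  accuracy_rating = runif(n)
)

# Hypothetical helper: build the t2() design matrix for a given set of
# marginal bases. k is kept small so the example runs quickly.
build_basis <- function(bs_margins, k = 5) {
  spec <- t2(lat, long, time, freq, accuracy_rating,
             d = c(2, 1, 1, 1), bs = bs_margins, k = rep(k, 4))
  smoothCon(spec, data = dat)[[1]]$X
}

t_tp <- system.time(X_tp <- build_basis(c("sos", "tp", "tp", "tp")))
t_cr <- system.time(X_cr <- build_basis(c("sos", rep("cr", 3))))

# Elapsed seconds for each choice of marginal bases
print(c(tp = t_tp[["elapsed"]], cr = t_cr[["elapsed"]]))
print(dim(X_cr))
```

At this size the gap may be small; the point is to rerun with progressively larger n and watch how each timing grows.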

Also, a smooth in 5 dimensions is going to generate a massive basis; you’re getting 10,000 basis functions here I think. There may not be a lot you can do about that if you want everything to vary spatially and in time and with each other.
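As a back-of-the-envelope check on the RAM use, the arithmetic can be written out. This assumes the design matrix is stored densely as 8-byte doubles and ignores identifiability constraints and any temporary copies, so it is an order-of-magnitude estimate, not what mgcv or brms actually allocates.

```r
n_rows <- 2e6            # about 2 million observations
k <- rep(10, 4)          # marginal basis dimensions from the t2() call

n_basis <- prod(k)       # tensor product of the four margins
bytes <- n_rows * n_basis * 8   # assumption: dense matrix of doubles
gib <- bytes / 1024^3

cat("basis functions:", n_basis, "\n")
cat("dense design matrix:", round(gib), "GiB\n")
```

That comes to roughly 150 GiB for a single dense copy, the same order of magnitude as the memory use reported above. Dropping each k from 10 to 5 would cut the column count by a factor of 16.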

What parallelism mgcv does offer is in mgcv::bam(), which IIRC doesn’t affect the standard basis construction code.
