There are some misunderstandings here.
mgcv doesn’t do knot selection or counting, whether you use GCV smoothness selection or REML/ML. If you don’t specify k
, the default of -1
means the basis mgcv creates will take the defaults for whatever smoother you request. For s(x)
, this will be a thin plate regression spline (TPRS) with k = 10
, but you end up with, by default, 9 basis functions k-1
because of the identifiably constraints that are applied. You’ll get a different number of basis functions depending on the options passed to s()
or te()
, such as bs
for the smoother type, m
which controls things like the order, etc.
In the TPRS default, Simon developed a low-rank version which preserves many of the good properties of thin plate splines without needing to have one basis function per unique data/covariate value. What Simon does is create the full basis, subject the basis to an eigen decomposition, and then selects the k
(or k-1
for identifiability constraints) first eigenvectors, such that the new basis is of the requested dimension whilst preserving as much of the full-rank information in the basis as possible.
With other bases, like the CRS (bs = ‘cr’
), k
controls the number of knots, whose locations are given by the boundaries of the covariate data and evenly in between, unless specified by the user.
What mgcv does do is penalize the coefficients for the basis functions via a wiggliness penalty. So you might start with 9 TPRS basis functions (given the defaults for s()
with a 2nd order penalty), but the smoothness parameter controls how much of the wiggliness in the basis expansion you end up using, and the EDF of the resulting smoother will typically be lower than the maximum possible (9 by default). The smoothness parameter (which controls how much the penalty applies) is what is determined by GCV or REML/ML, but note you should not use GCV by default for various reasons, including the under smoothing that it is prone to.
In summary, you do have to choose the dimensionality of the basis expansion you want to use. mgcv will apply identifiability constraints so you may end up with fewer than the k=10
basis function by default. A smoothness penalty penalizes the coefficients for the basis functions so the resulting smoother may use fewer EDFs than implied by the number of basis functions.
My point above was that setting k = 20
would result in the model using 19 basis functions for that smoother, whereas the code the OP showed would have resulted in only 9 basis functions being used.
I didn’t mention this earlier, as the OP was complaining about speeding up fitting and I assumed they actually meant sampling. However, setting up the TPRS basis can take a significant amount of time with large n, because the creating the all the basis functions for the full TPRS basis and then eigen decomposing it can be costly. In such situations, you lose little by setting bs = ‘cr’
for example in the s()
call.
So far I haven’t even mentioned brms. As brms uses mgcv::smoothCon()
to set up the basis for each s()
, everything I say above applies to GAMs fitted using brms. Even the it about smoothness parameters, which are proportional to the inverse of the variance parameter of the random effect representation of the smoother that is how brms fits the models. As such, brms fits exactly the same model as mgcv::gam()
, except for the priors on parameters (implied in the case of mgcv). brms does do smoothness selection because it fits a penalized spline model - the variance parameter(s) associated with a smooth are related to the smoothness parameter(s) that mgcv selects. mgcv just happens to be ridiculously fast because of the work that Simon and other have put into developing the bespoke algorithms that are implemented in mgcv. GAMs are complex models and they are relatively slow to fit using MCMC.
One thing you can do to help in model fitting, if mgcv can fit the model you want, is prototype the model using mgcv and work out how big you need k
to be using the tools mgcv provides. Then switch the brms and fit the models with the values for k
for each smooth that you identified as being required from the mgcv fits. Or, you could just think about how much wiggliness (in terms of degrees of freedom) you expect each smooth relationship to possess, and set k
to that number of degrees of freedom plus a bit; you want to make sure the basis expansion either contains the true function or a close approximation to it, so if you expect a ~20 df smooth relationship, you might want to set k = 30
to that the basis expansion hopefully contains a function in the span of the basis that either is the true function or closely approximates it.