Spline fitting demo inc comparison of sparse vs non-sparse

mike-lawrence · February 13, 2018, 2:59pm

After encountering data sets too hefty for Gaussian Processes, I’ve been exploring splines and thought I’d start sharing some demos here. I’m indebted to others for much of this code, but hope to have made some worthwhile/clarifying additions for other newbies like myself. First up is the attached demo of a simple scenario of a single continuous predictor that compares sampling performance between a regular representation of the splines versus a sparse representation. As noted, the sparse representation has better performance when the number of splines is above about 100; for significantly fewer splines, the sparse version can be dramatically worse, presumably due to overhead somewhere.
spline_fit.r (8.4 KB)
spline_fit.stan (1.5 KB)
spline_fit_sparse.stan (1.8 KB)

mike-lawrence · February 13, 2018, 3:04pm

Oh, and one thing on which I’d appreciate feedback is that I think in exploring this I discoveredt that the splines::bs() function (not a stan-related R package, I know) appends a column of zeros to the end of the spline matrix, so if you don’t notice and ask your model for as many coefficients as there are columns in that matrix, you end up having a final coefficient that is uninformed by any data, which presumably might slow down sampling. Anyone know why splines::bs() does this? Bug?

andre.pfeuffer · February 13, 2018, 3:22pm

I don’t wont to figure it out, since there is a better way to specify the number of knots by
using the degree of freedom parameter:

bs(x, df=4, degree=3)

martinmodrak · February 13, 2018, 3:24pm

I have encountered the splines::bs behavior as well. I have got used to stripping the last column. It has something to do with whether you include boundary points as your knots. I also like using the df parameter, but this gives you a zero row :-D

To sum:

#Has zero first row
splines::bs(0:10,knots = c(5), degree = 3)
splines::bs(0:10, degree = 3, df = 4)
#Has zero last column
splines::bs(0:10,knots = c(0, 5, 10), degree = 3)
#Has zero first row AND last column
splines::bs(0:10,knots = c(5, 10), degree = 3)

#Has no zero row/columns
splines::bs(0:10,knots = c(0,5), degree = 3)

mike-lawrence · February 13, 2018, 3:28pm

Hm, that specification yields an asymmetric set of splines:

Wouldn’t that be problematic?

andre.pfeuffer · February 13, 2018, 3:36pm

Isn’t that something like a sum-to-zero constraint? Anyway, if
you add a column of 1’s as Intercept, the design matrix has
full rank. It don’t see the problem at the moment.

rankMatrix(cbind(splines::bs(0:10, degree = 3, df = 4),1))
[1] 5
attr(,“method”)
[1] “tolNorm2”
attr(,“useGrad”)
[1] FALSE
attr(,“tol”)
[1] 2.442491e-15

Topic		Replies	Views
Case study on splines Modeling	7	2753	June 11, 2019
Modeling continuous varible using splines in Stan General	8	1300	October 26, 2020
Identifiability issues with monotone splines?! Modeling	19	2287	September 25, 2024
Question on spline versus Gaussian process Modeling	2	2024	September 8, 2020
Splines in Rstanarm General	4	898	April 23, 2021

Spline fitting demo inc comparison of sparse vs non-sparse

Related topics