Case study on splines

Bob_Carpenter · October 14, 2017, 7:56pm

We just published a case study on splines.

Splines in Stan

http://mc-stan.org/users/documentation/case-studies/splines_in_stan.html

Milad Kharratzadeh

Abstract
In this document, we discuss the implementation of splines in Stan. We start by providing a brief introduction to splines and then explain how they can be implemented in Stan. We also discuss a novel prior that alleviates some of the practical challenges of spline models.

Andre_Pfeuffer · October 17, 2017, 2:36pm

B ← t(bs(X, knots=seq(-5,5,1), degree=3, intercept = TRUE)) # creating the B-splines
num_data ← length(X); num_basis ← nrow(B)
a0 ← 0.2 # intercept
a ← rnorm(num_basis, 0, 1) # coefficients of B-splines
Y_true ← as.vector(a0X + a%%B) # generating the output

I’m confused, why we need two intercepts: a0 and the one in calling bs?

From: https://stats.stackexchange.com/questions/7316/setting-knots-in-natural-cubic-splines-in-r

You can also specify whether you want an intercept term. Normally this isn’t specified since ns is most often used in conjunction with lm, which includes an intercept implicitly (unless forced not to). If you use intercept=TRUE in your call to ns, make sure you know why you’re doing so, since if you do this and then call lm naively, the design matrix will end up being rank deficient.

Bob_Carpenter · October 22, 2017, 2:46am

I don’t know why, but I don’t see a call to lm anywhere. Is the design matrix being produced indeed rank deficient? You can deal with that with priors, but usually you don’t want it.

It doesn’t even look like well-formed R with a0 X.

I’ll email Milad, who didn’t include his email in the case study and maybe he’ll know.

Andre_Pfeuffer · October 22, 2017, 3:02am

The call to lm is the Stan program. The bs call is just a way to create a design-matrix which can be used by lm or in a Stan program, etc. The design-matrix produced by bs is not deficient. If one adds additional rows/columns to it, it may become deficient. Another subtle problem is, that the prior of the intercept in the design-matrix, may not be normal(0, tau).
It could be normal(0, 10) or normal(theta, tau) with theta ~ normal(0, 10).

Thanks for the reply,
Andre

jroon · November 22, 2018, 4:42pm

Nice worked example to follow @Bob_Carpenter . I was wondering however can you (or Milad) share the code to plot the fitted graphs? Its not clear to me how generate the fitted curve from the posteriors. Thanks.

Guido_Biele · November 22, 2018, 5:03pm

Can’t you just extract the Y_hat values from the stanfit model, calculate the mean across samples, and plot these as the predicted values?

jroon · November 22, 2018, 5:17pm

Ehh yes, turns out thats it 🤦‍♂️
Thanks … clearly I need to take a break!

liuyanguu · June 11, 2019, 12:17am

Lots of thanks to Milad and the community for this very helpful post. I am working on implementing P-splines in RStan, P-splines as proposed by Eilers and Marx (1996) that penalize second-order differences — same as mgcv::gam(log(Y) ~ s(X, bs = “ps”). This seems to be a very common application but this post is one of the most related materials I found. I am new to the Bayesian framework and would appreciate any advice on how to do such p-spline in RStan. (Using rstanarm::stan_gamm4? It might work, but I need to figure out how to set up in Stan.) Lots of thanks in advance.

Topic		Replies	Views
Modeling continuous varible using splines in Stan General	8	1235	October 26, 2020
Splines in stan case study bug? General bug	2	843	October 24, 2017
Restricted cubic splines in Stan? Modeling	19	3930	October 15, 2021
Spline fitting demo inc comparison of sparse vs non-sparse Modeling	5	1230	February 13, 2018
Splines in matched-data analysis with rstanarm rstanarm techniques , specification	3	822	April 21, 2020

Case study on splines

Related topics