Modeling continuous variable using splines in Stan

stan_beginer · October 24, 2020, 7:09pm

Hi,

Can Stan use splines in the modeling process (e.g., spline functions of x to model a continuous y)? If so, are there any examples? For example, how should we assign priors to the spline function?

Thx!

mike-lawrence · October 25, 2020, 6:39pm

Yup, for sure. Here’s a tutorial on using splines in brms. That tutorial doesn’t go into detail on the default priors used, but remember that when doing splines, the thing you’re estimating is the weight (or contribution) of each spline as they are summed to produce the final wiggly curve. One rarely has much prior information about the contribution of any single spline, so a prior on each weight that is centered on zero and wide enough to cover the variability observed in the data is probably a good default (e.g., normal(0,1) if the data are normalized to mean 0 and sd 1).

If you want to play with splines in Stan directly, it’s pretty easy. Here’s code for a simple model where it expects you to compute the splines elsewhere and pass them in as data:

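data{
	int<lower=1> n ; // number of observations
	int<lower=1> num_splines ; // number of basis functions (columns of x)
	vector[n] y ; // outcome; the priors below assume it is scaled to mean 0, sd 1
	matrix[n,num_splines] x ; // spline basis evaluated at each observation
}
parameters{
	real<lower=0> noise ; // residual sd
	real intercept ;
	vector[num_splines] weights ; // one weight per basis function
}
transformed parameters{
	// fitted curve: intercept plus the weighted sum of the basis functions
	vector[n] f = intercept + x * weights ;
}
model{
	noise ~ std_normal() ; // half-normal, given the lower=0 constraint
	intercept ~ std_normal() ;
	weights ~ std_normal() ;
	y ~ normal(f,noise) ;
}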
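For the "compute the splines elsewhere" step, here is a minimal sketch in plain Python of one way to build the matrix passed in as x. It assumes a cubic B-spline basis built with the textbook Cox-de Boor recursion. The knot locations, the x grid and the names `bspline_basis` and `stan_data` are illustrative choices, not something the model above requires.

```python
def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion).

    `knots` is the full non-decreasing knot vector, with each boundary knot
    repeated degree + 1 times. Returns len(knots) - degree - 1 values.
    """
    # Degree 0: indicator of the half-open knot interval that contains x.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        new = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero-width knot span are defined to be zero.
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * basis[i + 1] if right_den > 0 else 0.0
            new.append(left + right)
        basis = new
    return basis


degree = 3
# Assumed knots: boundaries at 0 and 100, interior knots at 25, 50, 75.
knots = [0.0] * (degree + 1) + [25.0, 50.0, 75.0] + [100.0] * (degree + 1)

# Evaluate on a grid strictly inside [0, 100); the half-open intervals above
# mean x == 100 exactly would need special handling.
x_values = [float(v) for v in range(0, 100, 5)]
basis_matrix = [bspline_basis(v, knots, degree) for v in x_values]

# Shape matches `matrix[n,num_splines] x` in the Stan program; y comes from
# your own observations and is left out here.
stan_data = {
    "n": len(basis_matrix),
    "num_splines": len(basis_matrix[0]),
    "x": basis_matrix,
}
print(stan_data["n"], stan_data["num_splines"])
```

Inside the knot range each row of this basis sums to 1, which is a quick sanity check before handing the matrix to Stan.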

stan_beginer · October 26, 2020, 3:21pm

Thanks so much!

If I would like to use B-splines in the model, is it possible to use them for prediction? For example, if I am modeling y on a continuous x ranging within [0,100], is it possible for the spline model to predict the value of y when x falls in the range [100,200]?

benmatthewsed · October 26, 2020, 4:19pm

I know very little about splines in Stan (or in general), but Gavin Simpson has a very good blog post about extrapolating beyond the range of x to make predictions that might be useful - https://fromthebottomoftheheap.net/2020/06/03/extrapolating-with-gams/

franzsf · October 26, 2020, 4:59pm

GAMs are great, but I would be extremely cautious about using smooths to predict anything beyond the range of the data. Re: the link, I wouldn’t bury the lede:

“However, in none of the fits do we get behaviour that get close to fitting the test observations beyond the training of x in the training data, even when using a Gaussian process that supposedly matches at least the general form of the true function.”

mike-lawrence · October 26, 2020, 6:38pm

I echo what others have noted here: you have to be really careful when extrapolating from any model. For smoothing models specifically, different choices for the spline basis will yield very different extrapolation behaviour. Some will gradually revert to the mean of the observed data, others will continue any trends/periodicities that are observed in the data, etc.

Once you’ve chosen a basis, getting predictions is easy. Just extend the basis matrix with rows for the x values you want to predict at:

data{
	int<lower=1> n_obs ; // number of observed outcomes
	int<lower=n_obs> n_tot ; // observed plus predicted
	int<lower=1> num_splines ;
	vector[n_obs] y ;
	matrix[n_tot,num_splines] x ; // basis rows: first n_obs observed, the rest are prediction points
}
parameters{
	real<lower=0> noise ;
	real intercept ;
	vector[num_splines] weights ;
}
transformed parameters{
	// rows after n_obs are the predictions
	vector[n_tot] f = intercept + x * weights ;
}
model{
	noise ~ std_normal() ;
	intercept ~ std_normal() ;
	weights ~ std_normal() ;
	y ~ normal(f[1:n_obs],noise) ; // only the observed rows enter the likelihood
}
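Before extending the basis matrix, it can help to look at what the basis rows themselves do at the new x values. The sketch below is illustrative only: it assumes a clamped cubic B-spline basis (textbook Cox-de Boor definition) and, for contrast, a truncated linear basis, both with knots covering only [0, 100]. All weights, the intercept and the function names are made up. Library implementations may treat out-of-range x differently.

```python
def bspline_basis(x, knots, degree):
    """All B-spline basis functions at x, via the Cox-de Boor recursion."""
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        new = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * basis[i + 1] if right_den > 0 else 0.0
            new.append(left + right)
        basis = new
    return basis


def truncated_linear_basis(x, interior_knots):
    """Columns: x itself, then max(x - k, 0) for each interior knot."""
    return [x] + [max(x - k, 0.0) for k in interior_knots]


interior = [25.0, 50.0, 75.0]
knots = [0.0] * 4 + interior + [100.0] * 4
x_new = [150.0, 200.0]  # both beyond the boundary knot at 100

bs_rows = [bspline_basis(v, knots, 3) for v in x_new]
tp_rows = [truncated_linear_basis(v, interior) for v in x_new]

# Made-up values standing in for one posterior draw.
intercept = 0.3
bs_weights = [0.5, -1.0, 0.8, 1.2, -0.4, 0.9, -0.7]
tp_weights = [0.02, -0.05, 0.04, 0.01]


def predict(rows, weights):
    """intercept + weighted sum of basis columns, one value per row."""
    return [intercept + sum(w * b for w, b in zip(weights, row)) for row in rows]


f_bs = predict(bs_rows, bs_weights)  # basis rows are all zero out here
f_tp = predict(tp_rows, tp_weights)  # keeps going in a straight line
print("B-spline:", f_bs)
print("truncated linear:", f_tp)
```

With these assumptions the B-spline rows beyond the last knot are all zero, so every extrapolated f collapses to the intercept regardless of the weights, while the truncated linear basis continues with a constant slope equal to the sum of its weights. Same Stan program, very different predictions.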

stan_beginer · October 26, 2020, 7:19pm

Thx! So do you mean that extrapolating with splines is generally not recommended (e.g., predicting a future trend)?

franzsf · October 26, 2020, 7:32pm

Since splines can behave very unpredictably outside the range of the data, you really need to be careful.
As Suvorov said, “The bullet is a mad thing. Only the bayonet knows what it is about.”

mike-lawrence · October 26, 2020, 8:09pm

No, there’s no outright proscription against extrapolation; you just need to approach extrapolation very mindfully, including knowing the expected extrapolation behaviour of your tools. To gain that knowledge, it can be very useful to play with simulated data where you can establish the ground truth and observe how well different models recover that truth.

I will say that some tools, for example polynomial smooths, are known to extrapolate terribly, so with those there is indeed a proscription against extrapolation.
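A small simulation along those lines, as a sketch rather than a benchmark: the ground truth is assumed to be sin(x), the training data are noise-free, and an interpolating polynomial (Lagrange form) stands in for a polynomial smooth. The function name `lagrange_eval` and the choice of points are illustrative.

```python
import math


def lagrange_eval(x, xs, ys):
    """Evaluate the unique polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


truth = math.sin  # known ground truth, always within [-1, 1]

# Training data: 7 noise-free points on [0, 6], so the polynomial has degree 6.
x_train = [float(i) for i in range(7)]
y_train = [truth(v) for v in x_train]

# Error between training points, inside the observed range.
x_inside = [0.5 + i for i in range(6)]
err_inside = max(abs(lagrange_eval(v, x_train, y_train) - truth(v))
                 for v in x_inside)

# Error beyond the observed range.
x_outside = [8.0, 10.0, 12.0]
err_outside = [abs(lagrange_eval(v, x_train, y_train) - truth(v))
               for v in x_outside]

print("max error inside [0, 6]:", err_inside)
for v, e in zip(x_outside, err_outside):
    print("error at x =", v, ":", e)
```

The polynomial tracks the truth closely between the training points and then runs off by orders of magnitude a few units past the last one. Swapping in other bases or a noisy y is a cheap way to learn how a given tool behaves before trusting its extrapolations.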
