 # Combining spline and state space model

I have a bunch of time series with interventions in some, whose impact I want to model. First, I replicated the CausalImpact package in Stan to fit a state space model (thanks to another post on this forum). Basically what I get from this model is an estimate of the effect from observed post-intervention values minus the predicted values from the model. Here’s my replication of the CausalImpact package:

Now, I’d like to make a hierarchical model across interventions to account for correlations and pull out some system-wide parameters. My idea was to fit a spline to the effect values since I don’t have a formula for what effects should look like. Here’s my (lengthy) Stan model, all the spline generation code can safely be ignored (I got it from the Stan case study):

``````functions {
vector build_b_spline(real[] t, real[] ext_knots, int ind, int order);
vector build_b_spline(real[] t, real[] ext_knots, int ind, int order) {
// INPUTS:
//    t:          the points at which the b_spline is calculated
//    ext_knots:  the set of extended knots
//    ind:        the index of the b_spline
//    order:      the order of the b-spline
vector[size(t)] b_spline;
vector[size(t)] w1 = rep_vector(0, size(t));
vector[size(t)] w2 = rep_vector(0, size(t));
if (order==1)
for (i in 1:size(t)) // B-splines of order 1 are piece-wise constant
b_spline[i] = (ext_knots[ind] <= t[i]) && (t[i] < ext_knots[ind+1]);
else {
if (ext_knots[ind] != ext_knots[ind+order-1])
w1 = (to_vector(t) - rep_vector(ext_knots[ind], size(t))) /
(ext_knots[ind+order-1] - ext_knots[ind]);
if (ext_knots[ind+1] != ext_knots[ind+order])
w2 = 1 - (to_vector(t) - rep_vector(ext_knots[ind+1], size(t))) /
(ext_knots[ind+order] - ext_knots[ind+1]);
// Calculating the B-spline recursively as linear interpolation of two lower-order splines
b_spline = w1 .* build_b_spline(t, ext_knots, ind, order-1) +
w2 .* build_b_spline(t, ext_knots, ind+1, order-1);
}
return b_spline;
}
}

data {
int<lower=2> T;
int<lower=1> T0;
int<lower=1> N;
matrix[T, N] X;
vector[T] y;

int num_knots;            // num of knots
vector[num_knots] knots;  // the sequence of knots
int spline_degree;        // the degree of spline (is equal to order - 1)
real spline_points[T];
}

transformed data {
int num_basis = num_knots + spline_degree - 1; // total number of B-splines
matrix[num_basis, T] B;  // matrix of B-splines
vector[spline_degree + num_knots] ext_knots_temp;
vector[2*spline_degree + num_knots] ext_knots; // set of extended knots
ext_knots_temp = append_row(rep_vector(knots, spline_degree), knots);
ext_knots = append_row(ext_knots_temp, rep_vector(knots[num_knots], spline_degree));
for (ind in 1:num_basis)
B[ind,:] = to_row_vector(build_b_spline(spline_points, to_array_1d(ext_knots), ind, spline_degree + 1));
B[num_knots + spline_degree - 1, T] = 1;
}

parameters {
vector[T] u_err; //Slope innovation
vector[T] v_err; //Level innovation
vector[N] beta;
real<lower=0> s_obs;
real<lower=0> s_slope;
real<lower=0> s_level;a

row_vector[num_basis] a_raw;
real a0;  // intercept
real<lower=0> tau;
}

transformed parameters {
vector[T] u; //Level
vector[T] v; //Slope

row_vector[num_basis] a; // spline coefficients
vector[T] spline_hat;

// state space
u = u_err;
v = v_err;
for (t in 2:T) {
u[t] = u[t-1] + v[t-1] + s_level * u_err[t];
v[t] = v[t-1] + s_slope * v_err[t];
}

// spline values
a = a_raw;
for (i in 2:num_basis)
a[i] = a[i-1] + a_raw[i]*tau;
spline_hat = a0*to_vector(spline_points) + to_vector(a*B);
}

model {
// state space priors
u_err ~ normal(0, 1);
v_err ~ normal(0, 1);
beta ~ normal(0, 1);
s_obs ~ normal(0, 5);
s_slope ~ normal(0, 0.1);
s_level ~ normal(0, 0.1);

// spline priors
a_raw ~ normal(0, 1);
a0 ~ normal(0, 1);
tau ~ normal(0, 1);

// fitting state-space model to pre-intervention data
y[1:T0] ~ normal(u[1:T0] + X[1:T0, :]*beta, s_obs);
// fitting spline data to pre-intervention effects (~0)
spline_hat[1:T0] ~ normal(u[1:T0] + X[1:T0, :]*beta - y[1:T0], s_obs);

// fitting spline to post-intervention data
spline_hat[T0+1:T] ~ normal(u[T0+1:T] + X[T0+1:T, :]*beta - y[T0+1:T], s_obs);

}
``````

The issue is that this doesn’t converge at all. I’m trying to fit a spline to the modeled post-intervention effects but these effects are already not super-well defined. The spline and effects seem to be too correlated. How can I make the spline fit to the effect, instead of having the model fit them together?

Then if this works, I guess I’ll make a hierarchical model over spline parameters?

Thanks for any help, critiques of my overall approach are welcome!!

2 Likes

Okay, I re-wrote the same model but with better grouping of parameters on the right side of likelihood terms and it seems to help quite a bit:

``````functions {
vector build_b_spline(real[] t, real[] ext_knots, int ind, int order);
vector build_b_spline(real[] t, real[] ext_knots, int ind, int order) {
// INPUTS:
//    t:          the points at which the b_spline is calculated
//    ext_knots:  the set of extended knots
//    ind:        the index of the b_spline
//    order:      the order of the b-spline
vector[size(t)] b_spline;
vector[size(t)] w1 = rep_vector(0, size(t));
vector[size(t)] w2 = rep_vector(0, size(t));
if (order==1)
for (i in 1:size(t)) // B-splines of order 1 are piece-wise constant
b_spline[i] = (ext_knots[ind] <= t[i]) && (t[i] < ext_knots[ind+1]);
else {
if (ext_knots[ind] != ext_knots[ind+order-1])
w1 = (to_vector(t) - rep_vector(ext_knots[ind], size(t))) /
(ext_knots[ind+order-1] - ext_knots[ind]);
if (ext_knots[ind+1] != ext_knots[ind+order])
w2 = 1 - (to_vector(t) - rep_vector(ext_knots[ind+1], size(t))) /
(ext_knots[ind+order] - ext_knots[ind+1]);
// Calculating the B-spline recursively as linear interpolation of two lower-order splines
b_spline = w1 .* build_b_spline(t, ext_knots, ind, order-1) +
w2 .* build_b_spline(t, ext_knots, ind+1, order-1);
}
return b_spline;
}
}

data {
int<lower=2> T;
int<lower=1> T0;
int<lower=1> N;
matrix[T, N] X;
vector[T] y;

int num_knots;            // num of knots
vector[num_knots] knots;  // the sequence of knots
int spline_degree;        // the degree of spline (is equal to order - 1)
real spline_points[T];
}

transformed data {
int num_basis = num_knots + spline_degree - 1; // total number of B-splines
matrix[num_basis, T] B;  // matrix of B-splines
vector[spline_degree + num_knots] ext_knots_temp;
vector[2*spline_degree + num_knots] ext_knots; // set of extended knots
ext_knots_temp = append_row(rep_vector(knots, spline_degree), knots);
ext_knots = append_row(ext_knots_temp, rep_vector(knots[num_knots], spline_degree));
for (ind in 1:num_basis)
B[ind,:] = to_row_vector(build_b_spline(spline_points, to_array_1d(ext_knots), ind, spline_degree + 1));
B[num_knots + spline_degree - 1, T] = 1;
}

parameters {
vector[T] u_err; //Slope innovation
vector[T] v_err; //Level innovation
vector[N] beta;
real<lower=0> s_obs;
real<lower=0> s_slope;
real<lower=0> s_level;

row_vector[num_basis] a_raw;
real a0;  // intercept
real<lower=0> tau;
real<lower=0> s_spline;
}

transformed parameters {
vector[T] u; //Level
vector[T] v; //Slope

row_vector[num_basis] a; // spline coefficients
vector[T] spline_hat;

// state space
u = u_err;
v = v_err;
for (t in 2:T) {
u[t] = u[t-1] + v[t-1] + s_level * u_err[t];
v[t] = v[t-1] + s_slope * v_err[t];
}

// spline values
a = a_raw;
for (i in 2:num_basis)
a[i] = a[i-1] + a_raw[i]*tau;
spline_hat = a0*to_vector(spline_points) + to_vector(a*B);
}

model {
// state space priors
u_err ~ normal(0, 1);
v_err ~ normal(0, 1);
beta ~ normal(0, 1);
s_obs ~ normal(0, 5);
s_slope ~ normal(0, 0.1);
s_level ~ normal(0, 0.1);

// spline priors
a_raw ~ normal(0, 1);
a0 ~ normal(0, 1);
tau ~ normal(0, 1);
s_spline ~ normal(0, 5);

// fitting state-space model to pre-intervention data
y[1:T0] ~ normal(u[1:T0] + X[1:T0, :]*beta, s_obs);

// fitting spline as effect to make state-space prediction match observed data
y ~ normal(u + X*beta - spline_hat, s_spline);

}
``````

The spline seems to match the expected effect pretty well now:

But there must be a better way than my approach, I doubt I can successfully fit this across multiple interventions without tons of divergences (I still have some even now).

1 Like

I’ve seen a couple of these posts, I’m sorry I don’t have much to add except I’m really confused about the two step approach! How sure are you that is really what you want?

1 Like

Hi Charles, not terribly sure at all… I like the idea of estimating the impact via a state-space model but from there I’m just making things up. Any way of modeling across multiple interventions would be fine. Do you have any suggestions?

I have some discussion about intervention modelling with state space models, maybe you can translate some of the ideas? https://www.researchgate.net/profile/Charles_Driver/publication/323457904_Understanding_the_time_course_of_interventions_with_continuous_time_dynamic_models/links/5bdef77b299bf1124fba5c00/Understanding-the-time-course-of-interventions-with-continuous-time-dynamic-models.pdf

1 Like

Thanks, I’ll try to wrap my head around all this and see if I can find an alternative.