State space models with brms

Hello, all! I’ve been futzing around with state space models recently and have been trying to figure out their proper implementation.

So, consider, for example, the Nile dataset.

library(fpp2)
library(forecast)

#from base
fitNile <- StructTS(Nile, "level")

cat("Transitional variance:", fitNile$coef["level"],
    "\n", "Observational variance:", fitNile$coef["epsilon"],
    "\n", "Initial level:", fitNile$model0$a, "\n")

autoplot(Nile) +
  geom_line(y=fitted(fitNile), col="red", lty=2) +
  ggtitle("Fit model from State Space Model")

Works great! Now, I’m trying to figure out the proper implementation in brms. Using


library(brms)
library(tidybayes)
#n.b. tidybayes is amazing - from
#https://github.com/mjskay/tidybayes

Nile_df <- as.data.frame(Nile) %>%
  mutate(t = 1871:1970) %>%
  mutate(x_cent = x-mean(x)) #centering helps

#not sure if this works - trying
#to find a better example....
nile_brms <- brm(x_cent ~ 1,
                 autocor = cor_bsts(~ t),
                 data = Nile_df)

seems to produce OK parameter estimates (although still not quite at good convergence). But, I only get a fitted mean


#plot
nile_results <- Nile_df  %>%
  add_fitted_samples(nile_brms) 

#show the fit and CIs
ggplot(nile_results, aes(x=t)) +
  stat_lineribbon(aes(y = estimate+mean(x)), .prob = c(.99, .95, .8, .5)) +
  geom_line(aes(y = x)) +
  scale_fill_brewer()

shows this quite well.

Thoughts - have any of you done something similar>

The problem is that the cor_bsts() is ignored when predicting with newdata currently (as indicated by the warning you see). Because of this and considerable convergence problems, I recently deprecated this correlation structure. See the end of https://github.com/paul-buerkner/brms/issues/148 for more details.

To get predictions including the BSTS parameters you have to run them on the existing data via fitted(nile_brms). Not sure if tidybayes has an option for this.

Here is how you can get an appropriate plot

library(tidyverse)
pred <- cbind(Nile_df, fitted(nile_brms))
names(pred) <- make.names(names(pred))
pred %>%
  ggplot(aes(t, Estimate)) +
  geom_smooth(
    aes(ymin = X2.5.ile, ymax = X97.5.ile),
    stat = "identity"
  ) + 
  geom_point(aes(t, x_cent), inherit.aes = FALSE)

You could alternatively use splines or Gaussian processes:

nile_brms2 <- brm(x_cent ~ s(t), data = Nile_df)
nile_brms3 <- brm(x_cent ~ gp(t), data = Nile_df, chains = 1)  # takes very long

They should work with tidybayes out of the box.

1 Like

Thanks - those work, but produce fairly different answers as compared to likelihood fits of state space models - and look at the data conceptually in a different way - than state space models. I’ll try bsts - http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html - or other solution. Hrm. I feel like there should be a way to incorporate the kalman filter somehow (e.g. https://jrnold.github.io/ssmodels-in-stan/ ) but I am not a STAN wizard. It’s why I use brms! Thanks, though!

Here’s the in-progress Stan code I’ve been playing with to fit Bayesian structural time series, at least the local level model:

data {                                                                                                
  int<lower=0> T;                                                                                     
  vector[T] y;                                                                                        
}                                                                                                     
transformed data {                                                                                    
  real sd_y;                                                                                          
}                                                                                                     
parameters {                                                                                          
  vector[T] u_err;                                                                                    
  real<lower=0> u_tau;                                                                                
                                                                                                      
  real<lower=0> y_std;                                                                                
  real<lower=0> y_tau;                                                                                
}                                                                                                     
transformed parameters {                                                                              
  vector[T] u;                                                                                        
  real y_err;                                                                                         
                                                                                                      
  u[1] = y[1] + u_tau * u_err[1];                                                                     
  for (t in 2:T) {                                                                                    
    u[t] = u[t-1] + u_tau * u_err[t-1];                                                               
  }                                                                                                   
  y_err = y_tau * y_std;                                                                              
}                                                                                                     
model {                                                                                               
  // priors                                                                                           
  u_err ~ normal(0, 1);                                                                               
  u_tau ~ exponential(1); // divide by sd_?                                                                  
  y_std ~ normal(0, 1);                                                                               
  y_tau ~ exponential(1 / sd_y);                                                                      
                                                                                                      
  // likelihood                                                                                       
  y ~ normal(u, y_err);                                                                               
}                                                                                                     
generated quantities {                                                                                
  vector[T] y_pred;                                                                                   
  for (t in 1:T)                                                                                      
    y_pred[t] = normal_rng(u[t], y_err);                                                              
}  

In my opinion, the above model is still missing a reasonable default scaling for the level, u_tau. I’d appreciate some thoughts here, if anybody has them. Thanks.

1 Like