State space models with brms


#1

Hello, all! I’ve been futzing around with state space models recently and have been trying to figure out their proper implementation.

So, consider, for example, the Nile dataset.

library(fpp2)
library(forecast)

#from base
fitNile <- StructTS(Nile, "level")

cat("Transitional variance:", fitNile$coef["level"],
    "\n", "Observational variance:", fitNile$coef["epsilon"],
    "\n", "Initial level:", fitNile$model0$a, "\n")

autoplot(Nile) +
  geom_line(y=fitted(fitNile), col="red", lty=2) +
  ggtitle("Fit model from State Space Model")

Works great! Now, I’m trying to figure out the proper implementation in brms. Using


library(brms)
library(tidybayes)
#n.b. tidybayes is amazing - from
#https://github.com/mjskay/tidybayes

Nile_df <- as.data.frame(Nile) %>%
  mutate(t = 1871:1970) %>%
  mutate(x_cent = x-mean(x)) #centering helps

#not sure if this works - trying
#to find a better example....
nile_brms <- brm(x_cent ~ 1,
                 autocor = cor_bsts(~ t),
                 data = Nile_df)

seems to produce OK parameter estimates (although still not quite at good convergence). But, I only get a fitted mean


#plot
nile_results <- Nile_df  %>%
  add_fitted_samples(nile_brms) 

#show the fit and CIs
ggplot(nile_results, aes(x=t)) +
  stat_lineribbon(aes(y = estimate+mean(x)), .prob = c(.99, .95, .8, .5)) +
  geom_line(aes(y = x)) +
  scale_fill_brewer()

shows this quite well.

Thoughts - have any of you done something similar>


#2

The problem is that the cor_bsts() is ignored when predicting with newdata currently (as indicated by the warning you see). Because of this and considerable convergence problems, I recently deprecated this correlation structure. See the end of https://github.com/paul-buerkner/brms/issues/148 for more details.

To get predictions including the BSTS parameters you have to run them on the existing data via fitted(nile_brms). Not sure if tidybayes has an option for this.


#3

Here is how you can get an appropriate plot

library(tidyverse)
pred <- cbind(Nile_df, fitted(nile_brms))
names(pred) <- make.names(names(pred))
pred %>%
  ggplot(aes(t, Estimate)) +
  geom_smooth(
    aes(ymin = X2.5.ile, ymax = X97.5.ile),
    stat = "identity"
  ) + 
  geom_point(aes(t, x_cent), inherit.aes = FALSE)

#4

You could alternatively use splines or Gaussian processes:

nile_brms2 <- brm(x_cent ~ s(t), data = Nile_df)
nile_brms3 <- brm(x_cent ~ gp(t), data = Nile_df, chains = 1)  # takes very long

They should work with tidybayes out of the box.


#5

Thanks - those work, but produce fairly different answers as compared to likelihood fits of state space models - and look at the data conceptually in a different way - than state space models. I’ll try bsts - http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html - or other solution. Hrm. I feel like there should be a way to incorporate the kalman filter somehow (e.g. https://jrnold.github.io/ssmodels-in-stan/ ) but I am not a STAN wizard. It’s why I use brms! Thanks, though!


#6

Here’s the in-progress Stan code I’ve been playing with to fit Bayesian structural time series, at least the local level model:

data {                                                                                                
  int<lower=0> T;                                                                                     
  vector[T] y;                                                                                        
}                                                                                                     
transformed data {                                                                                    
  real sd_y;                                                                                          
}                                                                                                     
parameters {                                                                                          
  vector[T] u_err;                                                                                    
  real<lower=0> u_tau;                                                                                
                                                                                                      
  real<lower=0> y_std;                                                                                
  real<lower=0> y_tau;                                                                                
}                                                                                                     
transformed parameters {                                                                              
  vector[T] u;                                                                                        
  real y_err;                                                                                         
                                                                                                      
  u[1] = y[1] + u_tau * u_err[1];                                                                     
  for (t in 2:T) {                                                                                    
    u[t] = u[t-1] + u_tau * u_err[t-1];                                                               
  }                                                                                                   
  y_err = y_tau * y_std;                                                                              
}                                                                                                     
model {                                                                                               
  // priors                                                                                           
  u_err ~ normal(0, 1);                                                                               
  u_tau ~ exponential(1); // divide by sd_?                                                                  
  y_std ~ normal(0, 1);                                                                               
  y_tau ~ exponential(1 / sd_y);                                                                      
                                                                                                      
  // likelihood                                                                                       
  y ~ normal(u, y_err);                                                                               
}                                                                                                     
generated quantities {                                                                                
  vector[T] y_pred;                                                                                   
  for (t in 1:T)                                                                                      
    y_pred[t] = normal_rng(u[t], y_err);                                                              
}  

In my opinion, the above model is still missing a reasonable default scaling for the level, u_tau. I’d appreciate some thoughts here, if anybody has them. Thanks.