Local linear trend model

blackrabite · April 2, 2021, 1:10pm

Hello, I was trying to implement a simple local linear trend model, defined as follow:

y_t = x_t + \epsilon_t
\mu_{t+1} = \mu_t + v_t + \xi_t
v_{t+1} = v_t + \zeta_t

I used the same code provided by (RPubs - 状態空間時系列分析入門第3章 (1/3))
which I copy below:

data {
  int<lower=1> n;
  vector[n] y;
}
parameters {
  vector[n] mu;
  vector[n-1] v;
  real<lower=0> sigma_level;
  real<lower=0> sigma_drift;
  real<lower=0> sigma_irreg;
}
transformed parameters {
  vector[n] yhat;
  for(t in 1:n) {
    yhat[t] <- mu[t];
  }
}
model {
  for(t in 2:n-1)
    v[t] ~ normal(v[t-1], sigma_drift);
  for(t in 2:n)
    mu[t] ~ normal(mu[t-1] + v[t-1], sigma_level);
  for(t in 1:n)
    y[t] ~ normal(yhat[t], sigma_irreg);
}

I fitted the model on the Nile data with RStan, but I get quite poor fitting:

llt <- stan_model("fig03_01.stan")
fit <- sampling(llt, data=list(y=Nile, n=length(Nile)), chains=3, cores=3)

Warning messages:
1: There were 4 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. 
2: There were 2000 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
3: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
4: Examine the pairs() plot to diagnose sampling problems
 
5: The largest R-hat is 1.83, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

The sigma for the varying slope is especially hard to estimate:

print(fit, pars=grep("sigma.*", names(fit), value = T), digits=5)
Inference for Stan model: fig03_01.
3 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=3000.

             mean se_mean      sd     2.5%      25%      50%       75%     97.5% n_eff    Rhat
sigma[1]  0.43861 0.25419 0.55280  0.03073  0.10549  0.26646   0.51332   2.20960     5 1.53057
sigma[2] 84.86377 0.32654 7.39183 69.03075 80.43926 84.85241  89.78151  98.74073   512 1.01249
sigma[3] 96.38806 0.39164 9.57975 81.67447 89.60247 95.25685 101.21278 118.55008   598 1.00298

This usually replicate on any dataset I fit this model on.
any ideas on how to improve this?

thank you

Ara_Winter · April 2, 2021, 1:27pm

Morning,

Can you provide the data or a simulation of the data you are putting through the model?

blackrabite · April 2, 2021, 1:31pm

Hi Ara, the data I am using in the example is the Nile data provided (I think) in the default R installation:

> Time Series:
> Start = 1871 
> End = 1970 
> Frequency = 1 
>   [1] 1120 1160  963 1210 1160 1160  813 1230 1370 1140  995  935 1110  994 1020  960 1180  799  958 1140 1100 1210 1150 1250 1260
>  [26] 1220 1030 1100  774  840  874  694  940  833  701  916  692 1020 1050  969  831  726  456  824  702 1120 1100  832  764  821
>  [51]  768  845  864  862  698  845  744  796 1040  759  781  865  845  944  984  897  822 1010  771  676  649  846  812  742  801
>  [76] 1040  860  874  848  890  744  749  838 1050  918  986  797  923  975  815 1020  906  901 1170  912  746  919  718  714  740

https://stat.ethz.ch/R-manual/R-patched/library/datasets/html/Nile.html

Ara_Winter · April 2, 2021, 6:29pm

I’d try this and take a look at what is going on in the diagnostics and plotting yhat.

library(rstan)
library(tidyverse)
library(bayesplot)
library(ggplot2)

plot(Nile)
hist(Nile)

nile_dat ← as_tibble(Nile)
nile_dat ← tibble::rowid_to_column(nile_dat, “time”)
nile_dat

llt ← stan_model(“fig03_01.stan”)

standata ← within(list(), {
y ← as.vector(Nile)
n ← length(Nile)
})

fit ← stan(file = “./fig03_01.stan”, data = standata,
iter = 4000, chains = 4, cores = 6,
control = list(adapt_delta = 0.999, max_treedepth = 14))

Looking for a fuzzy catapiller
plot(fit, plotfun=“trace”)

list_of_draws ← rstan::extract(fit)
print(names(list_of_draws))

sampler_params ← get_sampler_params(fit, inc_warmup = FALSE)
sampler_params_chain1 ← sampler_params[[1]]
colnames(sampler_params_chain1)

mean_accept_stat_by_chain ← sapply(sampler_params, function(x) mean(x[, “accept_stat__”]))
print(mean_accept_stat_by_chain)

max_treedepth_by_chain ← sapply(sampler_params, function(x) max(x[, “treedepth__”]))
print(max_treedepth_by_chain)
check_treedepth(fit)

check_energy(fit)

yhat ← get_posterior_mean(fit, par = ‘yhat’)[, ‘mean-all chains’]
mu ← get_posterior_mean(fit, par = ‘mu’)[, ‘mean-all chains’]

sigma_drift ← get_posterior_mean(fit, par = ‘sigma_drift’)[, ‘mean-all chains’]
sigma_irreg ← get_posterior_mean(fit, par = ‘sigma_irreg’)[, ‘mean-all chains’]
sigma_level ← get_posterior_mean(fit, par = ‘sigma_level’)[, ‘mean-all chains’]

sigma_drift

sigma_irreg

sigma_level

nile_dat$yhat ← yhat
nile_dat$mu ← mu
nile_dat$lambda ← lambda

nile_dat %>% ggplot()+ geom_line(aes(x=time, y=x)) +
geom_line(aes(x=time, y=yhat, color=“red”))

Ara_Winter · April 2, 2021, 9:16pm

So you can see the model captures the broad trend but not all the spikey stuff.

blackrabite · April 2, 2021, 10:11pm

thanks, how ever even by pushing the sampler that much:

fit <- stan(file = “./fig03_01.stan”, data = standata,
iter = 4000, chains = 4, cores = 6,
control = list(adapt_delta = 0.999, max_treedepth = 14))

the n_eff for slope sigma (even all of the sigmas in this specific results, but usually is only the sigma for the slope) is pretty low:

print(fit, pars=grep("sigma.*", names(fit), value = T), digits=5)

Inference for Stan model: fig03_01.
4 chains, each with iter=4000; warmup=2000; thin=1; 
post-warmup draws per chain=2000, total post-warmup draws=8000.

              mean  se_mean       sd     2.5%      25%      50%       75%     97.5% n_eff    Rhat
sigma[1]   0.63747  0.08388  0.52301  0.09151  0.24002  0.47362   0.89227   1.99028    39 1.15591
sigma[2]  72.98018 12.74609 29.37279  1.30234 76.68098 83.45210  88.52093  98.25111     5 2.70221
sigma[3] 105.73719 10.50207 25.66906 81.45167 90.25155 96.57020 107.24404 174.62873     6 2.11286

Ara_Winter · April 2, 2021, 10:31pm

I’d suspect you can either reparameterize the model or it’s the incorrect model for the data. There are a number of state space models to choose from here: GitHub - sinhrks/stan-statespace: Stan models for state space time series

spinkney · April 3, 2021, 12:18am

Check out the walker package by @helske GitHub & BitBucket HTML Preview. It uses Stan and a Kalman filter. The vignette shows that even relatively simple state space models can suffer from fit issues with the “naive” non-kalman filter code.

Topic		Replies	Views
Semilocal Linear Trend Model, opinions/suggestions welcome Modeling	6	1597	August 13, 2018
Bayesian structural time series modeling Modeling fitting-issues	4	7892	October 19, 2017
Time series model fitting issue Modeling fitting-issues	8	525	April 2, 2020
Reproducing the parameter transformation example from the statmodeling blog Modeling	1	444	January 30, 2019
Persistent divergent transitions in simple model Modeling fitting-issues	5	541	April 17, 2020

Local linear trend model

Related topics