Hierarchical model with temporal autocorrelation

Hi to all!

I’m trying to fit a hierarchical model to (processed) EEG data from multiple patients with many timepoints each. The outcome peak_alpha is the frequency (in Hz) of the alpha-wave peak, the predictor ce_mac is the MAC of the volatile anaesthetic used, and time is the timestamp in seconds, which differs between patients. Our hypothesis is that peak_alpha decreases with increasing ce_mac. We want to account for differences between individuals with group-level effects, so we chose a mixed-model approach.

Data looks like this:

head(data, 10)
#> # A tibble: 10 x 4
#> # Groups:   pid [1]
#>    pid    time ce_mac peak_alpha
#>    <fct> <int>  <dbl>      <dbl>
#>  1 19       22      1       8.17
#>  2 19       37      1       8.17
#>  3 19       52      1       8.01
#>  4 19       67      1       8.1 
#>  5 19       82      1       8.47
#>  6 19       97      1       8.22
#>  7 19      112      1       7.74
#>  8 19      127      1       7.88
#>  9 19      142      1       7.96
#> 10 19      157      1       8.17
tail(data, 10)
#> # A tibble: 10 x 4
#> # Groups:   pid [1]
#>    pid    time ce_mac peak_alpha
#>    <fct> <int>  <dbl>      <dbl>
#>  1 1071   3825  0.931       8.28
#>  2 1071   3840  0.935       7.91
#>  3 1071   3855  0.935       8.34
#>  4 1071   3870  0.937       8.37
#>  5 1071   3885  0.942       8.03
#>  6 1071   3900  0.940       8.2 
#>  7 1071   3915  0.941       8.26
#>  8 1071   3930  0.941       7.85
#>  9 1071   3945  0.945       8.2 
#> 10 1071   3960  0.946       8.17

Created on 2020-07-27 by the reprex package (v0.3.0)

A sample of 9 patients looks like this:

ggplot(data = sample, aes(x = ce_mac, y = peak_alpha)) +
  geom_point() +
  facet_wrap(vars(pid))

ggplot(data = sample, aes(x = time, y = ce_mac)) +
  geom_point() +
  facet_wrap(vars(pid))

ggplot(data = sample, aes(x = time, y = peak_alpha)) +
  geom_point() +
  facet_wrap(vars(pid))


The outcome variable is lognormally distributed and truncated with a lower bound of 6, so we ran the following model:

priors <-
  prior(normal(0, 1), class = b) +
  prior(normal(0, 1), class = sd) +
  prior(normal(0, 1), class = sigma) +
  prior(student_t(30, 2.3, 0.2), class = Intercept)

model <- brm(
  bf(peak_alpha | trunc(lb = 6) ~ ce_mac + (1 + ce_mac || pid)),
  data = data,
  family = lognormal(),
  prior = priors,
  iter = 4000,
  warmup = 1000,
  control = list(adapt_delta = 0.999, max_treedepth = 15)
)

check_hmc_diagnostics(model$fit)
#> 
#> Divergences:
#> 0 of 12000 iterations ended with a divergence.
#> 
#> Tree depth:
#> 0 of 12000 iterations saturated the maximum tree depth of 15.
#> 
#> Energy:
#> E-BFMI indicated no pathological behavior.

summary(model)
#>  Family: lognormal 
#>   Links: mu = identity; sigma = identity 
#> Formula: peak_alpha | trunc(lb = 6) ~ ce_mac + (1 + ce_mac || pid) 
#>    Data: pluck(helper.model_list, "data", helper.model_num) (Number of observations: 14326) 
#> Samples: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
#>          total post-warmup samples = 12000
#> 
#> Group-Level Effects: 
#> ~pid (Number of levels: 74) 
#>               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(Intercept)     0.23      0.02     0.19     0.27 1.00     2094     4152
#> sd(ce_mac)        0.23      0.02     0.20     0.28 1.00     2628     5195
#> 
#> Population-Level Effects: 
#>           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept     2.39      0.03     2.34     2.45 1.01     1136     2071
#> ce_mac       -0.30      0.03    -0.35    -0.24 1.00     1220     2471
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma     0.08      0.00     0.08     0.08 1.00    23269     8752
#> 
#> Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).


mcmc_plot(model, type = "trace")
#> No divergences to plot.

mcmc_plot(model, type = "pairs")



What I’m concerned about is that we need >10,000 iterations to get a reasonable ESS, and that the ACF plot looks like this:

mcmc_plot(model, type = "acf")



My question:

  • Do we have to be concerned about this? Can we trust the posterior estimates?
  • Do you need more information?

  • Operating System: CentOS Cluster
  • brms Version: 2.13.0

I guess my first question is how are you handling the time component? Does just this

peak_alpha | trunc(lb = 6) ~ ce_mac

run fine?

Hi Ara!

I think it runs fine with far fewer iterations:

summary(model)
#>  Family: lognormal 
#>   Links: mu = identity; sigma = identity 
#> Formula: peak_alpha | trunc(lb = 6) ~ ce_mac 
#>    Data: pluck(helper.model_list, "data", helper.model_num) (Number of observations: 14326) 
#> Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup samples = 4000
#> 
#> Population-Level Effects: 
#>           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept     2.33      0.01     2.32     2.34 1.00     2152     2028
#> ce_mac       -0.24      0.01    -0.25    -0.22 1.00     2035     1968
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma     0.16      0.00     0.16     0.16 1.00     1844     1844
#> 
#> Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

mcmc_plot(model, type = "acf")



I tried to incorporate an autoregressive term with ar(time = time, gr = pid, p = 1, cov = TRUE), but the model runs for days without ever converging… Even the model in my original post runs for over 12 hours, and I have only used 1% of the full cleaned dataset (which is ~1.4 million observations).
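For reference, the AR(1) attempt looked roughly like this (a sketch, assuming the same data, priors, and sampler settings were reused; `model_ar` is a hypothetical name, and the call was never run to completion):

```r
# Sketch: the original model plus a per-patient AR(1) term on the
# residuals. With cov = TRUE, brms models the residual covariance
# explicitly, which is presumably what made sampling take days here.
model_ar <- brm(
  bf(peak_alpha | trunc(lb = 6) ~ ce_mac + (1 + ce_mac || pid) +
       ar(time = time, gr = pid, p = 1, cov = TRUE)),
  data = data,
  family = lognormal(),
  prior = priors,
  iter = 4000,
  warmup = 1000,
  control = list(adapt_delta = 0.999, max_treedepth = 15)
)
```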


How many data points are in the data set you’re using to test this model? Is it 0.33 of the 4.6 million?

Do you have a simulated data set that is way smaller to troubleshoot the model on?

Sorry, I had something wrong in the above post, I made an edit.

We had to thin the dataset to get the model to run: we kept only data points every 15 seconds (also taking rolling means of the outcome variable), partitioned the data into a training set of 75% of the data, and from that randomly selected 20% of the patients to train the model, so it’s about 1% of the cleaned data.
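The thinning step might look something like this (a base-R sketch of the idea only; `raw`, `thin_patient`, and `train_pids` are hypothetical names, and the toy data below merely stands in for the full cleaned dataset):

```r
set.seed(1)

# Toy raw data with the same four columns as the real dataset (hypothetical).
raw <- data.frame(
  pid        = rep(c("19", "1071"), each = 600),
  time       = rep(1:600, times = 2),
  ce_mac     = 1,
  peak_alpha = 8 + rnorm(1200, sd = 0.2)
)

# Keep one observation every 15 seconds per patient, replacing the
# outcome with a rolling mean over the preceding 15-second window.
thin_patient <- function(d, step = 15) {
  sm   <- stats::filter(d$peak_alpha, rep(1 / step, step), sides = 1)
  keep <- seq(step, nrow(d), by = step)
  transform(d[keep, ], peak_alpha = as.numeric(sm[keep]))
}
thinned <- do.call(rbind, lapply(split(raw, raw$pid), thin_patient))

# Then: hold out 25% as a test set, and fit on a random 20% of patients.
pids       <- unique(thinned$pid)
train_pids <- sample(pids, size = ceiling(0.2 * length(pids)))
```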

I tried it with a minimal sample to add the AR(1) term, but it never converged…

What I’m concerned about is that we need >10,000 iterations to get a reasonable ESS, and that the ACF plot looks like this:

Your bulk and tail-area ESS look high enough, and actually quite a bit higher than needed for most purposes. As I understand it (others should weigh in on this too), autocorrelation in the chains shouldn’t bias the posterior; it just reduces the ESS.

That doesn’t address the fact that you have a huge data set and the model might be slow already, but I think you’re probably running it longer than you need to.


Thank you for your input!

So what Bulk_ESS and Tail_ESS sizes would you suggest? Bulk always seems to reach only about half of Tail for all parameters except sigma…

It depends what you’re doing with the results. Actually I’d be more curious to see how others answer that question but here are my thoughts.

You do have lots of parameters in that model besides what’s on the printout. If it were me, I’d look at stan_ess(stanfit) + theme_classic() (I use that theme because the default doesn’t always provide readable y-axis labels) and make sure they’re all above ~300. I always go for far more to be safe, but I don’t have to wait all week for it. Off the top of my head, if I were wondering what’s enough for my model, I might start experimenting by drawing n = ESS samples from rnorm and applying whatever operations will later be applied to the posterior, to get a feel for the variability of the results.
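That rnorm experiment might look like this (a sketch; the value 1000 is just a stand-in for the ESS in question, and the 97.5% quantile stands in for whatever summary you actually compute from the posterior):

```r
set.seed(42)

# Repeatedly draw n = ESS standard-normal samples and see how much a
# downstream summary (here, a 95%-interval endpoint) wobbles across draws.
ess <- 1000
upper_bounds <- replicate(200, quantile(rnorm(ess), probs = 0.975))

# Spread of the estimated upper bound across repeated mock "posteriors";
# if this wobble is small relative to your decisions, the ESS is enough.
sd(upper_bounds)
```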

If you’re doing comparisons (like posterior_a - posterior_b) you’ll want more samples; similarly if you’re really interested in tail-area probabilities.

The standard error of the estimate is useful too, Est.Error:

#>           Estimate Est.Error
#> Intercept     2.39      0.03
#> ce_mac       -0.30      0.03

I don’t think this plot shows anything over ~300… Should I run far more samples?

You might be okay; the plot just makes you do the math on the x-axis (note the x-axis is ESS / sample size), and you’ll have to figure out how to get a printout of those ESS values so you can find exactly what the smallest one is.

Assuming this is the same model you started out with in the first post, your sample size is 12,000, so your smallest ratio on the x-axis could be around 0.03 and still potentially be not too bad.

#> Samples: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
#>          total post-warmup samples = 12000

plot <- stan_ess(model$fit)
min(plot$data$stat)*12000
#> [1] 1086.241


If I am doing the math right, everything should be okay :-)

Oh yeah, that looks pretty good (though I’ve never used or seen that data$stat slot before).