Multilevel Gaussian Proccess with spectral mixture kernel

Dear colleagues in the Stan community,

I need help to build some spectral decomposition models for applications in mental health research. Any comments would be very valuable. I am also open for collaborations if someone is interested in helping further with the analysis. I intend to share the resulting templates.

My goal is to explore the underlying periodic components in behavior. Parameters may vary within each subject and according to covariates (e.g. age, biological sex, mental disorder diagnostic).

The data contains several time-series from two experiment with different duration (either n_{max}=42 or n_{max}=112, not counting missing values) . Responses are in a Likert scale (1-5) and it seems to be Bernoulli distributed (peak on 3). The data size is considerably large (2500+ time-series), although many samples contain missing values.

Currently, I am trying to deploy GPs with spectral mixture kernel from this paper, as reffered in another topic by user @flaxter.
The kernel K(x,x') is as follows (using x-x'=\tau):
k_{SM}(\tau) = \sum_{q=1}^{Q}w_{q} \prod_{p=1}^{P}\exp(-2\pi^2\tau^2_{p}v_{q}^{(p)})cos(2\pi\tau_{p}\mu_{q}^{(p)})

The weights w_{q} specify the relative contribution of each mixture component. The inverse means 1/\mu_{q} are the component periods, and the inverse standard deviations 1/\sqrt{v_{q}} length-scales, determining how quickly a component varies with the inputs x.

The corresponding code for two component mixture:

matrix[n1, n1] Sigma1;

  for (i in 1:n1) {
    Sigma1[i, i] <- var1 + var2;
    for (j in (i+1):n1) {
      Sigma1[i, j] <- var1 * exp(-(x1[i]-x1[j])^2*bw1) * cos(twopi * fabs(x1[i] - x1[j]) * period1) +
                      var2 * exp(-(x1[i]-x1[j])^2*bw2) * cos(twopi * fabs(x1[i] - x1[j]) * period2);
      Sigma1[j, i] <- Sigma1[i, j];

I intend to add hierarchical models in order to fit the multiple samples. Specifically, latent and /or random effects according to individual ID and covariate(s).

  • Writing code for simple hierarchical modeling considering random effects and covariates on Stan seems to be complex. I foresee very complicated code when changing the kernel, introducing the latent terms and dealing with missing data. Is there a straightforward way to do it? What would be the best source to get a template to tweak? Using brms (e.g.brms::bf(y~gp(x, by=covariate)) introduces a spectral density function which confuses me.

  • What is the best way to introduce translations as random effects by individual ID? Maybe adding an \alpha parameter in the cosine term, as in cos(\alpha + 2\pi\tau_{p}\mu_{q}^{(p)}) ?

  • I need to introduce covariate influences on kernel parameters. I should do multilevel modeling on \mu_{q} for frequency and w_{q} for amplitude, correct? My intuition on v_{q} remains poor, but I believe it is related to how amplitude changes over time. Does it also influence frequency?

  • The random fourier features by @flaxter based on this paper also seems to be an interesting simple alternative. If I got it right, it approximates the RBF kernel with simpler trigonometric components using linear methods. Correct if I’m wrong, but then the Fourier components would perform similarly to RBF but would not be very good to filter out other patterns from potentially “real” periodic signal generating the data.

  • Also, @bbbales2 pointed this cookbook showing a periodic kernel with the trigonometric term in \exp , for which @martinmodrak provided code. It also contains a locally periodic one:
    k_{\textrm{LocalPer}}(x,x') = k_{\textrm{Per}}(\tau)k_{\textrm{SE}}(\tau) = \exp\left(-\frac{(\tau)^2}{2\ell^2}\right) \sigma^2\exp\left(-\frac{2\sin^2(\pi|\tau|/p)}{\ell^2}\right)
    Is it a good alternative to test for the mixture kernel or is it excessively complex?

Sorry for some many questions, but each debugging interaction with these models takes a lot of time. Thanks in advance, specially for the material already available in the forum.

Best regards


Can you write this model totally in brms?

If you’re doing a complicated hierarchical model mixed with gps, then just leaning on brms is probably the simplest thing.

Is there a reason this might not work? We can do approximate GPs and fix hyperparameters to constants in brms now so I think it should be the go-to place to experiment with these things (hyperparameters:, for approx gps check the ‘k’ argument passed to gp in brms).