Correlation between Gaussian processes

Hey there,

I recently posted for the first time in this forum, and the responses were so quick and helpful that I can’t help but ask another question about Gaussian processes that has been bugging me for a while (and for which I can’t seem to find anything relevant online). I was hoping someone could shed some light on it or point me in the right direction.

Imagine that I have two variables \alpha_i and \beta_i for which I have basic information on the differences between indices i. In particular, I want to assume that for the parameters \alpha_i those differences are characterised by a distance matrix D^{\alpha}, and for the parameters \beta_i by a distance matrix D^{\beta}. As a starting point, I can give them independent multivariate normal priors:

\alpha_i \sim \text{MVNormal}\left(\bar{\alpha}, K^{\alpha}\right),
\beta_i \sim \text{MVNormal}\left(\bar{\beta}, K^{\beta}\right),
\bar{\alpha}, \bar{\beta}\sim \text{Normal}\left(0,1\right),
K^{\alpha}_{ij} = \eta_{\alpha}\,\exp(- \rho_{\alpha} {D^{\alpha}_{ij}}^2) +\delta_{ij}\,\sigma_{\alpha},
K^{\beta}_{ij} = \eta_{\beta}\,\exp(- \rho_{\beta} {D^{\beta}_{ij}}^2) +\delta_{ij}\,\sigma_{\beta},
\sigma_{\alpha}, \sigma_{\beta}, \eta_{\alpha}, \eta_{\beta} \sim \text{Exponential}\left(1\right)
\rho_{\alpha}, \rho_{\beta}\sim \text{Exponential}\left(0.5\right)
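
For readers who want to simulate from this prior, here is a minimal NumPy sketch of one of the two independent blocks. The positions `x`, the helper name `gp_cov`, and the reading of Exponential(r) as a rate parameterisation are assumptions made for illustration; they are not part of the model as posted.

```python
import numpy as np

def gp_cov(D, eta, rho, sigma):
    """K_ij = eta * exp(-rho * D_ij^2) + delta_ij * sigma."""
    D = np.asarray(D, dtype=float)
    return eta * np.exp(-rho * D**2) + sigma * np.eye(D.shape[0])

rng = np.random.default_rng(1)

# Hypothetical data: 5 units on a line; D_alpha holds their pairwise distances.
x = np.linspace(0.0, 2.0, 5)
D_alpha = np.abs(x[:, None] - x[None, :])

# One draw of the hyperparameters from the stated priors.
# Assumption: Exponential(r) is rate-parameterised, so NumPy's scale is 1 / r.
eta = rng.exponential(scale=1.0)
sigma = rng.exponential(scale=1.0)
rho = rng.exponential(scale=1.0 / 0.5)
alpha_bar = rng.normal(0.0, 1.0)

# Covariance matrix and one prior draw of the vector alpha.
K_alpha = gp_cov(D_alpha, eta, rho, sigma)
alpha = rng.multivariate_normal(np.full(len(x), alpha_bar), K_alpha)
```

The same function with D^{\beta} and the \beta hyperparameters gives K^{\beta}; under this starting model the two draws share nothing.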

Assume, however, that I expect some correlation between \alpha_i and \beta_i. I can certainly analyse this correlation post hoc by comparing their posterior distributions, but I wonder if there is a way to include it in the model directly. People who know much more than me have suggested several ways to do this. One is to model \alpha_i and \beta_i jointly such that:

\begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix} \sim \text{MVNormal}\left( \begin{bmatrix} \bar{\alpha} \\ \bar{\beta} \end{bmatrix}, K\right),
K=\begin{bmatrix} K^{\alpha} & \nu\\ \nu & K^{\beta} \end{bmatrix},

where \nu is the correlation between \alpha_i and \beta_i. Unfortunately, I have had little success doing this with simulated data. This could well be the result of me messing things up along the way, but I have a feeling that my covariance structure K is missing something. More specifically, I believe that there should be a factor multiplying \nu that accounts for the covariance between alphas and the covariance between betas (i.e. K^{\alpha} and K^{\beta}).
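
As an illustration of what such a factor could look like, the sketch below scales \nu by the Cholesky factors of the two blocks, so the cross-covariance is \nu L^{\alpha} (L^{\beta})^T with K^{\alpha} = L^{\alpha} (L^{\alpha})^T and K^{\beta} = L^{\beta} (L^{\beta})^T. This is one hypothetical construction, not something established in this thread. Its appeal is that the joint K stays positive definite for any |\nu| < 1. The hyperparameter values, the shared distance matrix, and the function names are made up for the example.

```python
import numpy as np

def sq_exp_cov(D, eta, rho, sigma):
    """K_ij = eta * exp(-rho * D_ij^2) + delta_ij * sigma."""
    return eta * np.exp(-rho * D**2) + sigma * np.eye(D.shape[0])

def joint_cov(K_a, K_b, nu):
    """Joint covariance with cross block nu * L_a @ L_b.T (assumed structure)."""
    L_a = np.linalg.cholesky(K_a)
    L_b = np.linalg.cholesky(K_b)
    C = nu * L_a @ L_b.T
    return np.block([[K_a, C], [C.T, K_b]])

# Hypothetical setup: 4 units, the same distances reused for both blocks.
x = np.linspace(0.0, 2.0, 4)
D = np.abs(x[:, None] - x[None, :])
K_alpha = sq_exp_cov(D, eta=1.0, rho=0.5, sigma=0.1)
K_beta = sq_exp_cov(D, eta=2.0, rho=2.0, sigma=0.3)

K = joint_cov(K_alpha, K_beta, nu=0.6)

# A valid covariance matrix needs a strictly positive smallest eigenvalue.
min_eig = np.linalg.eigvalsh(K).min()
print("smallest eigenvalue of K:", min_eig)
```

Under this construction \nu is the correlation between the whitened vectors (L^{\alpha})^{-1}(\alpha - \bar{\alpha}) and (L^{\beta})^{-1}(\beta - \bar{\beta}), which is not in general the same as the elementwise correlation between \alpha_i and \beta_i.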

Alternatively, it has been suggested that I account for this correlation in the prior distribution of the corresponding hyperparameters (i.e. \rho, \sigma and \eta). That is, I would write their prior as a multivariate distribution with several \nu measuring the pairwise correlations between hyperparameters. However, I don’t see how I could then recover the overall correlation between \alpha_i and \beta_i from the correlation of their hyperparameters. Another interesting suggestion was to model \alpha_i and \beta_i as correlated via an LKJ prior and then impose spatial autocorrelation on the residuals through a Gaussian process that includes the distance matrices D, but I haven’t had a chance to try this yet.

Not sure if any of this makes sense or if I am missing something trivial or already discussed in the forum/documentation, but it would be wonderful to hear some ideas on how to approach this.

Thanks in advance for your time!

I just woke up and I’m making coffee…can you write it out as a matrix variate normal? There’s an easy conversion back to a multivariate normal, and then you can investigate that correlation matrix. See the Wikipedia article on the matrix normal distribution.
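
A rough sketch of that conversion, assuming \alpha and \beta are stacked as the two columns of an n-by-2 matrix X: if X is matrix normal with among-row covariance U (n-by-n) and among-column covariance V (2-by-2), then the column-stacked vector [\alpha; \beta] is multivariate normal with covariance V \otimes U. Note that the matrix normal uses a single U for both columns, so in this form K^{\alpha} and K^{\beta} can differ only by a scale factor. All numerical values below are invented for illustration.

```python
import numpy as np

n = 4
x = np.linspace(0.0, 2.0, n)                     # hypothetical positions
D = np.abs(x[:, None] - x[None, :])
U = 1.0 * np.exp(-0.5 * D**2) + 0.1 * np.eye(n)  # covariance shared across indices i

# 2 x 2 covariance between the alpha column and the beta column.
nu = 0.6                                         # assumed correlation
s_alpha, s_beta = 1.0, 2.0                       # assumed column scales
V = np.array([[s_alpha**2, nu * s_alpha * s_beta],
              [nu * s_alpha * s_beta, s_beta**2]])

# Covariance of the stacked vector [alpha; beta] (column-major vec of X).
K = np.kron(V, U)

# One matrix-normal draw: X = M + A Z B^T with A A^T = U and B B^T = V.
rng = np.random.default_rng(0)
M = np.zeros((n, 2))
Z = rng.standard_normal((n, 2))
X = M + np.linalg.cholesky(U) @ Z @ np.linalg.cholesky(V).T
alpha, beta = X[:, 0], X[:, 1]

# Implied correlation between alpha_i and beta_i, read off the joint K.
idx = np.arange(n)
corr = K[idx, n + idx] / np.sqrt(K[idx, idx] * K[n + idx, n + idx])
```

In this structure the correlation between \alpha_i and \beta_i is the same for every i and is set entirely by V.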


I must admit this is the first time I have heard of matrix variate normal distributions. I’ll have a look and see where it takes me. Thanks a lot!

How about modelling things with an intercept & difference parameterization such that:

\alpha_i = I_i + \frac{D_i}{2}
\beta_i = I_i - \frac{D_i}{2}
I_i \sim \text{MVNormal}\left(\bar{I}, K^{I}\right),
D_i \sim \text{MVNormal}\left(\bar{D}, K^{D}\right),


Thank you for the suggestion @mike-lawrence. How would the correlation be defined in your example? Would the K^I and K^D be defined the same as K^{\alpha} and K^{\beta}, respectively?

Yup! So, taking the time to copy your post’s formulas and doing a find-and-replace, the full thing would be:

\alpha_i = I_i + \frac{D_i}{2}
\beta_i = I_i - \frac{D_i}{2}
I_i \sim \text{MVNormal}\left(\bar{I}, K^{I}\right),
D_i \sim \text{MVNormal}\left(\bar{D}, K^{D}\right),
\bar{I}, \bar{D}\sim \text{Normal}\left(0,1\right)
K^{I}_{ij} = \eta_{I}\,\exp(- \rho_{I} {D^{I}_{ij}}^2) +\delta_{ij}\,\sigma_{I}
K^{D}_{ij} = \eta_{D}\,\exp(- \rho_{D} {D^{D}_{ij}}^2) +\delta_{ij}\,\sigma_{D}
\sigma_{I}, \sigma_{D}, \eta_{I}, \eta_{D} \sim \text{Exponential}\left(1\right)
\rho_{I}, \rho_{D}\sim \text{Exponential}\left(0.5\right)
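
To see what correlation this parameterization implies, assume the I and D vectors are independent a priori, as the separate priors suggest. Then Cov(\alpha) = Cov(\beta) = K^{I} + K^{D}/4 and Cov(\alpha, \beta) = K^{I} - K^{D}/4, so the correlation between \alpha_i and \beta_i is governed by the relative sizes of K^{I} and K^{D}. The sketch below checks this numerically; the hyperparameter values are made up, and a single hypothetical distance matrix `dist` is used for both kernels.

```python
import numpy as np

def sq_exp_cov(D, eta, rho, sigma):
    """K_ij = eta * exp(-rho * D_ij^2) + delta_ij * sigma."""
    return eta * np.exp(-rho * D**2) + sigma * np.eye(D.shape[0])

n = 4
x = np.linspace(0.0, 2.0, n)                 # hypothetical positions
dist = np.abs(x[:, None] - x[None, :])
K_I = sq_exp_cov(dist, eta=1.0, rho=0.5, sigma=0.1)
K_D = sq_exp_cov(dist, eta=0.5, rho=2.0, sigma=0.2)

# Linear map from [I; D] to [alpha; beta]: alpha = I + D/2, beta = I - D/2.
eye = np.eye(n)
zero = np.zeros((n, n))
A = np.block([[eye, 0.5 * eye], [eye, -0.5 * eye]])

# Joint covariance of [I; D] under independence, pushed through the map.
K_ID = np.block([[K_I, zero], [zero, K_D]])
K_ab = A @ K_ID @ A.T

cov_alpha = K_ab[:n, :n]      # equals K_I + K_D / 4
cross_cov = K_ab[:n, n:]      # equals K_I - K_D / 4

# Correlation between alpha_i and beta_i for each i.
corr_i = np.diag(cross_cov) / np.diag(cov_alpha)
print(corr_i)
```

One consequence worth noting is that \alpha and \beta share the same marginal covariance in this setup, and the correlation is positive when K^{I} dominates and negative when K^{D}/4 dominates.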


Though be sure to look at the current recommendations for priors for the GP parameters; I think exponential, at least for the \rho parameters, might be expected to behave badly because it concentrates mass in regions of the parameter space where the data can’t distinguish between values very strongly. I always get confused about whether folks’ definition of \rho translates colloquially to “lengthscale” or “inverse-lengthscale”/“wiggliness” (latter being my term :) ), but for both you want to stay away from values very close to zero and very large values. I like a lengthscale ~ weibull(2,x) for this reason, where x is the max distance in the data (or 1 if you simply scale the data so the max distance is 1) and the density is peaked at about 0.7x.
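
To check where a weibull(2, x) prior actually puts its mass, here is a small standard-library sketch. It assumes the shape/scale parameterisation (shape 2, scale x) and sets x to 1, i.e. distances rescaled so the maximum is 1; the variable names are illustrative.

```python
import math

def weibull_pdf(l, shape, scale):
    """Weibull density with the given shape and scale, zero for l < 0."""
    if l < 0:
        return 0.0
    z = l / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

shape, x_max = 2.0, 1.0   # x_max: assumed maximum distance after rescaling

# Locate the peak of the density on a fine grid over (0, 3 * x_max].
grid = [i / 1000 * x_max for i in range(1, 3001)]
mode = max(grid, key=lambda l: weibull_pdf(l, shape, x_max))

# Closed-form median and the prior mass on very short lengthscales.
median = x_max * math.log(2.0) ** (1.0 / shape)
mass_below_tenth = 1.0 - math.exp(-((0.1 * x_max / x_max) ** shape))

print("mode:", mode)
print("median:", median)
print("P(lengthscale < 0.1 * x_max):", mass_below_tenth)
```

The point of the check is the last number: very little prior mass sits near zero, which is the region the post recommends avoiding.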
