Incorporating distance matrices into Gaussian Process in brms

scott.claessens · September 11, 2020, 9:59am

Hello all!

I plan to run multilevel regressions on a global dataset, with random intercepts and slopes grouped by country for my main predictor x, and a Gaussian Process (also grouped by country) to account for both geographic and cultural distance between countries. In brms, this model would look like this:

model <- brm(y ~ 1 + x + gp(geoDist, culturalDist, gr = TRUE, iso = FALSE) + (1 + x | country),
             data = d, family = gaussian)

It seems that when brms takes the argument gp(x), x is required to be a vector column in the dataset. However, instead of having columns in my data called geoDist and culturalDist, I have two NxN distance matrices, much like the spatial autocorrelation example in Richard McElreath’s section on Gaussian Processes in Statistical Rethinking.

I don’t really understand raw Stan, so I was wondering if there would be a way to incorporate these distance matrices into the Gaussian Process using brms without having to modify the Stan code itself. Do I need to somehow reduce my distance matrices to single vectors that brms can understand? Or is there another trick I’m missing?

Thanks so much in advance!

Operating System: Windows 10
brms Version: 2.13.5

torkar · September 11, 2020, 10:45am

Scott, are you saying that doing it like this,

and in particular, from the paragraph starting with,

Before we practice fitting a Gaussian process with …

doesn’t work?

scott.claessens · September 11, 2020, 12:42pm

Thanks for the quick reply!

This approach certainly works for spatial autocorrelation, since we can use the nice coordinate system provided by longitude and latitude values. But for other measures of distance (such as “cultural” distance, etc.), there isn’t an obvious analogous coordinate system that I can see being used in the same way. Instead, I have an NxN distance matrix that has been calculated a priori. Can these distances be translated from the distance matrix into brms?

torkar · September 11, 2020, 12:51pm

hmm, so the “distance” is more of a conceptual latent nature… I would guess that you would be able to handle it in the same way, but I’ve never done it myself… Perhaps @Solomon could pitch in here before we ask Paul.

maxbiostat · September 11, 2020, 1:21pm

This thread might be of interest. @anon75146577 and I never really finished that conversation. I have a nagging feeling that one can indeed get a positive semidefinite matrix under very mild regularity conditions.

Solomon · September 11, 2020, 1:34pm

The fcor() function might be appropriate. See here for an example.

torkar · September 11, 2020, 1:45pm

You mean fcor()?

Solomon · September 11, 2020, 2:27pm

Gah! Yes. I’ve edited my original comment to reflect that

scott.claessens · September 12, 2020, 12:33am

Thanks Solomon. I will have a play around with fcor(). But doesn’t this function take a covariance matrix, not a distance matrix? What if I wanted to feed in raw distances?

In the past, I’ve just converted the distance matrix to a covariance matrix manually beforehand. But I have to make some assumptions about covariance parameters before doing this. I like Gaussian Processes because they estimate those covariance parameters directly.
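To make that manual step concrete, here is a minimal sketch of converting a distance matrix into a covariance matrix. It assumes a squared-exponential kernel of the kind used in Statistical Rethinking; the toy distances and the values of `eta_sq`, `rho_sq`, and `jitter` are invented for illustration. Those fixed values are exactly the assumptions a Gaussian Process would otherwise estimate.

```r
# Toy symmetric distance matrix for four hypothetical countries (made-up values).
countries <- c("A", "B", "C", "D")
D <- matrix(c(0, 1, 2, 3,
              1, 0, 1, 2,
              2, 1, 0, 1,
              3, 2, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(countries, countries))

# Assumed (not estimated) kernel parameters.
eta_sq <- 1.0   # maximum covariance between two countries
rho_sq <- 0.5   # rate at which covariance decays with squared distance
jitter <- 1e-6  # small diagonal term for numerical stability

# Squared-exponential kernel applied element-wise to the distances.
K <- eta_sq * exp(-rho_sq * D^2) + diag(jitter, nrow(D))
dimnames(K) <- dimnames(D)

# For arbitrary (non-Euclidean) distances the result is not guaranteed to be
# positive definite, so check before treating it as a covariance matrix.
min_eig <- min(eigen(K, symmetric = TRUE, only.values = TRUE)$values)
is_valid_cov <- isSymmetric(K) && min_eig > 0
```

If `is_valid_cov` is `FALSE` for a real distance matrix, the matrix cannot be used directly as a covariance and the kernel or the distances would need rethinking.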

anon75146577 · September 12, 2020, 7:55am

There’s a pretty long history on this, starting with Besag in 1974. The best version is essentially Lindgren, Rue and Lindström in 2011, but there is a lot in between. As a rule, arbitrary weightings either don’t work (Besag suggested the SAR construction to get around this) or don’t end up representing the type of spatial decay people want. Just use a thin plate spline or a GP if distance is important.

Matias_Guzman_Naranjo · September 17, 2020, 1:26pm

I had asked a similar question here: Custom distance metric with gaussian process

Apparently it should work, but you have to write the model in Stan.

scott.claessens · September 17, 2020, 11:32pm

Thanks Matias, I thought that this was the case, but just wanted to check. I’m not well-versed in raw Stan, so I’ve resorted to feeding pre-computed covariance matrices to brms via the cov argument of gr().

And thanks everyone in this thread for a really useful discussion!
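For anyone landing here later, the workaround of passing a pre-computed covariance matrix through gr() might look roughly like the sketch below. The helper name `fit_known_cov` is hypothetical, and the sketch assumes a data frame `d` with columns `y`, `x`, and `country`, plus a covariance matrix `K` whose row names match the country labels. Note that the covariance structure is fixed in advance here, not estimated as it would be in a Gaussian Process.

```r
# Hypothetical helper: fits y ~ x with country-level intercepts whose
# covariance structure is fixed by a pre-computed matrix K.
# Assumes `d` has columns y, x, country and rownames(K) match d$country.
fit_known_cov <- function(d, K, ...) {
  # Basic sanity checks before handing anything to brms.
  stopifnot(
    isSymmetric(K),
    !is.null(rownames(K)),
    all(unique(as.character(d$country)) %in% rownames(K))
  )
  brms::brm(
    y ~ 1 + x + (1 | gr(country, cov = K)),
    data = d,
    data2 = list(K = K),  # matrices referenced in the formula go in data2
    family = gaussian(),
    ...
  )
}
```

Calling `fit <- fit_known_cov(d, K)` requires brms and a working Stan toolchain; extra arguments such as `chains` or `cores` are passed on to `brm()`.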
