I plan to run multilevel regressions on a global dataset, with random intercepts and slopes grouped by country for my main predictor x, and a Gaussian Process (also grouped by country) to account for both geographic and cultural distance between countries. In brms, this model would look like this:
model <- brm(y ~ 1 + x + gp(geoDist, culturalDist, gr = TRUE, iso = FALSE) + (1 + x | country),
data = d, family = gaussian)
It seems that when brms takes the argument gp(x), x is required to be a vector column in the dataset. However, instead of having columns in my data called geoDist and culturalDist, I have two NxN distance matrices, much like the spatial autocorrelation example in Richard McElreath's section on Gaussian Processes in Statistical Rethinking.
I don't really understand raw Stan, so I was wondering if there would be a way to incorporate these distance matrices into the Gaussian Process using brms without having to modify the Stan code itself. Do I need to somehow reduce my distance matrices to single vectors that brms can understand? Or is there another trick I'm missing?
This approach certainly works for spatial autocorrelation, since we can use the nice coordinate system provided by longitude and latitude values. But for other measures of distance (such as "cultural" distance, etc.), there isn't an obvious analogous coordinate system that I can see being used in the same way. Instead, I have an NxN distance matrix that has been calculated a priori. Can these distances be translated from the distance matrix into brms?
Hmm, so the "distance" is more conceptual and latent in nature… I would guess that you would be able to handle it in the same way, but I've never done it myself… Perhaps @Solomon could pitch in here before we ask Paul.
This thread might be of interest. @anon75146577 and I never really finished that conversation. I have a nagging feeling that one can indeed get a positive semidefinite matrix under very mild regularity conditions.
Thanks Solomon. I will have a play around with fcor(). But doesn't this function take a covariance matrix, not a distance matrix? What if I wanted to feed in raw distances?
In the past, I've just converted the distance matrix to a covariance matrix manually beforehand. But I have to make some assumptions about covariance parameters before doing this. I like Gaussian Processes because they estimate those covariance parameters directly.
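To spell out what that manual conversion involves, here is a sketch of the squared-exponential kernel used in the Statistical Rethinking spatial example, written in terms of a pre-computed distance matrix $D$. The symbols $\eta^2$, $\rho^2$ and $\sigma^2$ follow that book's notation. Fixing them in advance is the assumption mentioned above, whereas a Gaussian Process would estimate them from the data.

$$
K_{ij} = \eta^2 \exp\left(-\rho^2 D_{ij}^2\right) + \delta_{ij}\,\sigma^2
$$

Here $\eta^2$ is the maximum covariance between two countries, $\rho^2$ controls how quickly covariance decays with distance, and $\delta_{ij}\,\sigma^2$ adds extra variance on the diagonal only.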
There's a pretty long history on this starting with Besag in 74. The best version is essentially Lindgren, Rue and Lindström in 2011. But there is a lot in between. As a rule arbitrary weightings either don't work (Besag suggested the SAR construction to get around this) or they don't end up representing the type of spatial decay people want. Just use a thin plate spline or a GP if distance is important.
Thanks Matias, I thought that this was the case, but just wanted to check. I'm not well-versed in raw Stan, so I've resorted to feeding pre-computed covariance matrices to brms via the cov argument of gr().
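For anyone landing here later, this is a minimal sketch of that workaround, not a definitive implementation. The distance matrix is a simulated stand-in for a real pre-computed one, `dist_to_cov()` and `fit_with_cov()` are hypothetical helper names, and the kernel parameters `eta_sq` and `rho_sq` are fixed by assumption rather than estimated. The variable names `y`, `x` and `country` are taken from the model at the top of the thread.

```r
# Hypothetical stand-in for a pre-computed N x N distance matrix.
# Real cultural distances would be loaded here instead.
set.seed(1)
countries <- c("AUS", "BRA", "CAN", "DEU", "JPN")
coords <- matrix(rnorm(length(countries) * 2), ncol = 2)
D <- as.matrix(dist(coords))
dimnames(D) <- list(countries, countries)

# Squared-exponential kernel with FIXED (assumed) parameters.
# A small jitter on the diagonal helps numerical stability.
dist_to_cov <- function(D, eta_sq = 1, rho_sq = 1, jitter = 1e-6) {
  K <- eta_sq * exp(-rho_sq * D^2)
  diag(K) <- diag(K) + jitter
  K
}

K <- dist_to_cov(D)

# An arbitrary distance matrix is not guaranteed to give a valid
# covariance matrix, so check the smallest eigenvalue before fitting.
min_eig <- min(eigen(K, symmetric = TRUE, only.values = TRUE)$values)
stopifnot(min_eig > 0)

# Pass K to brms through gr(cov = ...) and data2.
# The row and column names of K must match the levels of `country` in d.
# Defined as a function so nothing is compiled until it is called.
fit_with_cov <- function(d, K) {
  brms::brm(
    y ~ 1 + x + (1 + x | gr(country, cov = K)),
    data = d,
    data2 = list(K = K),
    family = gaussian()
  )
}
```

Calling `fit_with_cov(d, K)` would then fit the model with country-level effects correlated according to `K`. Because the stand-in distances here are Euclidean, the eigenvalue check passes; with a real non-Euclidean distance matrix it may not, which is the positive semidefiniteness concern raised earlier in the thread.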
And thanks everyone in this thread for a really useful discussion!