Multi-membership random effects model for distance matrices

Thanks for the comments. I should have been more explicit when I said that I’m putting a CAR prior on the random effects. For each element of the genetic distance matrix I have 8 covariates plus geographic distance. The scale of the CAR prior depends on the geographic distances between points. I may be thinking about this the wrong way, but my intent is that the geographic distance term in the mean regression picks up the expected genetic distance as a function of geographic distance, while the random effects (with the CAR prior) pick up association among the residuals that arises from geographical proximity. Does that make sense?

By the way, if I ever get around to publishing analyses based on this conversation, I will include you in the acknowledgments while making it clear that you don’t necessarily endorse the approach.

How are you defining the adjacency matrix for the CAR piece?

This is what the model looks like:

$$
\begin{aligned}
y_i &\sim \mathrm{Beta}(\mu_i, \phi) \\
\mu_i &= \beta_0 + \sum_{n=1}^N \beta_n x_{ni} + \epsilon_i \\
p(\epsilon_1,\dots,\epsilon_I) &\propto \exp\left(-\frac{\tau}{2}\sum_j\sum_{k \le j} w_{jk}(\epsilon_j - \epsilon_k)^2\right) \\
w_{jk} &= c_1 + c_2\exp(-c_3 d_{jk}),
\end{aligned}
$$

where $N$ is the number of covariates, $I$ is the number of pairwise distances, and $d_{jk}$ is the geographic distance between sites $j$ and $k$. $c_1$, $c_2$, and $c_3$ are chosen for computational convenience and numerical stability (see JASA 104:142-154; 2009, specifically p. 147). The $y_i$ are genetic distances between pairs of loci and the $x_{ni}$ are covariate distances, with $i = \frac{(j-1)j}{2}+k$. $x_{1i}$ is the geographic distance between the pair of sites indexed by $i$, so $x_{1i} = d_{jk}$. These are point geographical data, not areal.
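To make the pieces concrete, here is a minimal Python sketch of the pair index, the weights, and the log of the unnormalized CAR kernel, following the formulas exactly as written above. The function names are made up for illustration, the coordinates are assumed to be planar so that Euclidean distance stands in for geographic distance, and the values of `c1`, `c2`, `c3`, and `tau` are placeholders rather than the ones actually used. The kernel function takes one effect per unit that the weights are defined over, since that is how the sum over $j$ and $k$ is written.

```python
import math


def pair_index(j, k):
    """1-based index i = (j-1)j/2 + k for a pair with k <= j, as written in the post."""
    if not (1 <= k <= j):
        raise ValueError("expected 1 <= k <= j")
    return (j - 1) * j // 2 + k


def car_weights(coords, c1, c2, c3):
    """Symmetric weights w_jk = c1 + c2 * exp(-c3 * d_jk) from planar coordinates.

    Assumption: d_jk is plain Euclidean distance; diagonal entries are left at 0
    because the (eps_j - eps_j)^2 terms vanish anyway.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            d = math.dist(coords[j], coords[k])
            w[j][k] = w[k][j] = c1 + c2 * math.exp(-c3 * d)
    return w


def car_log_kernel(eps, w, tau):
    """Log of the unnormalized prior: -(tau/2) * sum_{k<j} w_jk (eps_j - eps_k)^2."""
    total = 0.0
    for j in range(len(eps)):
        for k in range(j):
            total += w[j][k] * (eps[j] - eps[k]) ** 2
    return -0.5 * tau * total
```

Setting `c3 = 0` makes every weight equal to `c1 + c2`, which is a quick sanity check that the kernel reduces to an unweighted sum of squared differences.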

I haven’t tried combining the CAR prior with the squared difference formulation of the random effect, $\epsilon_i$, but that should work too (at least in the sense that it will compute).

Thanks again for taking the time to respond. I really appreciate the feedback.


Got it; thanks.

You mention wanting to capture spatial pattern in the residuals, but the residuals don’t have unique locations in space; they are properties of pairs of sites. The spatial structure of the residuals exists in a four-dimensional space, defined by the positions of both sites, subject to a symmetry constraint due to the fact that the ordering of the pair does not matter. That is, spatial structure in the intercepts $\alpha$ is not necessarily a good model for spatial structure in the residuals. Allowing the residuals to be spatially structured in 4D space could potentially be a good way to control for some of the non-independence, assuming that the structure of the non-independence is fundamentally spatial. I haven’t thought this through carefully though.

I still like the idea of differencing the intercepts. Two-dimensional spatial structure in the additive intercepts implies that there must be pairs of sites that, by virtue of being close together, are unusually dissimilar (these are site-pairs in spatial neighborhoods that display positive values for the intercept, which then gets added twice to the linear predictor for the pairwise dissimilarity between site-pairs in the neighborhood).
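A toy numerical sketch of that argument, under loud assumptions: the site-level intercepts `alpha` are invented numbers, sites 0 and 1 are imagined to be neighbors with similar positive intercepts, and "differencing" is taken to mean the squared difference of the two site intercepts (one possible reading of the squared difference formulation mentioned earlier, not a confirmed specification).

```python
def additive_effect(alpha, j, k):
    """Pair-level contribution when site intercepts are added: alpha_j + alpha_k."""
    return alpha[j] + alpha[k]


def differenced_effect(alpha, j, k):
    """Pair-level contribution when site intercepts are differenced (squared here)."""
    return (alpha[j] - alpha[k]) ** 2


# Hypothetical spatially smooth intercepts: sites 0 and 1 are close together
# and share a positive value; site 2 is far away with a negative value.
alpha = [1.0, 0.9, -1.0]

near_additive = additive_effect(alpha, 0, 1)        # neighbors: intercept counted twice
near_differenced = differenced_effect(alpha, 0, 1)  # neighbors: close to zero
far_differenced = differenced_effect(alpha, 0, 2)   # distant pair: large
```

With additive intercepts the neighboring pair gets a large positive contribution to its dissimilarity just because both sites sit in a positive-intercept neighborhood, whereas the differenced version gives nearby, similar sites a contribution near zero and reserves large values for pairs whose intercepts differ.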