Spatial autocorrelation: its consequence on Bayesian linear regression

Hello, I’m working with spatial data that exhibit spatial autocorrelation (patches and/or gradients) and am performing Bayesian linear regression model comparison. I am aware that within the frequentist framework, the consequence of spatial autocorrelation is to artificially reduce the p-value associated with a test (say Pearson’s correlation coefficient test), thus favouring Type I errors (false positives). What would be its consequence on Bayesian linear regression modelling, or in general within the Bayesian framework? E.g. inflating the Bayes factor?

Kind regards - SDB

There is a case study here that goes into this:
https://mc-stan.org/users/documentation/case-studies/icar_stan.html

Hi @SDBsjr,

Welcome to the Stan forum. I can’t say how it might impact Bayes factors, but here’s some general information on the topic.

Say x is a set of samples taken across some defined space. If the observations are correlated in space, the variance of the observations will tend to be inflated, and you’ll effectively have less information about the population than you would if you had the same number of uncorrelated observations (as if you had somehow partially re-sampled the same point or person). If you also have some spatially correlated samples of another variable y, those properties will similarly impact the covariance of the two variables: in repeated sampling, you see more extreme values of the sample covariance.

Say for each variable you’re able to estimate a spatial trend, the non-stationary mean around which values fluctuate. If x and y share a common spatial pattern (say, they both exhibit an east-west gradient), then the covariance will tend to be inflated, whereas if they have orthogonal patterns, the covariance will tend to be deflated. A few people out there will tell you that you should not ‘adjust’ for spatial trends, on the grounds that the common pattern is the covariation we’re interested in; others see it more like time-series data and say that the common trend is probably caused by some other factors, which logically increases the chances of ‘nonsense correlations’ appearing in the data. (I certainly favor the latter view, for what it’s worth.)

This is why you shouldn’t rely on non-spatial hypothesis tests with spatial data, and that is equally true for Bayesian and non-Bayesian models. Similarly, regression coefficient estimates will tend to have larger errors when you don’t estimate the spatial trend.
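Here is a minimal sketch of the problem with non-spatial tests, under some simplifying assumptions of mine: a 1-D AR(1) series stands in for a spatial transect, the two variables are generated independently (so every rejection is a false positive), and the test is the usual Fisher z test for a Pearson correlation. The function names are just for illustration.

```python
import math
import random


def ar1_series(n, rho, rng):
    """Simulate a stationary AR(1) series of length n (a 1-D stand-in for a transect)."""
    # Start from the stationary distribution, whose variance is 1 / (1 - rho^2).
    x = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def false_positive_rate(rho, n=100, reps=1000, seed=1):
    """Share of independent (x, y) pairs declared correlated at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        r = pearson_r(ar1_series(n, rho, rng), ar1_series(n, rho, rng))
        # Fisher z statistic, which treats the n observations as independent.
        z = math.atanh(r) * math.sqrt(n - 3)
        if abs(z) > 1.96:
            rejections += 1
    return rejections / reps


rate_independent = false_positive_rate(rho=0.0)
rate_autocorrelated = false_positive_rate(rho=0.9)
print(rate_independent, rate_autocorrelated)
```

With no autocorrelation the rejection rate should sit near the nominal 5%; with strong autocorrelation in both variables it should be far higher, even though x and y are unrelated by construction.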

Usually people focus only on SA in the outcome variable. However, when you have independent variables with strong SA, this will independently impact your model results. Because SA increases the variance of x while adding essentially duplicate information, the posterior will tend to be overconfident: its standard deviation is too small. You can see this if you simulate a bunch of data with SA: as the degree of SA in the covariate increases, the posterior standard deviation for its coefficient \beta decreases (on average) and the coverage rate (the proportion of, say, 90% quantile intervals for the posterior distribution of \beta that contain the correct value) also goes way down. Again, Bayesian and frequentist models respond identically.
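A rough sketch of that kind of simulation, with several assumptions of mine flagged: I use 1-D AR(1) series instead of a 2-D spatial process, I keep the regression errors autocorrelated at a fixed level (rho_err=0.7), and I use the OLS slope and its standard error as a stand-in for the posterior mean and standard deviation of \beta under flat priors, with a normal approximation for the 90% interval. This is not the simulation design from the paper below.

```python
import math
import random
from statistics import NormalDist


def ar1_series(n, rho, rng):
    """Simulate a stationary AR(1) series with unit-variance innovations."""
    x = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def slope_and_se(x, y):
    """OLS slope and its conventional standard error (assumes independent errors)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate(rho_x, rho_err=0.7, beta=1.0, n=100, reps=1000, seed=1):
    """Return (mean standard error, coverage of nominal 90% intervals) for beta."""
    rng = random.Random(seed)
    z90 = NormalDist().inv_cdf(0.95)  # half-width multiplier for a 90% interval
    covered, total_se = 0, 0.0
    for _ in range(reps):
        x = ar1_series(n, rho_x, rng)
        err = ar1_series(n, rho_err, rng)  # assumption: errors are autocorrelated too
        y = [beta * a + e for a, e in zip(x, err)]
        slope, se = slope_and_se(x, y)
        total_se += se
        if abs(slope - beta) <= z90 * se:
            covered += 1
    return total_se / reps, covered / reps


mean_se_low, coverage_low = simulate(rho_x=0.0)
mean_se_high, coverage_high = simulate(rho_x=0.9)
print(mean_se_low, coverage_low)
print(mean_se_high, coverage_high)
```

In this setup the reported uncertainty shrinks as rho_x goes up, while the share of intervals that actually contain the true \beta falls well below 90%.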

You can see these results in this paper on Bayesian spatial filtering, https://osf.io/fah3z/ (also in Spatial Statistics); in particular, Figures 2, 3, and 4 show what happens to regression results when you vary the degree of SA in the independent and dependent variables. Those spatial filtering models, BYM, and intrinsic autoregressive models are in the geostan R package, though it’s still being developed.

Like any other kind of regularization in frequentist estimation, it’s usually done to trade a bit of bias (compared to the MLE) for an even greater reduction in variance and hence a total reduction in estimation error.

Why do you say “artificially”? If there is spatial correlation, then this isn’t artificial in any sense. It’s like any other random effects model you might add.

We generally don’t like to use Bayes factors because they’re so sensitive to priors, even priors that have little effect on the posterior. We prefer to focus on the posterior itself.

We don’t think of the Bayesian approach as trading bias for variance in the same way. We instead think of it as partial pooling of information from neighbors. And it tends to produce posteriors that have better mean estimates and better calibrated uncertainty. Again, it’s the same reason we use any kind of hierarchical model.

The BYM model discussed in the case study that @Ara_Winter cited uses both heterogeneous random effects and spatial random effects and compares inference with and without them.

We also have GP case studies for the more general CAR case.

Quoting @Bob_Carpenter: Why do you say “artificially”? If there is spatial correlation, then this isn’t artificial in any sense. It’s like any other random effects model you might add.

I think you’re both just thinking of different things: it sounds like @SDBsjr had in mind the impact that (un-modeled) spatial autocorrelation has on test statistics, whereas @Bob_Carpenter has in mind the impact that partial pooling methods (spatial or otherwise) have on standard errors or posterior distributions.

Spatial models are argued to be unbiased estimators, which is basically what I’ve found. SA impacts variances but not means, and the same is true of estimates from spatial models (on average in repeated sampling using simulation methods).

Quoting @SDBsjr: the consequence of spatial autocorrelation is to artificially reduce the p-value associated with a test (say Pearson’s correlation coefficient test), thus favouring Type I errors (false positives).

That concern shows up in work like the article below, where the authors adjust hypothesis tests for bivariate correlations by calculating an effective sample size (SA reduces the effective sample size, hence the concern with getting false positives).

Assessing the Significance of the Correlation between Two Spatial Processes

Peter Clifford, Sylvia Richardson and Denis Hemon
Biometrics
Vol. 45, No. 1 (Mar., 1989), pp. 123-134
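
To give a feel for the effective sample size idea, here is a sketch using the classic large-sample approximation for two independent AR(1) series, n_eff = n (1 - rho_x rho_y) / (1 + rho_x rho_y). To be clear, this is not the Clifford, Richardson and Hemon estimator, which works from the estimated spatial covariance structure; it is a simpler 1-D analogue, and I assume the autocorrelation parameters are known rather than estimated.

```python
import math
import random


def ar1_series(n, rho, rng):
    """Simulate a stationary AR(1) series with unit-variance innovations."""
    x = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def effective_sample_size(n, rho_x, rho_y):
    """Large-sample AR(1) approximation; assumes rho_x and rho_y are known."""
    return n * (1.0 - rho_x * rho_y) / (1.0 + rho_x * rho_y)


def rejection_rates(rho, n=200, reps=1000, seed=1):
    """False positive rates of the Fisher z test using n versus n_eff."""
    rng = random.Random(seed)
    n_eff = effective_sample_size(n, rho, rho)
    naive = adjusted = 0
    for _ in range(reps):
        # x and y are independent, so any rejection is a false positive.
        z = math.atanh(pearson_r(ar1_series(n, rho, rng), ar1_series(n, rho, rng)))
        if abs(z) * math.sqrt(n - 3) > 1.96:
            naive += 1
        if abs(z) * math.sqrt(n_eff - 3) > 1.96:
            adjusted += 1
    return naive / reps, adjusted / reps


naive_rate, adjusted_rate = rejection_rates(rho=0.8)
print(effective_sample_size(200, 0.8, 0.8), naive_rate, adjusted_rate)
```

In this toy setting, 200 strongly autocorrelated observations carry roughly as much information about the correlation as about 44 independent ones, and using that smaller number in the test should bring the false positive rate back near the nominal level.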

@Bob_Carpenter, this new book by Haining and Li likewise conceptualizes these models in terms of partial pooling. I think it’s a good introduction to the topic for people who are looking for one.

I was more trying to point out that you don’t need to think about the effect indirectly through p-values, you can look directly at the effect on estimators.

Thanks for the reference. I wish these books weren’t all so expensive!

Yeah, expensive. I get access to the PDF by signing into my university library before visiting the site; [ed. removed last clause].

Just wanted to let you know I redacted the last sentence of your post. We don’t want to encourage violating IP licenses on these forums.