Correlation effects not produced with CAR structure specified

  • Operating System: iOS 10.13.5
  • brms Version: 2.8.0

I’ve got a data set (nrow = 4887) with five scaled continuous variables, one binomial response variable (0/1 ratio of 1:15 and weighted as such with data$res_weights), and a single random effect with seven levels (IDs of each tracked animal). I’m trying to fit a mixed-effect logistic regression model with brms, with each animal as a random effect. Each row is a point in space that’s roughly 5 km from each other, so spatial autocorrelation is an issue. To account for this, I attempted to include a conditional autoregressive structure with the cor_car function in the autocor argument, albeit the first time I did it, I had done it incorrectly.

dist.mat <- as.matrix(dist(cbind(data$x_coords, data$y_coords))) 
#A distance matrix, as opposed to an adjacency matrix

rownames(dist.mat) <- data$ID
test_prior <- get_prior(y | weights(res_weights) ~ var1 + var2 + var3 + var4 + var5 + (1|ID), 
                        family = bernoulli(link = 'logit'), data = data)
#I set the family to 'bernoulli' as initially running it as 'binomial' activated a 
#prompt that bernoulli was more appropriate.

test_brms.auto <- brm(y | weights(res_weights) ~  var1 + var2 + var3 + var4 + var5 + (1|ID),
                      family = bernoulli(link = 'logit'), data = data, prior = test_prior, 
                      autocor = cor_car(dist.mat, formula = ~ 1|ID), cores = 4,
                      control = list(adapt_delta = 0.99, max_treedepth = 100))

Converting all non-zero values in 'W' to 1 
#The first red flat that I didn't notice!

Funnily enough, when I ran that, the effect of the correlation was provided in the output.

When I did it the second round, I made sure to do the adjacency matrix correctly:

dist.mat <- as.matrix(dist(cbind(data$x_coords, data$y_coords))) 
W <- matrix(0, nrow = nrow(dist.mat), ncol = ncol(dist.mat))
W[W <= 5000] <- 1 
#Any distances less than or equal to 5000 m is considered 'adjacent'

rownames(W) <- data$ID
test_prior <- get_prior(y | weights(res_weights) ~ var1 + var2 + var3 + var4 + var5 + (1|ID), 
                        family = bernoulli(link = 'logit'), data = data
test_brms.auto <- brm(y | weights(res_weights) ~  var1 + var2 + var3 + var4 + var5 + (1|ID),
                      family = bernoulli(link = 'logit'), data = data, prior = test_prior, 
                      autocor = cor_car(W, formula = ~ 1|ID, type = 'icar'), cores = 4,
                      control = list(adapt_delta = 0.99, max_treedepth = 100))

#The 'type = ' argument in the 'cor_car()' function was switched to 'icar' 
#because not all points had an adjacent neighbor (i.e. there were some 0s)
#as prompted by brms. I thought was odd because I wasn't prompted to change
#this with the example from ?cor_car.

Once the model ran, though, I lost the correlation effect in the summary output. I compared this to a control model I ran without any autoregressive structures and the estimates between the two models were exactly the same, all the way down to the correlogram I made for the both of them. Running the code test_brms.auto$autocor returns car(W, ~1|ID, ‘icar’), so I’m not sure why the correlation effects are not included.

So my question is, what’s happening? Does this mean the model was treated as if it didn’t have any autoregressive structure? What was it about the first time I did it (using dist.mat where everything was a 1) that made the correlation effects section appear, as opposed to the second time I ran it (W where only some cells were 1)? Does it have something to do with my adjacency matrix, or the fact I used icar instead? There wasn’t this issue when I ran the example from the cor_car documentation (nor did that example prompt me to use ‘icar’ instead), so surely it must be something about my data or the model syntax I used.

Here’s a summary of the relevant data set, if that might be of any help:

summary(data)
   coords.x1          coords.x2              ID       y              var1.V1      
 Min.   : 7442747   Min.   :-1274976   animal1: 646   1: 329   Min.   :-2.038741  
 1st Qu.: 8356859   1st Qu.:  173477   animal2: 741   2:4558   1st Qu.:-0.779301  
 Median : 8750482   Median :  456784   animal3:1169            Median :-0.209312  
 Mean   : 8972824   Mean   :  383244   animal4: 437            Mean   : 0.000000  
 3rd Qu.: 9640674   3rd Qu.:  654646   animal5: 552            3rd Qu.: 0.758677  
 Max.   :10833105   Max.   : 1253649   animal6: 486            Max.   : 3.307332  
                                       animal7: 856   
                        
       var2.V1             var3.V1              var4.V1             var5.V1      
 Min.   :-1.224376   Min.   :-1.9800283   Min.   :-0.156237   Min.   :-3.274098  
 1st Qu.:-0.727544   1st Qu.:-0.7907740   1st Qu.:-0.111117   1st Qu.:-0.658041  
 Median :-0.269009   Median :-0.0735681   Median :-0.094682   Median :-0.016927  
 Mean   : 0.000000   Mean   : 0.0000000   Mean   : 0.000000   Mean   : 0.000000  
 3rd Qu.: 0.386599   3rd Qu.: 0.7250871   3rd Qu.:-0.060488   3rd Qu.: 0.642606  
 Max.   : 4.092347   Max.   : 2.6985385   Max.   :23.168148   Max.   : 2.608927  

Update
I ran four models: A GLMM (glmer), a brms model without accounting for spatial autocorrelation, another brms model with 5 km adjacency, and another brms model with 250 km adjacency and plotted the predicted probabilities against the scaled variables:

Not much changes between the models except for the distribution of the observed predicted variables (points, though not very obvious) and the size of the confidence (GLMM, bottom)/credible (BRMS, top three) intervals.

The sizes for each of the models are also very different:

  • GLMM: 0.3 mb
  • BRMS (spatial autocorrelation): 1.7 mb
  • BRMS (5 km adjacency): 2.5 mb
  • BRMS (250 km adjacency: 49 mb

So something is definitely happening, I just don’t know what. Any ideas or thoughts?

Thanks,

  • R version:

    platform       x86_64-apple-darwin15.6.0   
    arch           x86_64                      
    os             darwin15.6.0                
    system         x86_64, darwin15.6.0        
    status                                      
    major          3                           
    minor          5.2                         
    year           2018                        
    month          12                            
    day            20                          
    svn rev        75870                       
    language       R                           
    version.string R version 3.5.2 (2018-12-20)
    nickname       Eggshell Igloo

Hey, I am not sure if I am understanding everything correctly, but from what I see you are using the icar strucure, which by definition as no estimated correlation parameter (but still accounts for the autocorrelation). See Mitzi’s blog post about this: https://mc-stan.org/users/documentation/case-studies/icar_stan.html

Hi Paul. You understood it correctly, and that explains what I was trying to figure out, thanks. Sorry for the verbose post as well, I’ve only started thinking about temporal autocorrelation and getting into Bayesian statistics since last week so this is all very new ground for me!

Sorry for jumping on to this question a couple of years after it was originally posted. I am new(ish) to Bayesian statistics and to modelling spatial data. I am trying to figure out what the estimate of sdcar actually means. Is this equivalent to the random effect estimate of, for example (1|ID) once accounting for the spatial autocorrelation?