brms fails with SAR structure at intermediate/large sample sizes

Hello and apologies if this isn’t the right place for this!

I’m using brms to deal with autocorrelation (both spatial and temporal) in an analysis of remote sensing data. The preliminary models with limited subsets of the data (n = ~200) work great, but I’m getting an error message at larger sample sizes (roughly n > 1000, though the threshold isn’t consistent):

Compiling Stan program...
Start sampling
Error in mod$fit_ptr() : 
  Exception: variable does not exist; processing stage=data initialization; variable name=eigenMsar; base type=vector_d  (in 'model750d2c580e_5c0c99c82ea5eda165ec4d5a5340d18e' at line 61)

Here’s the model formula and call to brm:

f0 <- brms::brmsformula(
  respons ~ pnt_prd + tm_snc + SGU + z_bins_fac2 + fSCA + prcp + tmin + tmax +
    sar(M = w0, type = 'error') +
    ar(time = year, gr = pixel, p = 1),
  family = gaussian()
)

brms::brm(
  formula = f0, data = cb_df, data2 = list(w0 = w0),
  prior = NULL, iter = 2000, cores = 4, chains = 4,
  silent = 0
)

It works fine without the SAR term. I’ve tried tracking down the error but, at least on the R end, eigenMsar is always present in the data passed to Stan when I look for it with the debugger.
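One more thing that may be worth inspecting is the type of the eigenvalues themselves. The sketch below assumes (this is not confirmed anywhere in this thread) that eigenMsar is built from something like eigen(M)$values. Base R's eigen() returns a complex vector for some asymmetric matrices, whereas the error message refers to a real vector (vector_d). The helper check_sar_eigen and the toy matrix are hypothetical and only illustrate the check; this is a possible diagnostic, not a confirmed cause.

```r
# Hypothetical helper (not part of brms): summarise the eigenvalues of a
# square weights matrix M as computed by base R's eigen().
check_sar_eigen <- function(M) {
  stopifnot(is.matrix(M), nrow(M) == ncol(M))
  ev <- eigen(M, only.values = TRUE)$values
  list(
    symmetric  = isSymmetric(M),      # is the matrix itself symmetric?
    complex    = is.complex(ev),      # did eigen() return complex values?
    max_abs_im = max(abs(Im(ev)))     # size of the largest imaginary part
  )
}

# Toy asymmetric matrix whose rows sum to 1 (made-up values, not the real w0).
W_toy <- matrix(c(0, 1, 0,
                  0, 0, 1,
                  1, 0, 0), nrow = 3, byrow = TRUE)
res_asym <- check_sar_eigen(W_toy)

# Symmetrised version of the same toy matrix.
res_sym <- check_sar_eigen((W_toy + t(W_toy)) / 2)

str(res_asym)
str(res_sym)

# With the real weights one could try (assumes spdep is installed):
# check_sar_eigen(spdep::listw2mat(w0))
```

If the real weights matrix gave complex = TRUE, that would be a useful detail to include in a bug report.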

Here’s the head and structure of the data:

   pnt_prd tm_snc          SGU z_bins_fac2       fSCA     prcp       tmin     tmax year   pixel        x       y
1 46.84986      0 SandyBottoms   Very High 0.00128491 22.21589  2.9624949 18.21315 1999 1434291 -1323303 1824192
2 36.27044      0 SandyBottoms   Very High 0.00000000 22.49830  2.8694968 18.11171 1999 1258425 -1322283 1827192
3 49.37712      0 SandyBottoms   Very High 0.00000000 27.26958  2.4275658 17.06963 1999 1260171 -1322673 1827162
4 30.97303      0 SandyBottoms   Very High 0.00000000 22.61864  2.8091314 18.05812 1999 1258436 -1321953 1827192
5 44.32429      0 SandyBottoms   Very High 0.00000000 22.46512  2.8859658 18.12633 1999 1258422 -1322373 1827192
6 35.21775      0 SandyBottoms   Very High 0.00000000 40.61501 -0.6463861 13.21699 1999 1261940 -1322373 1827132

'data.frame':	14806 obs. of  33 variables:
 $ respons    : num  45.7 37.5 49.6 29.3 44 ...
 $ pnt_prd    : num  46.8 36.3 49.4 31 44.3 ...
 $ year       : int  1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
 $ pixel      : int  1434291 1258425 1260171 1258436 1258422 1261940 1261939 1251401 1263696 1427252 ...
 $ fSCA       : num  0.00128 0 0 0 0 ...
 $ tm_snc     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ SGU        : chr  "SandyBottoms" "SandyBottoms" "SandyBottoms" "SandyBottoms" ...
 $ cst_dst    : num  715 849 977 762 857 ...
 $ x          : num  -1323303 -1322283 -1322673 -1321953 -1322373 ...
 $ y          : num  1824192 1827192 1827162 1827192 1827192 ...
 $ prcp       : num  22.2 22.5 27.3 22.6 22.5 ...
 $ tmax       : num  18.2 18.1 17.1 18.1 18.1 ...
 $ tmin       : num  2.96 2.87 2.43 2.81 2.89 ...
 $ z_bins_fac2: Factor w/ 4 levels "High","Low","Very High",..: 3 3 3 3 3 3 3 3 3 3 ...

And the code to generate the spatial weights, for context:

s0 <- sp::SpatialPoints(cb_df[, c('x', 'y')], proj4string = raster::crs("+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs"))
d0 <- spdep::dnearneigh(s0, d1 = 0, d2 = 1000)
w0 <- spdep::nb2listwdist(d0, s0, 'idw', 'W', 1)
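For readers unfamiliar with spdep, here is a rough base-R sketch of what that weights code is meant to compute: neighbours within 1000 m, inverse-distance weights with power 1, then row-standardisation. The coordinates are made up, and the handling of edge cases (zero distances, points without neighbours) is an assumption of this sketch; spdep's own behaviour for isolated points depends on its zero.policy setting.

```r
# Toy projected coordinates in metres (made-up values, not from cb_df).
xy <- cbind(x = c(0, 300, 600, 5000),
            y = c(0, 400,   0,    0))

d  <- unname(as.matrix(dist(xy)))   # pairwise Euclidean distances
nb <- d > 0 & d <= 1000             # neighbours within 1000 m, excluding self
w  <- ifelse(nb, 1 / d, 0)          # inverse-distance weights (power 1)
rs <- rowSums(w)
W  <- w / ifelse(rs > 0, rs, 1)     # row-standardise; isolated points keep all-zero rows

print(round(W, 3))
print(rowSums(W))
print(isSymmetric(W))
```

Note that the neighbour relation is symmetric here, but the row-standardised weights generally are not, because each row is divided by its own sum.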

Attached package and R information:

Attached packages:
  
 spdep_1.1-11 sf_1.0-3     spData_2.0.1 brms_2.16.1  Rcpp_1.0.7   raster_3.5-2 sp_1.4-5
  
  R version 4.1.1 (2021-08-10)
  Platform: x86_64-pc-linux-gnu (64-bit)
  Running under: Ubuntu 21.04

Using Linux kernel 5.11.

A sample of the data, in package data format, is in the data directory of the GitHub repo bmcnellis/testBRMS (a debugging repo for brms).

Any help would be much appreciated!

Hi,
that might indeed be a bug in brms. It would be best if you could put together a script and data that reproduce the error at least semi-reliably and file an issue on the brms GitHub issue tracker (paul-buerkner/brms), using the reprex package if possible. The error might be driven by some odd state of your R session, so it is important to test whether a fresh session without any clutter reproduces the issue.

Thanks!

Hi, I also receive an error similar to OP.

Error in mod$fit_ptr() :
  Exception: variable does not exist; processing stage=data initialization; variable name=eigenMsar; base type=vector_d (in 'modela050688ce1b_5f2910ee5f2f3371bcc73afd2220456b' at line 56)

If anyone has gotten to the bottom of this, it would be great to know how to deal with it.

Below is the code to run the model (which runs fine when sar() is not called):

fit1 <- brm(avian_FD_forest_100m ~ SES_FD.veg.UND + sar(wm_d4, type = "lag"), data = results_forest, data2 = list(wm_d4 = wm_d4), chains = 1, cores = 1)

Which version of brms are you using? Does it still occur in the latest release version, that is, brms 2.17?
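A quick way to check from the R console is sketched below. It assumes that the 2.17 release carries the version number 2.17.0, and it only reports what is installed in the current library.

```r
# Report the installed brms version and whether it is at least 2.17.0.
brms_installed <- requireNamespace("brms", quietly = TRUE)
if (brms_installed) {
  v <- utils::packageVersion("brms")
  message("brms version: ", as.character(v))
  message("At least 2.17.0: ", v >= "2.17.0")
} else {
  message("brms is not installed in this library")
}
```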