Bym and besag models in brms

Paul_Lantos · August 31, 2019, 8:25pm

That’s great news, thanks for working on this. I just gave it a try with a simple model and my R session aborts - any ideas?

anon75146577 · September 1, 2019, 4:32am

Do you have any more information?

Paul_Lantos · September 1, 2019, 11:25am

There isn’t really any more information … It starts compiling the model normally, then the R session aborted thing pops up a minute later.

anon75146577 · September 1, 2019, 4:54pm

What’s the model. What’s the size. What’s the version of R and brms. What’s the operating system. How much memory do you have.

Most importantly can you send a full example where it breaks so we can look at it.

We can’t do much without that.

Paul_Lantos · September 1, 2019, 5:37pm

Gotcha - R 3.6.0, brms 2.10.0, running on Windows with 16 GB RAM (and I’ve been able to handle much more complicated models than this on this system).

This is as far as the model goes before R aborts. It’s a zero inflated dataset so this example is using family=“negbinomial”, but it happens with Poisson as well. The model I actually wish to use this for is a much more complicated zero inflated neg binomial model with a number of predictors and a random intercept term - I’ve done this successfully using a GAM of the polygon centroids, but I’d like to compare that to the CAR model.

The script and the data are attached.

(Attachment data.shx is missing)

(Attachment data.prj is missing)

(Attachment data.dbf is missing)

(Attachment data.shp is missing)

bym_brms.R (439 Bytes)

Paul_Lantos · September 1, 2019, 5:51pm

Well, attachment was rejected (I’d attached a shapefile) but you can see the syntax in my R file.

Paul_Lantos · September 2, 2019, 4:33am

I tried to construct my adjacency matrix differently, yet still get R aborted session errors. My data are in a shapefile, which I can’t load to Stan, but here is the model syntax, just a simple intercept only model.

data <- st_read( “data.shp” )

centroids_sf <- st_centroid(st_geometry(data), of_largest_polygon=TRUE)
coords_sf <- st_coordinates(centroids_sf)
neigh <- knn2nb(knearneigh(coords_sf, k=6))
w1 <- nb2mat(make.sym.nb(neigh), style=“B”)
rownames(w1) <- data$bg

model1 <- brm( bf( y ~ 1 ),
family=“negbinomial” ,
data=data ,
autocor = cor_car( W=w1 , ~1 | bg , type=“bym2” ) ,
prior=prior( normal( 0 , 1 ) , class=Intercept ) ,
chains=4 )

Paul_Lantos · September 2, 2019, 1:31pm

Well I’ve made a progress - I simulated a Poisson dataset and the model ran fine. My primary data are very zero-inflated, and I suspect R just crashed without an informative error when trying to fit the CAR model with it.

Since then I’ve constructed the adjacency matrix as above, and it now seems to run correctly:

model2 <- brm( bf( cases~ 1 ,
zi ~ 1 ),
autocor = cor_car( W=w1 , ~1 | bg , type=“bym2” ) ,
prior=prior( normal( 0 , 1 ) , class=Intercept ) ,
data=data,
family = “zero_inflated_negbinomial” ,
cores=4)

I do get this error that I’m hoping to correct just by increasing iterations, treedepth, and adapt delta:

Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
[http://mc-stan.org/misc/warnings.html#bulk-ess](http://mc-stan.org/misc/warnings.html#bulk-ess)

Paul_Lantos · September 14, 2019, 3:10pm

Thanks for the help, and for implementing these structures.

I have a question - I’m doing a multi-level model where I have individual observations aggregated by spatial units. I don’t understand how the model knows where the individuals are within the adjacency matrix, though:

model1 <- brm( cmvresult ~ race + z_mage + infantsex + z_sdi ,
data = pts ,
family = “Bernoulli” ,
prior=c( prior( normal( -5.5 , 0.5 ) , class=Intercept ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = infantsexMale ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = raceAsian ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = raceHispanic ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = raceMultiple ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = raceNativeAmerican ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = z_mage ) ,
prior( normal( 0 , 0.5 ) , class=b , coef = z_sdi ) ) ,
autocor = cor_car( W=w_zips , ~1|zipcode , type=“bym2” ) ,
chains = 4 ,
iter = 2000 ,
cores = 4 )

All the fixed effects are at the individual level - there are 96,000 individuals, aggregated into 701 zip code polygons.

The zip code adjacencies are specified in the autocor function, but how does the model know where the individuals are? Is it that the ~1|zipcode is directed to the individual data?

Thanks

paul.buerkner · September 16, 2019, 7:39pm

I am not sure I fully understand the question I am afraid. Since each individual only occurs in a specific zipcode the nesting is implictely part of the model structure as for any other multilevel model.

Paul_Lantos · September 16, 2019, 10:12pm

Let me rephrase - the zip code adjacency structure is present in the matrix, but the matrix does not have names of the zips. So when I put ~1|zipcode in the autocor formula, that grouping identifier exists only in the individual level data but not in the adjacency matrix. So how does the model connect the two?

paul.buerkner · September 17, 2019, 6:26am

But the matrix is of the same size as the number of zipcodes, right?

brms checks if the (row) names I believe match, but if you see a case where
an unname W gets through with a grouping factor, please provide a reproducible
example, because this should not happen.

Paul_Lantos · September 17, 2019, 8:51am

Yes, the matrix is 701 x 701, which is the number of zipcodes.

Now, the matrix was made from a shapefile of just zipcodes.

When I put in the ~1|zipcode term in the autocor specification, I assume that looks for the zipcode column in my data source (which is 96,000 observations), right? So that would group all the observations by zipcode, but this doesn’t really correspond to the row numbers in the shapefile of the zipcodes.

paul.buerkner · September 17, 2019, 8:58am

Yes, this is how the zipcode variable is looked up.

At the same time the (row) names ot the W matrix are mapped to the unique values of zipcode
so that each of the 96k observations gets its right zipcode parameter. In addition, the zipcode parameters are spatially correlated according to W.

Paul_Lantos · September 17, 2019, 10:09am

Great, thanks - that makes sense, thanks.

Paul

Topic		Replies	Views
Bym2 General rstan , fitting-issues , cmdstanr , hierarchical-model , brms	26	1418	February 14, 2023
BYM2 with unstructured country random effect Modeling	39	2500	February 11, 2022
Hierarchical CAR model in brms Modeling	5	1502	June 21, 2021
Autoregression on non-linear model parameters in brms brms	16	1551	September 20, 2018
Incorporating Spatial Weights Matrix brms	10	2292	April 24, 2020

Bym and besag models in brms

Related topics