Error when fitting a CAR model in brms

I was playing around with fitting CAR models, following the example in ?car. I fit the model that is in that example:

fit <- brm(y | trials(size) ~ x1 + x2 + car(W), 
           data = dat, data2 = list(W = W),
           family = binomial(),
           iter = 5000) 

The model fits, but I get a warning message:

Using CAR terms without a grouping factor is deprecated. Please use argument 'gr' even if each observation represents its own location.

Seems like an easy fix, but I can’t figure out how to correctly specify the gr argument in car. For example, I tried the following:

rownames(W) <- 1:100
fitg <- brm(y | trials(size) ~ x1 + x2 + car(W, gr = factor(1:100)), 
           data = dat, data2 = list(W=W),
           family = binomial(),
           iter = 5000) 

And I get the following error message:

Error: The following variables are missing in 'data': 'W'

Every variation I try results in similar issues where the function can no longer find different aspects of the data. What am I doing wrong?

Never mind, I figured out what was wrong. The gr variable has to be a column in the data so that it can be mapped to the adjacency matrix.
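For anyone landing here with the same error, this is a minimal sketch of that fix. It rebuilds a 10 x 10 grid adjacency matrix in the style of the ?car example; the response and covariate values are placeholders invented only to make the sketch self-contained, and the run_fit switch is a hypothetical convenience so that the data preparation can be checked without compiling a Stan model.

```r
# Adjacency matrix for a 10 x 10 grid, built as in the ?car example
Grid <- expand.grid(east = 1:10, north = 1:10)
K <- nrow(Grid)
W <- array(0, c(K, K))
W[as.matrix(dist(Grid)) == 1] <- 1

# Placeholder data (invented values, for illustration only)
set.seed(1)
dat <- data.frame(
  y    = rbinom(K, size = 50, prob = 0.5),
  size = 50,
  x1   = rnorm(K),
  x2   = rnorm(K)
)

# The grouping factor is a column of the data ...
dat$grp <- factor(1:K)
# ... and its levels are matched against the row names of W
rownames(W) <- levels(dat$grp)
stopifnot(all(levels(dat$grp) %in% rownames(W)))

# Refer to the column by name inside car(); set run_fit to TRUE to fit
run_fit <- FALSE  # hypothetical switch, not part of brms
if (run_fit && requireNamespace("brms", quietly = TRUE)) {
  library(brms)
  fitg <- brm(y | trials(size) ~ x1 + x2 + car(W, gr = grp),
              data = dat, data2 = list(W = W),
              family = binomial(),
              iter = 5000)
}
```

The difference from the failing call is that gr = grp names a variable in data, whereas gr = factor(1:100) is an expression that is not a column of the data.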

On a separate note, I just fit a CAR model that had one cell (grid location) for which there was no data, and I wanted to see if there was a way to make predictions for the “missing” cell (using posterior_epred). I would have thought that if I provided the location information for the missing cell (in the original adjacency matrix that was used to fit the model), and set allow_new_levels to TRUE, that this would be enough to estimate these cells. However, I got an error message:

Error: Cannot handle new locations in CAR models.

Is this a feature that can be added or is there a conceptual reason why this is not possible that I am missing?

Here is the code that I used to fit the model (based on the example under ?car):

idx <- 25
rownames(W) <- 1:K
dat_complete <- mutate(dat, Var1 = Grid$Var1, Var2 = Grid$Var2, grp = factor(1:K))
dat_miss <- dat_complete[-idx,]

fit_miss <- brm(y | trials(size) ~ x1 + x2 + car(W, gr = grp), 
           data = dat_miss, data2 = list(W = W),
           family = binomial(),
           iter = 5000) 

And here is the code to try to get the predictions, which throws the error:

est = posterior_epred(fit_miss, newdata = dat_complete,
                               allow_new_levels = TRUE)

I am using brms version 2.13.3.


I believe there was some discussion about this issue in https://github.com/paul-buerkner/brms/issues/637

Thanks, that’s helpful. I’ll check out the paper linked in that thread.

As a workaround, could one use the brms weight argument to add one made-up datapoint (i.e., population mean) per new location, and then apply zero weights to said datapoints? This could be very wrong, but it feels so right.

Right, that is actually a good idea. One could add a data point that has a missing value in its response and then handle it via the mi() addition term.

Using the example above, we can set the value of the outcome at index 25 to be NA.

dat_miss2 <- dat_complete
dat_miss2$y[25] <- NA

Can you clarify how to specify the model? I’m new to the mi syntax and the only examples I see are for missing values in the predictor. I attempted the following but it didn’t work:

fit_miss2 <- brm(mi(y) | trials(size) ~ x1 + x2 + car(W, gr = grp), 
                 data = dat_miss2, data2 = list(W = W),
                 family = binomial(),
                 iter = 5000)

OK, it would have worked for continuous outcomes but not for discrete ones, so it won’t work in your case. The weight = 0 approach should still work, though, but don’t use NA in this case, to avoid having this observation removed.

That would be like this, right?

dat_miss2 <- dat_complete %>%
  mutate(wt = if_else(grp == "25", 0, 1))

fit_miss2 <- brm(y | trials(size) + weights(wt) ~ x1 + x2 + car(W, gr = grp), 
                 data = dat_miss2, data2 = list(W = W),
                 family = binomial(),
                 iter = 5000) 

I get a bunch of uninterpretable compilation errors:

Compiling Stan program...
running command 'C:/PROGRA~1/MICROS~3/ROPEN~1/R-35~1.1/bin/x64/R CMD SHLIB file1f8c71486e.cpp 2> file1f8c71486e.cpp.err.txt' had status 1
Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! g++.exe: error: Files/Microsoft/R: No such file or directory

For what it’s worth, I just ran the following (based on your code) and it worked OK, but I’m still using brms 2.12.0 (hence “fitted” at the end).

library(brms)
library(dplyr)

## Not run: 
# generate some spatial data
east <- north <- 1:10
Grid <- expand.grid(east, north)
K <- nrow(Grid)

# set up distance and neighbourhood matrices
distance <- as.matrix(dist(Grid))
W <- array(0, c(K, K))
W[distance == 1] <- 1 	

# generate the covariates and response data
x1 <- rnorm(K)
x2 <- rnorm(K)
theta <- rnorm(K, sd = 0.05)
phi <- rmulti_normal(
  1, mu = rep(0, K), Sigma = 0.4 * exp(-0.1 * distance)
)
eta <- x1 + x2 + phi
prob <- exp(eta) / (1 + exp(eta))
size <- rep(50, K)
y <- rbinom(n = K, size = size, prob = prob)
dat <- data.frame(y, size, x1, x2)


idx <- 25
rownames(W) <- 1:K

dat_complete <- mutate(dat, Var1 = Grid$Var1, Var2 = Grid$Var2, grp = factor(1:K))
dat_miss <- dat_complete[-idx,]

dat_miss2 <- dat_complete %>%
  mutate(wt = if_else(grp == "25", 0, 1))

fit_miss <- brm(y | trials(size) +  weights(wt) ~ x1 + x2 + car(W, gr = grp), 
                data = dat_miss2, data2 = list(W = W),
                family = binomial(),
                iter = 5000) 

est <- fitted(fit_miss, 
              newdata = dat_complete,
              allow_new_levels = TRUE,
              summary = TRUE)

I just got it to work also; not sure what was wrong before. Thanks to both of you for your help!
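A small base R sketch of why the zero-weight workaround should leave the fit unaffected by the placeholder response. It assumes that weights() multiplies each observation's log-likelihood contribution, which is my understanding of how brms handles weights; all numbers and the weighted_loglik helper are hypothetical and exist only for illustration.

```r
# Hypothetical values for three grid cells; the third is the cell without data
size <- c(50, 50, 50)
prob <- c(0.30, 0.60, 0.45)  # success probabilities under some parameter draw
wt   <- c(1, 1, 0)           # zero weight for the cell without data

# Weighted binomial log-likelihood: each cell contributes wt * log p(y)
weighted_loglik <- function(y) {
  sum(wt * dbinom(y, size = size, prob = prob, log = TRUE))
}

# Two different placeholder responses for the zero-weight cell
ll_a <- weighted_loglik(c(12, 31, 0))
ll_b <- weighted_loglik(c(12, 31, 50))

# The placeholder value drops out of the weighted log-likelihood
all.equal(ll_a, ll_b)
```

Under that assumption, the placeholder only needs to be a valid count between 0 and size so that the row is kept; the cell still has a level in grp and a row in W, so the CAR term includes it.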

Hi @franzsf, @jgellar and @paul.buerkner. Thank you for your examples. There is something that is not clear to me. I understand that the weights argument and the “wt” variable allow me to fit the model without considering that specific observation, but if that observation has an NA as its response, the fitted function results in the same error:

Error: Cannot handle new locations in CAR models.

If I use any value instead of NA, I will still be restricted by the size argument, right?

Even without these problems, do you think that this method can be used to run a K-fold cross-validation like in this vignette?

Do you know how to do it using brms?

Thank you,

The error can be avoided by allowing missing values in the responses (if continuous). See the example in this paper: Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models

Then, at least loo should be programmatically possible. K-fold might complain (for reasons to do with the code, not the concept), but I am not entirely sure right now.

Thank you Paul. I already tried that but I get this:

Error: Argument 'mi' is not supported for family 'multinomial(logit)'.

I will try to change the model in a way that a Poisson likelihood can be used.

Also, Poisson won’t work, unfortunately. Only continuous likelihoods are compatible with mi at this point.

Sorry, I didn’t pay attention to that part. Thank you for your response.