Hi everyone,
This is my first time posting in the forums, I apologize in advance if I made any formatting mistakes. I have my data and models stored in an .Rdata file, but it is 616mb, so I can’t upload it. If anyone tells me where I should upload it, I will do it. Here’s my code brms_forum.R (2.2 KB). EDIT: I am using R 4.1.0, brms 2.17.0, cmdstanr 0.4.0.9000
I am modeling participation in a citizen science project that studies disease vectors based on a set of covariates at the census tract level: average household income, average presence of the species (mosquito, predominant type of land cover (residential, green/leisure/other) , population density, weekend (dummy), month (dummy) and year (dummy).
In my analysis, I compare two methods of accounting for the spatial autocorrelation of my outcome variable (presence or absence of a participation–logit):
- Multilevel modeling with random intercepts for each census tract.
prior <- get_prior(presence ~ poly(inc,2) + poly(bg_count, 2) + landcover + poly(popd, 2) + weekend + month + year + (1 | CUSEC), data = D, family = bernoulli(link = "logit"), chains = n_chains, cores = n_chains, backend = "cmdstanr", threads = n_threads, iter = 8000)
prior$prior[1] <- "normal(0,5)"
m4 <- brm(presence ~ poly(inc,2) + poly(bg_count, 2) + landcover + poly(popd, 2) + weekend + month + year + (1 | CUSEC), data = D, family = bernoulli(link = "logit"), chains = n_chains, cores = n_chains, backend = "cmdstanr", threads = n_threads, iter = 8000)
- ICAR models with queen autocorrelation matrix among census tracts:
prior <- get_prior(presence ~ poly(inc, 2) + poly(bg_count, 2) + landcover + poly(popd, 2) + weekend + month + year + (1 | CUSEC) + car(queens, gr=CUSEC, type="icar"), data = D, data2 = list(queens = queens), family = bernoulli(link = "logit"), chains=n_chains, cores=n_chains, backend = "cmdstanr", threads = n_threads, iter = 8000)
prior$prior[1] <- "normal(0,5)"
car2 = brm(presence ~ poly(inc, 2) + poly(bg_count, 2) + landcover + poly(popd, 2) + weekend + month + year + (1 | CUSEC) + car(queens, gr=CUSEC, type="icar"), data = D, data2 = list(queens = queens), family = bernoulli(link = "logit"), chains=n_chains, cores=n_chains, backend = "cmdstanr", threads = n_threads, iter = 8000)
In my MLM model (m4), the effect of my main independent variable of interest (inc–average household income of the census tract), is negative (-10.80, std. 9.53) although 95% credible interval goes through zero.
However, when I calculate the predicted probability of my outcome variable based on this variable:
plot(conditional_effects(m4))
The effect has an inverted-U shape.
Note that, for the rest of the variables, the conditional effects go in the same direction as the estimates provided by the summary function. Is this an error?
The second question is about my ICAR model.
How should I interpret the correlation structure coefficient? I was reading it as a similar concept to intraclass correlation, where the estimate tells you whether there are differences across your spatial units, but I am not entirely sure.
Thank you for your time,
Álvaro.