[I posted this question already on Stack Overflow, but was advised I might get more response if posted here]
I am learning Bayesian regression modelling using the brms package in R.
I am modelling rates of infection per head of population at the neighbourhood level, and investigating associations with other neighbourhood-level covariates such as poverty level and distance to a health centre.
For example:
library(tidyverse)
library(brms)
set.seed(1234)
df <- tibble(neighbourhood = seq(1:200),
cases = rpois(n = 200, lambda = 3),
population = round(runif(n = 200, min = 100, max = 10000)),
poverty = round(runif(n = 200, min = 30, max = 90)),
distance = runif(n = 200, min = 20, max = 10000))
Using this made up dataset, I can construct a Bayesian regression model (my real model is more complex, with spatial autocorrelation terms and other covariates).
Note the offset(log(population)) term to allow adjustment for neighbourhood population size.
prior <- c(prior_string("normal(0,10)", class="b"),
prior_(~normal(0,10), class= ~Intercept))
m1 <- brm(bf(cases ~
poverty +
log(distance) +
offset(log(population))),
data=df,
family='poisson',
prior = prior,
iter=4000, warmup=1000,
chains=3,
seed=1234,
control=list(adapt_delta=0.95))
summary(m1)
Family: poisson
Links: mu = log
Formula: cases ~ poverty + log(distance) + offset(log(population))
Data: df (Number of observations: 200)
Samples: 3 chains, each with iter = 4000; warmup = 1000; thin = 1;
total post-warmup samples = 9000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept -7.40 0.35 -8.08 -6.71 6591 1.00
poverty 0.01 0.00 0.00 0.01 9000 1.00
logdistance -0.04 0.04 -0.12 0.03 5936 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
I can plot the marginal effects of the poverty and log(distance) variables by running
marginal_effects(m1)
I understand that the marginal effects plots are estimated at the mean value of other covariates within the model.
What I am really interested in however, is plotting the number of cases as a function of population size, at the mean levels of distance and poverty.
Even better would be plots of rates of infection per head of population as functions of distance and poverty.
Not sure if what I am looking for either a) makes sense, or b) is possible with brms… but very grateful for any suggestions.
- Operating System: macOS High Sierra, Version 10.13.3
- brms Version: 2.3.1