Salut Stanners,
PS: apologies for how long this is. I hope some others trying to learn notation may profit.
I have fit a suite of models in brms
linking hunter motivation (sale/consumption/both), behaviour (distance walked, ammo taken,etc.), and offtake (animals hunted, biomass, money earned). I used @jonah’s et al.'s ground up strategy and graphical tools, model fit and posterior predictions all peachy. My question here is on writing my models in mathematical notation.
I’ve little experience in this, and looked to both @solomon’s brms
translation of McElreath’s 2nd edition of Statistical Rethinking and this paper from @Ladislas and others with @paul.buerkner. They seem to use different notation approaches for similar (I think) models, and I drew on parts from both that made most sense to me.
I have three models. Below I show their brms
syntax to fit, prior_summary()
after fitting, and my attempted notation. (I don’t include input data as I don’t think it’s necessary? Notes on structure of inputs and the model at bottom of post).
My main question on the notation is how to write up the correlations for multiple predictors with varying slopes (model 2) and mv responses (model 3). But I’m sure I’ve made many errors and would be ineffably grateful for any and all insight and corrections!
First model: GLMM to identify meaningful hunter behaviours, pooled (varying intercepts) by village (n= 9) and hunter (n = 106). Note that I model two responses, all animals and threatened animals. The latter is a subset of the former, so this isn’t really an mv response and I model no response correlations; I just use the mv syntax for more compact fit code and objects.
##formula
##double_zero_brought = small ammo; chevrotine_brought = large ammo
gunform <- bf(RESPONSE ~ a + b + c + d,
a ~ 0 + (1 | village) + (1 | hunter),
b ~ 0 + season,
c ~ 0 + nightday,
d ~ 0 + weather_impute +
max_km_village_CS +
double_zero_brought_CS + chevrotine_brought_CS +
porters_CS + other_hunters_CS,
nl = TRUE)
formv_ga <- update(gunform, animals ~ .) + poisson()
formv_gt <- update(gunform, threatened ~ .) + poisson()
##prior
priormv_g <- c(
#fixed coefficients (b)
prior(normal(0,0.5), class = b, nlpar = "b", resp = "animals"),
prior(normal(0,0.5), class = b, nlpar = "c", resp = "animals"),
prior(normal(0,0.5), class = b, nlpar = "d", resp = "animals"),
prior(normal(0,0.5), class = b, nlpar = "b", resp = "threatened"),
prior(normal(0,0.5), class = b, nlpar = "c", resp = "threatened"),
prior(normal(0,0.5), class = b, nlpar = "d", resp = "threatened"),
#varying intercepts (sd)
prior(exponential(2), class = sd, nlpar = "a", resp = "animals"),
prior(exponential(2), class = sd, nlpar = "a", resp = "threatened")
)
##fit
gmv <- brm(formv_ga + formv_gt + set_rescor(FALSE),
prior = priormv_g,
data = gun48_mv,
chains = 4, iter = 3000, warmup = 1500,
control = list(adapt_delta = 0.92),
cores = getOption("mc.cores", 2),
file = "./outputs/model_fits/gmv")
prior_summary(gmv)
prior class coef group resp dpar nlpar bound source
normal(0, 0.5) b animals b user
normal(0, 0.5) b seasondry animals b (vectorized)
normal(0, 0.5) b seasonwet animals b (vectorized)
normal(0, 0.5) b animals c user
normal(0, 0.5) b nightdayboth animals c (vectorized)
normal(0, 0.5) b nightdayday animals c (vectorized)
normal(0, 0.5) b nightdaynight animals c (vectorized)
normal(0, 0.5) b animals d user
normal(0, 0.5) b chevrotine_brought_CS animals d (vectorized)
normal(0, 0.5) b double_zero_brought_CS animals d (vectorized)
normal(0, 0.5) b max_km_village_CS animals d (vectorized)
normal(0, 0.5) b other_hunters_CS animals d (vectorized)
normal(0, 0.5) b porters_CS animals d (vectorized)
normal(0, 0.5) b weather_impute animals d (vectorized)
normal(0, 0.5) b threatened b user
normal(0, 0.5) b seasondry threatened b (vectorized)
normal(0, 0.5) b seasonwet threatened b (vectorized)
normal(0, 0.5) b threatened c user
normal(0, 0.5) b nightdayboth threatened c (vectorized)
normal(0, 0.5) b nightdayday threatened c (vectorized)
normal(0, 0.5) b nightdaynight threatened c (vectorized)
normal(0, 0.5) b threatened d user
normal(0, 0.5) b chevrotine_brought_CS threatened d (vectorized)
normal(0, 0.5) b double_zero_brought_CS threatened d (vectorized)
normal(0, 0.5) b max_km_village_CS threatened d (vectorized)
normal(0, 0.5) b other_hunters_CS threatened d (vectorized)
normal(0, 0.5) b porters_CS threatened d (vectorized)
normal(0, 0.5) b weather_impute threatened d (vectorized)
exponential(2) sd animals a user
exponential(2) sd threatened a user
exponential(2) sd hunter animals a (vectorized)
exponential(2) sd Intercept hunter animals a (vectorized)
exponential(2) sd hunter threatened a (vectorized)
exponential(2) sd Intercept hunter threatened a (vectorized)
exponential(2) sd village animals a (vectorized)
exponential(2) sd Intercept village animals a (vectorized)
exponential(2) sd village threatened a (vectorized)
exponential(2) sd Intercept village threatened a (vectorized)
Second model: GLMM of meaningful behaviours with varying slopes by village.
##formula
ga_vs <- bf(animals ~ 0 + (1 | village) + (1 | hunter) +
(0 + max_km_village_CS + double_zero_brought_CS | village)
) + poisson()
##varying slope priors
prior_vs <- c(
prior(exponential(2), class = sd), #for village and hunter
prior(lkj(1), class = cor) #for all intercept + slope correlations
)
##fit
gvs <- brm(ga_vs,
prior = prior_vs,
data = gun48_mv,
chains = 4, iter = 3000, warmup = 1500,
control = list(adapt_delta = 0.99),
cores = getOption("mc.cores", 2),
file = "./outputs/model_fits/gvs")
prior_summary(gvs)
prior class coef group resp dpar nlpar bound source
lkj_corr_cholesky(1) L user
lkj_corr_cholesky(1) L village (vectorized)
exponential(2) sd user
exponential(2) sd hunter (vectorized)
exponential(2) sd Intercept hunter (vectorized)
exponential(2) sd village (vectorized)
exponential(2) sd double_zero_brought_CS village (vectorized)
exponential(2) sd Intercept village (vectorized)
exponential(2) sd max_km_village_CS village (vectorized)
Third model: linking motivation, meaningful behaviors, and offtake through a multivariate generalized linear mixed piecewise structural equation model (mvGLMpwSEM?..lol). The mv part is for behaviours in response to motivation, within hunters. Before the syntax here’s a DAGish diagram that may clarify (ignore the mediation bits, the full model uses partial).
wh = why hunt , km = max km village , am = small ammo brought on hunt, ? is an unknown mechanism not observed or estimated, and an = animals hunted . For successful hunts ( an > 1, grey circles) kg = biomass, ss = probability of some bushmeat being sold, and as = probability of all bushmeat being sold. For hunts with some bushmeat sold (dark grey circle) mo = money earned. h refers to a single hunt, the unit of observation.
##sem formulas for gun fluid hunters
##animals
ga_mb <- bf(animals ~ 0 + (1 | village) + (1 |hunter) +
why_hunt + max_km_village_CS + double_zero_brought_CS
) + poisson()
ga_m <- bf(animals ~ 0 + (1 | village) + (1 |hunter) +
why_hunt
) + poisson()
ga_b <- bf(animals ~ 0 + (1 | village) + (1 |hunter) +
max_km_village_CS + double_zero_brought_CS
) + poisson()
##behavior
gmkv_m <- bf(max_km_village ~ 0 + (1 | village) + (1 | ch | hunter) +
why_hunt
) + Gamma(link = "log")
gdzb_m <- bf(double_zero_brought ~ 0 + (1 | village) + (1 | ch | hunter) +
why_hunt
) + poisson()
##kg, some sold, all sold, and money (for succesful hunts only)
gkg <- bf(lkg | subset(success) ~ 0 + (1 | village) + (1 | hunter) +
animals_CS + max_km_village_CS
) + gaussian()
gss <- bf(ssold | subset(success) ~ 0 + (1 | village) + (1 | hunter) +
kg_CS + why_hunt
) + bernoulli()
gas <- bf(asold | subset(success) ~ 0 + (1 | village) + (1 | hunter) +
kg_CS + why_hunt
) + bernoulli()
gmo <- bf(lmoney | subset(ssold) ~ 0 + (1 | village) + (1 | hunter) +
kg_CS + why_hunt
) + gaussian()
##set sem priors
##simple prior of just beta and sd for poisson-only models
prior_bsd <- c(
prior(normal(0,0.5), class = b), #for all predictors
prior(exponential(2), class = sd) #for village and hunter
)
##prior for full and parital mediation
#alpha reps skewness (< 0 left, > 0 right)
#sigma is a "Vector of standard deviation values."
prior_gmed <- c(
#fixed coefficents (b)
prior(normal(0,0.5), class = b, resp = "doublezerobrought"),
prior(normal(0,0.5), class = b, resp = "maxkmvillage"),
prior(normal(0,0.5), class = b, resp = "animals"),
#varying intercepts (sd)
prior(exponential(2), class = sd, resp = "doublezerobrought"),
prior(exponential(2), class = sd, resp = "maxkmvillage"),
prior(exponential(2), class = sd, resp = "animals"),
# #skew normals (alpha and sigma)
# prior(normal(0,0.5), class = alpha, resp = maxkmvillage),
# prior(exponential(2), class = sigma, resp = maxkmvillage),
# prior(normal(0,0.5), class = alpha, resp = doublezerobrought),
# prior(exponential(2), class = sigma, resp = doublezerobrought),
#shape for the gamma
prior(gamma(4,0.8), class = shape, resp = "maxkmvillage"),
#correlations (cor) between ammo and km
prior(lkj(1), class = cor)
)
##prior for full sem
prior_gsem <- bind_rows(prior_gmed,
tibble(prior = c(rep("normal(0,0.5)",4), rep("exponential(2)",6)),
class = c(rep("b", 4), rep("sd", 4), rep("sigma", 2)),
coef = "", group = "",
resp = c(rep(c("lkg", "lmoney", "ssold", "asold"),2), "lkg", "lmoney"),
dpar = "", nlpar = "", bound = "", source = "user"
)
)
##fit
gsemdo <- brm(ga_mb + gmkv_m + gdzb_m +
gkg + gss + gas + gmo + set_rescor(FALSE),
prior = prior_gsem,
data = gun48_semdo,
chains = 4, iter = 3000, warmup = 1500,
control = list(adapt_delta = 0.99),
cores = getOption("mc.cores", 2),
file = "./outputs/model_fits/gsemdo")
prior_summary(gsemdo)
prior class coef group resp dpar nlpar bound source
normal(0, 0.5) b animals user
normal(0, 0.5) b double_zero_brought_CS animals (vectorized)
normal(0, 0.5) b max_km_village_CS animals (vectorized)
normal(0, 0.5) b why_huntboth animals (vectorized)
normal(0, 0.5) b why_hunteat animals (vectorized)
normal(0, 0.5) b why_huntsell animals (vectorized)
normal(0,0.5) b asold user
normal(0,0.5) b kg_CS asold (vectorized)
normal(0,0.5) b why_huntboth asold (vectorized)
normal(0,0.5) b why_hunteat asold (vectorized)
normal(0,0.5) b why_huntsell asold (vectorized)
normal(0, 0.5) b doublezerobrought user
normal(0, 0.5) b why_huntboth doublezerobrought (vectorized)
normal(0, 0.5) b why_hunteat doublezerobrought (vectorized)
normal(0, 0.5) b why_huntsell doublezerobrought (vectorized)
normal(0,0.5) b lkg user
normal(0,0.5) b animals_CS lkg (vectorized)
normal(0,0.5) b max_km_village_CS lkg (vectorized)
normal(0,0.5) b lmoney user
normal(0,0.5) b kg_CS lmoney (vectorized)
normal(0,0.5) b why_huntboth lmoney (vectorized)
normal(0,0.5) b why_hunteat lmoney (vectorized)
normal(0,0.5) b why_huntsell lmoney (vectorized)
normal(0, 0.5) b maxkmvillage user
normal(0, 0.5) b why_huntboth maxkmvillage (vectorized)
normal(0, 0.5) b why_hunteat maxkmvillage (vectorized)
normal(0, 0.5) b why_huntsell maxkmvillage (vectorized)
normal(0,0.5) b ssold user
normal(0,0.5) b kg_CS ssold (vectorized)
normal(0,0.5) b why_huntboth ssold (vectorized)
normal(0,0.5) b why_hunteat ssold (vectorized)
normal(0,0.5) b why_huntsell ssold (vectorized)
lkj_corr_cholesky(1) L user
lkj_corr_cholesky(1) L hunter (vectorized)
exponential(2) sd animals user
exponential(2) sd asold user
exponential(2) sd doublezerobrought user
exponential(2) sd lkg user
exponential(2) sd lmoney user
exponential(2) sd maxkmvillage user
exponential(2) sd ssold user
exponential(2) sd hunter animals (vectorized)
exponential(2) sd Intercept hunter animals (vectorized)
exponential(2) sd hunter asold (vectorized)
exponential(2) sd Intercept hunter asold (vectorized)
exponential(2) sd hunter doublezerobrought (vectorized)
exponential(2) sd Intercept hunter doublezerobrought (vectorized)
exponential(2) sd hunter lkg (vectorized)
exponential(2) sd Intercept hunter lkg (vectorized)
exponential(2) sd hunter lmoney (vectorized)
exponential(2) sd Intercept hunter lmoney (vectorized)
exponential(2) sd hunter maxkmvillage (vectorized)
exponential(2) sd Intercept hunter maxkmvillage (vectorized)
exponential(2) sd hunter ssold (vectorized)
exponential(2) sd Intercept hunter ssold (vectorized)
exponential(2) sd village animals (vectorized)
exponential(2) sd Intercept village animals (vectorized)
exponential(2) sd village asold (vectorized)
exponential(2) sd Intercept village asold (vectorized)
exponential(2) sd village doublezerobrought (vectorized)
exponential(2) sd Intercept village doublezerobrought (vectorized)
exponential(2) sd village lkg (vectorized)
exponential(2) sd Intercept village lkg (vectorized)
exponential(2) sd village lmoney (vectorized)
exponential(2) sd Intercept village lmoney (vectorized)
exponential(2) sd village maxkmvillage (vectorized)
exponential(2) sd Intercept village maxkmvillage (vectorized)
exponential(2) sd village ssold (vectorized)
exponential(2) sd Intercept village ssold (vectorized)
gamma(4, 0.8) shape maxkmvillage user
exponential(2) sigma lkg user
exponential(2) sigma lmoney user
Notes on model structure
I model count responses using a Poisson rather than a gamma-Poisson (negative binomial) distribution. My reasons from fit models: 1) GP estimates more extreme/unrealistic max values and overestimates dispersion, 2) the shape value of GP is very high, indicating similarity to a pure Poisson process, and 3) variance is already made explicit via nesting by village and hunter. Parameter estimation is the same using both distributions.
I exclude a global intercept, because I want to make predictions at the village-level. Re: varying intercepts, a unique hunter is nested within a unique village, so technically this is hierarchical, but I don’t specify it as such in the model fit, because I want hunters to be compared to the total pool across villages (a skilled hunter in village A is more similar to a skilled hunter in village B than he is to a poor hunter in village A). Regardless, this shouldn’t tangibly change inference (see Rethinking 13.3.0.1).
I format all categorical predictors as index variables, because I am interested in individual categories rather than in comparison to a reference. The one exception is weather (rainy or not during the hunt) which I format as a dummy/binary/indicator variable, because hunts on rainy days are scarce and can be seen as differences from the norm. Models showed that rain does not have a strong direct effect on offtake, though it does of course influence your decision to go hunting in the first place and could further change your behavior (and thus offtake) once hunting.
I center (subtract by the mean) and scale (divide by the standard deviation) all continuous predictors to facilitate model convergence and comparison of effects across predictors. Variables that act as both responses and predictors in the pwSEM are in their original values as responses (or logged for kg and money) and centered and scales as predictors.