Hello,
I am trying out brms for the first time, coming from an lme4 background in R. I have two questions about my model, marked in brackets []. The data are from a free recall task, where participants listen to lists of 10 words and then have to recall as many as possible. The response variable is whether or not the participant recalled the word (1 or 0).
For this model, I am predicting recall from the word’s position in the presented list: people should remember the early and late words better than the middle words, so I model both position and position squared. Group-level (“random”) intercepts are included for the word’s frequency (how common it is), its phonological density (how many other words sound like it), and the participant’s ID.
From a previous experiment using the same word set, I know that people can recall around 57% of the words presented to them, with a standard deviation of 19%. I’d like to include this as a prior for the intercept, but since it is a logistic model, I am not sure whether I should take the logit of the prior when I define it, or whether brms does that in the background [QUESTION 1]. Currently, I have this model:
mod.logitBayesRecall0 <- brm(
  formula = recalled ~ 1 + (1 | logFreq) + (1 | propDensity) +
    (1 | ID) + poly(position, 2),
  data = wordData,
  family = bernoulli(link = "logit"),
  prior = set_prior("normal(0.57, 0.19)", class = "Intercept"),
  warmup = 500,
  iter = 2000,
  chains = 2,
  inits = "0",
  seed = 123,
  file = "./modelcache/mod.logitBayesRecall0"
)
The model runs and seems to converge well:
Family: bernoulli
Links: mu = logit
Formula: recalled ~ 1 + (1 | logFreq) + (1 | propDensity) + (1 | ID) + poly(position, 2)
Data: wordData (Number of observations: 18000)
Samples: 2 chains, each with iter = 2000; warmup = 500; thin = 1;
total post-warmup samples = 3000
Group-Level Effects:
~logFreq (Number of levels: 223)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.61 0.04 0.54 0.69 1.00 866 1449
~ID (Number of levels: 75)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.73 0.06 0.62 0.87 1.00 479 859
~propDensity (Number of levels: 50)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.43 0.07 0.30 0.58 1.00 820 1394
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.52 0.10 0.32 0.71 1.00 484 1212
polyposition21 43.72 5.75 31.99 54.84 1.00 1218 1692
polyposition22 54.58 5.13 44.26 64.58 1.00 1635 2152
Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
However, I am not sure how to interpret the estimates, especially for position [QUESTION 2]. I guess I would need to apply an anti-logit/expit function, i.e.
antilogit <- function(x) {1 / (1 + exp(-x))}  # equivalent to plogis(x) in base R
… but when I do this for polyposition21 and polyposition22 I get values that round to 1, and I am not sure how to plug the numbers in analytically to get reasonable predictions (e.g. for recall at position 5).
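My best guess so far is that, because poly() uses orthogonal polynomial scores rather than raw positions, I should not plug the coefficients in by hand but instead ask the fitted model for predictions at specific positions, along these lines (a sketch; newdat and the choice to average over the group-level effects with re_formula = NA are mine):

# Sketch: predicted recall probability at each list position,
# using only the population-level effects (re_formula = NA).
newdat <- data.frame(position = 1:10)

pred <- fitted(
  mod.logitBayesRecall0,
  newdata = newdat,
  re_formula = NA
)

cbind(newdat, round(pred, 3))  # Estimate column is already on the probability scale

Is that the right way to think about it?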
Other checks: the conditional_effects() function shows a reasonable plot reflecting the pattern I expect:
pp_check() gives this plot, which I am less sure about:
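That plot is from the default pp_check() call; I have been wondering whether a discrete-outcome check would be more readable for 0/1 data, something like the following (type = "bars" is my guess at a sensible option here):

# Sketch: a posterior predictive check suited to a binary outcome,
# comparing observed vs. predicted counts of 0s and 1s.
pp_check(mod.logitBayesRecall0, type = "bars", nsamples = 100)  # ndraws in newer brms versions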
Is this model specified correctly, especially with the custom prior? How do I interpret the coefficients? Thanks in advance.
- Operating System: macOS Catalina
- brms Version: