Hello!
I’m new to Bayesian stats and I need your help.
I have the following model:
fit <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject/contrast), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
Population-Level Effects:
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                        -0.28      0.29    -0.85     0.29 1.00     1466     2173
contrasthead_heed                 5.36      0.68     4.10     6.82 1.00     2547     2650
contrasthid_heed                 -0.28      0.34    -0.94     0.37 1.00     1958     2391
contrasthoard_hod                 0.89      0.34     0.21     1.53 1.00     2027     2894
contrasthud_hard                 -0.11      0.32    -0.75     0.51 1.00     2012     2112
contrastwhod_hood                -0.21      0.32    -0.86     0.42 1.00     1942     2414
languageRP                        1.77      0.53     0.76     2.84 1.00     1583     2325
languageSMG                      -1.05      0.41    -1.92    -0.30 1.00     1624     2109
contrasthead_heed:languageRP     -0.65      1.03    -2.85     1.21 1.00     4212     2337
contrasthid_heed:languageRP       1.42      0.69     0.14     2.83 1.00     2276     3005
contrasthoard_hod:languageRP     -0.49      0.57    -1.66     0.61 1.00     2672     3129
contrasthud_hard:languageRP      -0.20      0.54    -1.28     0.86 1.00     2464     3191
contrastwhod_hood:languageRP      0.16      0.56    -0.90     1.30 1.00     2539     3005
contrasthead_heed:languageSMG     0.61      0.80    -0.86     2.35 1.00     3153     2558
contrasthid_heed:languageSMG     -0.51      0.50    -1.48     0.44 1.00     2487     2492
contrasthoard_hod:languageSMG     0.18      0.48    -0.73     1.11 1.00     2387     2674
contrasthud_hard:languageSMG      0.74      0.47    -0.15     1.69 1.00     2428     3075
contrastwhod_hood:languageSMG     0.05      0.45    -0.82     0.93 1.00     2398     2961
- I want to find whether there are significant differences between a particular RP contrast and the same SMG contrast (e.g., SMG hid-heed vs. RP hid-heed). Is the code below correct?
hyp <- c(hyp1 = "languageSMG + contrasthid_heed:languageSMG = languageRP + contrasthid_heed:languageRP")
- I want to find whether there are significant differences between two different contrasts in the same language (e.g., hid-heed vs. head-heed in SMG). Is the code below correct?
hyp <- c(hyp1 = "contrasthid_heed + contrasthid_heed:languageSMG = contrasthead_heed + contrasthead_heed:languageSMG")
Your help would be much appreciated!
George
Hi Andy, any ideas on that? @andymilne
As far as I can see, the hypothesis arguments correspond to what you want. It's always worth getting a sanity check by confirming that the estimates given by the hypothesis function correspond to those shown when running conditional_effects() on the model (made easier if you use the method = "posterior_linpred" argument in conditional_effects(), so that the plot's y-axis has the same scale as the model's coefficients).
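For example (a sketch, assuming fit is your fitted model object):
# Plot the contrast-by-language effects on the linear-predictor (log-odds)
# scale, so they are directly comparable to the population-level coefficients
conditional_effects(fit, effects = "contrast:language", method = "posterior_linpred")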
Thank you.
These are point-null hypotheses. However, I'm a little concerned: I don't find strong evidence for a difference between two contrasts, yet when I use directional hypotheses I find strong evidence that one contrast is discriminated more accurately than the other (for the same pair of contrasts).
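For concreteness, here is what the two versions look like for one pair (a sketch; the directional form just replaces = with >):
# Point-null (two-sided): Evid.Ratio is the Savage-Dickey density ratio,
# which requires sample_prior = "yes" (as in the model above)
hypothesis(fit, "contrasthid_heed + contrasthid_heed:languageSMG = contrasthead_heed + contrasthead_heed:languageSMG")
# Directional (one-sided): Post.Prob is the posterior probability of the stated direction
hypothesis(fit, "contrasthid_heed + contrasthid_heed:languageSMG > contrasthead_heed + contrasthead_heed:languageSMG")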
I personally prefer directional hypotheses, and my results are fine using them, but a reviewer at a highly ranked journal has told me that point-null tests are more reliable than directional ones and that I must change my stats.
How would you cope with that?
I don't believe point-null hypothesis tests (e.g., Bayes factors or their approximations, such as the Savage-Dickey method that brms uses) make sense in these sorts of experiments, because the probability that a contrast or effect of interest is precisely zero is itself zero. A reasonable and meaningful alternative is a ROPE test, which gives the probability that a contrast falls within a range of sizes that can be meaningfully considered negligible. I would argue for that type of test.
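For example, here is a minimal sketch of a ROPE check on your first comparison, assuming a purely illustrative negligible region of plus or minus 0.18 on the log-odds scale; the bounds must be justified for your own domain:
library(posterior)
draws <- as_draws_df(fit)
# Posterior of the SMG-minus-RP difference for the hid-heed contrast
delta <- (draws$b_languageSMG + draws$`b_contrasthid_heed:languageSMG`) -
         (draws$b_languageRP + draws$`b_contrasthid_heed:languageRP`)
# Posterior probability that the difference is negligible (inside the ROPE)
mean(abs(delta) < 0.18)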
Thank you again.
And my last question. I used LOO cross-validation to compare the models below:
fit1 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit2 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject) + (1 | contrast), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit3 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject) + (1 | language), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit4 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject) + (1 | contrast) + (1 | language), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
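The comparison itself was along these lines (a sketch; it assumes the four fits above have already been sampled):
# Compare expected log predictive density (ELPD) estimated by PSIS-LOO
loo_compare(loo(fit1), loo(fit2), loo(fit3), loo(fit4))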
     elpd_diff se_diff
fit4       0.0     0.0
fit1       0.0     0.3
fit2      -0.3     0.3
fit3      -0.5     0.2
Do you think that the random factors I used are OK? Any suggestions?
The second and fourth models have contrast as both a fixed effect and as a random grouping factor (the thing to the right of the | symbol), which doesn't make sense because you are estimating it once as an unpooled variable and then again with partial pooling. The same problem occurs with language in the third and fourth models. That explains why the LOO values are essentially identical. The model in your first post avoids this problem: the nested random effect (1 | subject/contrast) equates to (1 | subject) + (1 | subject:contrast), and neither subject nor subject:contrast is a fixed effect in that model.
So, would the below make sense?
fit1 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit2 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject/contrast), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit3 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject/language), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
fit4 <- brm(response ~ 0 + Intercept + contrast + language + contrast:language + (1 | subject/contrast/language), data = test, family = bernoulli("logit"), prior = prior, sample_prior = "yes")
     elpd_diff se_diff
fit2       0.0     0.0
fit4      -0.8     0.5
fit3     -71.4    11.7
fit1     -71.5    11.8
I don't know how the term (1 | subject/contrast/language) is parsed in R. If it is equivalent to (1 | subject) + (1 | subject:contrast) + (1 | subject:contrast:language), all the above formulas make sense because the specified grouping factors never appear as fixed effects in the same formula. Also ensure that the data are nested as specified, that the model passes the usual diagnostics (Rhat, ESS, etc.), and that there are several observations at each level of each grouping factor. For example, is there more than one observation at each level of subject:contrast:language? If not, there is no good reason to include that term.
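Both checks can be scripted (a sketch; it assumes your data frame is test, and that lme4::findbars expands the slash notation, which I believe it does):
# 1. How R expands the nesting shorthand into grouping terms
lme4::findbars(response ~ (1 | subject/contrast/language))
# expected: 1 | subject, 1 | subject:contrast, 1 | subject:contrast:language

# 2. The smallest number of observations in any subject:contrast:language cell
library(dplyr)
test %>% count(subject, contrast, language) %>% summarise(min_n = min(n))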
Great, this is useful! Thank you once again!