I have a dataset in which I’m interested in whether responses in condition b are more accurate than in condition a.
I’m also interested in whether they are above chance, where chance is not 50% (i.e. logit(0.5) = 0) but some other probability n. So I want to quantify my certainty about an across-condition difference and also whether each condition’s estimate excludes n.
I’ll illustrate the problem with the iris dataset in R and a linear model. Here n can be any number and no transformation is involved, but the problem is the same.
I have two questions:
Q1: Does sepal length differ significantly between setosa (the intercept) and versicolor?
Q2: Does the interval estimate for the mean sepal length of versicolor exclude n = 7?
For Q1, I can fit
lm1 = lm(Sepal.Length ~ 1 + Species, data = iris)
and then get a confidence interval for the coefficient Speciesversicolor: [0.73; 1.13]. So yes, there is a significant difference here with p < 0.05.
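Concretely, I’m reading that interval off confint():

confint(lm1)["Speciesversicolor", ]  # 2.5 % and 97.5 % bounds of the versicolor - setosa difference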
For Q2, I can fit
lm2 = lm(Sepal.Length ~ 0 + Species, data = iris)
and get confidence intervals for each level of species and see whether versicolor excludes 7. It does ([5.79;6.08]). I could also relevel the factor and refit the model but let’s say I want this for each level of species.
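Again via confint(), this time with one row per level:

confint(lm2)                         # intervals for all three Species means
confint(lm2)["Speciesversicolor", ]  # does this interval exclude 7?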
I’m new to Bayesian modelling. I feel like the answer to Q2 is already available in the model (using rstanarm):
stan_lm1 = stan_glm(Sepal.Length ~ 1 + Species, data = iris)
There is no significance level, of course, but I should be able to get a posterior interval for the versicolor mean and see whether it includes 7. I’m not sure how to do that, though: the posterior draws for Speciesversicolor are about the difference from the intercept, not the versicolor mean itself. brms::hypothesis can test ‘Speciesversicolor < 0’ but not, it seems, ‘Speciesversicolor < 7’.
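What I imagine doing, though I’m not sure it’s the intended way, is combining the draws by hand, assuming as.data.frame() on the fitted rstanarm model gives one column per parameter:

draws = as.data.frame(stan_lm1)                    # one row per posterior draw
versicolor_mean = draws[["(Intercept)"]] + draws[["Speciesversicolor"]]
quantile(versicolor_mean, c(0.025, 0.975))         # posterior interval for the versicolor mean
mean(versicolor_mean < 7)                          # posterior probability that the mean is below 7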
I can of course fit
stan_lm2 = stan_glm(Sepal.Length ~ 0 + Species, data = iris)
and use that one to test Q1 as well, so I can have 1 model instead of 2, but it feels like I’m missing something.
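For completeness, this is roughly how I’d use the single cell-means model for both questions, again assuming as.data.frame() gives me the draws and that the coefficients are named Speciessetosa, Speciesversicolor, Speciesvirginica:

draws2 = as.data.frame(stan_lm2)
posterior_interval(stan_lm2, pars = "Speciesversicolor", prob = 0.95)  # Q2: does this exclude 7?
mean(draws2$Speciesversicolor < 7)                                     # Q2: posterior probability below 7
diff_draws = draws2$Speciesversicolor - draws2$Speciessetosa           # Q1: versicolor - setosa difference
quantile(diff_draws, c(0.025, 0.975))
mean(diff_draws > 0)                                                   # Q1: posterior probability of a positive difference

Is there a more standard way to get both answers from one model?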