Hello everyone,
I hope my question has not been asked and answered in any previous posts (I haven’t found it, so hopefully this won’t be redundant).
My question is about the estimation/interpretation of the beta parameters of multinomial logistic regressions, in particular because I found (see below) that they do not change as much as I expected between two contrasting fake trends I made up. But first, a bit of context to explain the data and the problem.
Briefly, I am investigating comments made by observers on insect pictures collected within the framework of the citizen science project Spipoll (English description here).
I would like to use noninformative (vague) priors following the definition in Gelman and Hill (2007, p. 355): “for a prior distribution to be noninformative, its range of uncertainty should be clearly wider than the range of reasonable values of the parameters.” This approach was also suggested to me by email by Solomon Kurz and in a paper by Lemoine (2019).
However, I have some trouble finding the theoretical boundaries of the beta parameters in my categorical model, so I do not know what the “range of reasonable values of the parameters” could be.
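To get a feel for what different magnitudes of beta would mean, I have been converting candidate slope values into ratios of relative odds (a quick sketch; the beta values below are purely illustrative, not estimates from my data):

```r
# Purely illustrative slope values on the log-odds scale (not estimates)
beta <- c(0.1, 0.5, 1, 2, 5)

# exp(beta): multiplicative change in the relative odds Pr(category)/Pr(Recherche)
# per one-unit increase of year18, and over the 7-unit span from year18 = 1 to 8
data.frame(
  beta            = beta,
  ratio_per_year  = exp(beta),
  ratio_over_span = exp(7 * beta)
)
```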
Thus, to find out, I modified my dataset into two contrasting versions with more or less extreme trends (stronger than in the real data), fitted a model to each fake dataset, and looked at how large the beta parameters can get.
The response variable of the model is the category of comments (seven levels: Recherche, Connaissance, Cycle, Contexte, Descriptif, Perspective, Curiosite), with the category Research (Recherche) set as the reference level.
The only fixed effect is the year each comment was posted (from 2010 to 2017, coded as a continuous variable from 1 to 8).
[Note that the final model will also include a random effect for the participant’s identification code, as some participants contributed multiple comments (i.e., dependence in the data that needs to be accounted for).]
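For clarity, my understanding is that the categorical family fits the usual multinomial-logit parameterisation, so the fixed-effect part of the model (ignoring the future random effect) is:

$$
\log \frac{\Pr(\text{classe}=k)}{\Pr(\text{classe}=\text{Recherche})} = \beta_{0,k} + \beta_{1,k}\,\text{year18},
\qquad k \in \{\text{Connaissance}, \text{Cycle}, \text{Contexte}, \text{Descriptif}, \text{Perspective}, \text{Curiosite}\}
$$

so each beta is a contrast against Recherche on the log-odds scale.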
In fake data #1: the Recherche (Research) category makes up 4% of the comments in 2010 and gradually rises to 95% in 2017. See this 1st figure (the Recherche category is on top, in blue-green).
In fake data #2: the Recherche category also starts at 4% and ends at 95%; however, the Connaissance (Knowledge) category (in orange) starts at 93% and ends at <1%. I therefore expect the effect of year on Connaissance to be strongest here: the Connaissance category is emptied year after year, to the sole benefit of Recherche comments. See this 2nd figure.
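(In case it helps to reproduce the situation, here is a minimal sketch of how one could simulate a dataset with a trend like fake #1; it is not the exact procedure I used to modify my real data, and the yearly proportions are made up.)

```r
# Illustrative only: simulate comments whose category mix shifts across years,
# with "Recherche" going from rare to dominant
set.seed(1)
categories <- c("Recherche", "Connaissance", "Cycle", "Contexte",
                "Descriptif", "Perspective", "Curiosite")
n_per_year <- 500

fake <- do.call(rbind, lapply(1:8, function(yr) {
  p_rech  <- 0.04 + (0.95 - 0.04) * (yr - 1) / 7   # Recherche share: 4% -> 95%
  p_other <- rep((1 - p_rech) / 6, 6)              # remaining share split evenly
  data.frame(
    year18 = yr,
    classe = sample(categories, n_per_year, replace = TRUE,
                    prob = c(p_rech, p_other))
  )
}))

fake$classe <- factor(fake$classe, levels = categories)  # Recherche = reference level
```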
However, here are the estimates obtained by the two models fitted to these two datasets:
Results with fake #1 dataset:
| Parameter | Estimate | Est.Error | Q2.5 | Q97.5 |
|---|---|---|---|---|
| muConnaissance_Intercept | 0.7351239 | 0.25745051 | 0.2319605 | 1.2389316 |
| muCycle_Intercept | 0.2217930 | 0.26759794 | -0.3095479 | 0.7412367 |
| muContexte_Intercept | 1.1219188 | 0.19293080 | 0.7495230 | 1.5029422 |
| muDescriptif_Intercept | 1.4324363 | 0.22900320 | 0.9826459 | 1.8892241 |
| muPerspective_Intercept | 1.5779264 | 0.20379257 | 1.1778803 | 1.9745000 |
| muCuriosite_Intercept | 1.3196273 | 0.19046166 | 0.9520178 | 1.6968824 |
| muConnaissance_year18 | -0.6842431 | 0.06460922 | -0.8138866 | -0.5602087 |
| muCycle_year18 | -0.5630291 | 0.06321881 | -0.6877448 | -0.4404182 |
| muContexte_year18 | -0.5594056 | 0.04349774 | -0.6474972 | -0.4767574 |
| muDescriptif_year18 | -0.7824418 | 0.05981112 | -0.9013440 | -0.6674225 |
| muPerspective_year18 | -0.7227373 | 0.05059039 | -0.8229390 | -0.6235051 |
| muCuriosite_year18 | -0.6070640 | 0.04416343 | -0.6951696 | -0.5224079 |
Results with fake #2 dataset:
| Parameter | Estimate | Est.Error | Q2.5 | Q97.5 |
|---|---|---|---|---|
| muConnaissance_Intercept | 3.0045764 | 0.14385464 | 2.7221784 | 3.28827696 |
| muCycle_Intercept | -3.1001033 | 0.73311313 | -4.6282497 | -1.75276506 |
| muContexte_Intercept | -2.3180013 | 0.52993671 | -3.4016725 | -1.32090800 |
| muDescriptif_Intercept | -3.6003164 | 0.81856338 | -5.3035236 | -2.11048010 |
| muPerspective_Intercept | -2.2530296 | 0.58229061 | -3.4710633 | -1.15387281 |
| muCuriosite_Intercept | -2.3287323 | 0.51415007 | -3.3665910 | -1.36098108 |
| muConnaissance_year18 | -0.6978025 | 0.03133402 | -0.7591399 | -0.63567196 |
| muCycle_year18 | -0.2595350 | 0.14854906 | -0.5579470 | 0.02516362 |
| muContexte_year18 | -0.2836348 | 0.10936605 | -0.5028316 | -0.07380743 |
| muDescriptif_year18 | -0.1699583 | 0.15681366 | -0.4779084 | 0.14062106 |
| muPerspective_year18 | -0.3451949 | 0.12414934 | -0.5854095 | -0.10307424 |
| muCuriosite_year18 | -0.2578751 | 0.10380112 | -0.4635130 | -0.05404364 |
As you can see, the two estimates for the effect of year on the Connaissance category (muConnaissance_year18) are very similar.
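For reference, here is how I translate the coefficients back into category probabilities (a minimal sketch using the posterior means from the fake #2 table above, with the linear predictor of the reference category Recherche fixed at 0):

```r
# Posterior means from the fake #2 table above
intercepts <- c(Connaissance = 3.0046, Cycle = -3.1001, Contexte = -2.3180,
                Descriptif = -3.6003, Perspective = -2.2530, Curiosite = -2.3287)
slopes     <- c(Connaissance = -0.6978, Cycle = -0.2595, Contexte = -0.2836,
                Descriptif = -0.1700, Perspective = -0.3452, Curiosite = -0.2579)

category_probs <- function(year18) {
  eta <- c(Recherche = 0, intercepts + slopes * year18)  # reference category has eta = 0
  exp(eta) / sum(exp(eta))                               # softmax over the 7 categories
}

round(category_probs(1), 3)  # implied category mix in 2010
round(category_probs(8), 3)  # implied category mix in 2017
```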
So, finally :-), my question is:
why are the two estimates so similar despite the strikingly different trends?
Thanks for reading this far; I hope it is clear enough. I’m looking forward to hearing any thoughts on this.
Best,
Nicolas D.
See below for the code of the models:
```r
library(brms)

fit_fake1 <- brm(classe ~ year18, family = categorical(), iter = 8000, chains = 3,
                 data = fake1, prior = kurz_vague_priors)
fit_fake2 <- brm(classe ~ year18, family = categorical(), iter = 8000, chains = 3,
                 data = fake2, prior = kurz_vague_priors)
```

with

```r
# normal(0, 20) prior on every category-specific slope (the vague priors suggested by Solomon Kurz)
kurz_vague_priors <- c(
  prior(normal(0, 20), class = b, dpar = muConnaissance),
  prior(normal(0, 20), class = b, dpar = muContexte),
  prior(normal(0, 20), class = b, dpar = muCuriosite),
  prior(normal(0, 20), class = b, dpar = muCycle),
  prior(normal(0, 20), class = b, dpar = muDescriptif),
  prior(normal(0, 20), class = b, dpar = muPerspective)
)
```
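And in case it matters, this is how I check which priors are actually applied, using brms’s prior_summary() and get_prior() helpers:

```r
# Priors actually used in the fitted model
prior_summary(fit_fake1)

# All parameters on which a prior could be specified for this model
get_prior(classe ~ year18, data = fake1, family = categorical())
```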
---- References used above:
Gelman, A., Hill, J., 2007. Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511790942
Lemoine, N.P., 2019. Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos 128, 912–928. https://doi.org/10.1111/oik.05985