Regarding outcomes: A or not A vs. A or B, is there a difference? Help a beginner building a model

Greetings,

I am trying to do my very first Bayesian analysis, and as a beginner I have a bunch of questions. Below you can see a snapshot of my data. Briefly, I would like to estimate the effects of several categorical predictors, along with two numeric ones, on a categorical outcome with two levels.

Therefore, my model is as follows:
df_model <- stan_glmer(
  Alternation ~ (1 | Native_Language) + Agent_Pos + Agent_Animacy +
    Semantic_Class + Theme_Pos + Theme_Animacy + Theme_length +
    Recipient_Pos + Recipient_Animacy + Recipient_length,
  data   = df,
  family = binomial(link = "logit")
)

and the result is:

stan_glmer
 family:       binomial [logit]
 formula:      Alternation ~ (1 | Native_Language) + Agent_Pos + Agent_Animacy + 
	   Semantic_Class + Theme_Pos + Theme_Animacy + Theme_length + 
	   Recipient_Pos + Recipient_Animacy + Recipient_length
 observations: 3485
------
                           Median MAD_SD
(Intercept)                -1.2    0.4  
Agent_PosPRON              -0.1    0.1  
Agent_PosPROPN              0.1    0.2  
Agent_AnimacyInanimate     -0.4    0.1  
Semantic_Classc            -0.9    0.2  
Semantic_Classf             0.5    0.2  
Semantic_Classnd           -1.8    0.3  
Semantic_Classp            -2.7    0.4  
Semantic_Classt             0.4    0.2  
Theme_PosPRON               1.3    0.2  
Theme_PosPROPN              0.5    0.5  
Theme_AnimacyInanimate      0.6    0.3  
Theme_length               -1.2    0.3  
Recipient_PosPRON          -1.5    0.1  
Recipient_PosPROPN          0.0    0.3  
Recipient_AnimacyInanimate  0.6    0.1  
Recipient_length            2.3    0.3  

Error terms:
 Groups          Name        Std.Dev.
 Native_Language (Intercept) 0.29    
Num. levels: Native_Language 27 

------
* For help interpreting the printed output see ?print.stanreg
* For info on the priors used see ?prior_summary.stanreg

and also:

Model Info:
 function:     stan_glmer
 family:       binomial [logit]
 formula:      Alternation ~ (1 | Native_Language) + Agent_Pos + Agent_Animacy + 
	   Semantic_Class + Theme_Pos + Theme_Animacy + Theme_length + 
	   Recipient_Pos + Recipient_Animacy + Recipient_length
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 3485
 groups:       Native_Language (27)

Estimates:
                                                   mean   sd   10%   50%   90%
(Intercept)                                      -1.2    0.4 -1.7  -1.2  -0.7 
Agent_PosPRON                                    -0.1    0.1 -0.3  -0.1   0.0 
Agent_PosPROPN                                    0.1    0.2 -0.2   0.1   0.3 
Agent_AnimacyInanimate                           -0.4    0.1 -0.5  -0.4  -0.3 
Semantic_Classc                                  -0.9    0.2 -1.2  -0.9  -0.6 
Semantic_Classf                                   0.5    0.2  0.3   0.5   0.7 
Semantic_Classnd                                 -1.9    0.3 -2.2  -1.8  -1.5 
Semantic_Classp                                  -2.8    0.4 -3.3  -2.7  -2.2 
Semantic_Classt                                   0.4    0.2  0.2   0.4   0.6 
Theme_PosPRON                                     1.3    0.2  1.0   1.3   1.5 
Theme_PosPROPN                                    0.4    0.5 -0.2   0.5   1.1 
Theme_AnimacyInanimate                            0.6    0.3  0.3   0.6   0.9 
Theme_length                                     -1.2    0.3 -1.5  -1.2  -0.8 
Recipient_PosPRON                                -1.5    0.1 -1.6  -1.5  -1.3 
Recipient_PosPROPN                                0.0    0.3 -0.3   0.0   0.4 
Recipient_AnimacyInanimate                        0.6    0.1  0.5   0.6   0.8 
Recipient_length                                  2.3    0.3  1.9   2.3   2.7 
b[(Intercept) Native_Language:Bulgarian]          0.2    0.2  0.0   0.2   0.4 
b[(Intercept) Native_Language:Chinese]           -0.1    0.2 -0.3   0.0   0.2 
b[(Intercept) Native_Language:Chinese-Cantonese]  0.3    0.1  0.2   0.3   0.5 
b[(Intercept) Native_Language:Czech]              0.3    0.2  0.1   0.3   0.6 
b[(Intercept) Native_Language:Dutch]             -0.1    0.2 -0.3  -0.1   0.2 
b[(Intercept) Native_Language:Finnish]            0.2    0.2  0.0   0.2   0.5 
b[(Intercept) Native_Language:French]            -0.3    0.2 -0.6  -0.3   0.0 
b[(Intercept) Native_Language:German]             0.1    0.2 -0.2   0.1   0.3 
b[(Intercept) Native_Language:Greek]             -0.3    0.2 -0.5  -0.3   0.0 
b[(Intercept) Native_Language:Hungarian]         -0.3    0.2 -0.5  -0.2   0.0 
b[(Intercept) Native_Language:Italian]            0.3    0.2  0.1   0.3   0.6 
b[(Intercept) Native_Language:Japanese]          -0.2    0.2 -0.4  -0.2   0.0 
b[(Intercept) Native_Language:Korean]             0.0    0.2 -0.2   0.0   0.2 
b[(Intercept) Native_Language:Lithuanian]         0.0    0.2 -0.3   0.0   0.2 
b[(Intercept) Native_Language:Macedonian]        -0.4    0.2 -0.6  -0.4  -0.1 
b[(Intercept) Native_Language:Norwegian]          0.0    0.2 -0.2   0.0   0.3 
b[(Intercept) Native_Language:Persian]           -0.2    0.2 -0.5  -0.2   0.0 
b[(Intercept) Native_Language:Polish]            -0.2    0.2 -0.4  -0.2   0.0 
b[(Intercept) Native_Language:Portuguese]         0.2    0.2  0.0   0.2   0.5 
b[(Intercept) Native_Language:Punjabi]            0.0    0.2 -0.3   0.0   0.3 
b[(Intercept) Native_Language:Russian]            0.0    0.2 -0.2   0.0   0.3 
b[(Intercept) Native_Language:Serbian]            0.1    0.2 -0.2   0.1   0.3 
b[(Intercept) Native_Language:Spanish]            0.2    0.2  0.0   0.2   0.5 
b[(Intercept) Native_Language:Swedish]            0.2    0.2  0.0   0.2   0.4 
b[(Intercept) Native_Language:Tswana]            -0.1    0.2 -0.4  -0.1   0.1 
b[(Intercept) Native_Language:Turkish]            0.0    0.2 -0.2   0.0   0.3 
b[(Intercept) Native_Language:Urdu]              -0.1    0.2 -0.4  -0.1   0.2 
Sigma[Native_Language:(Intercept),(Intercept)]    0.1    0.0  0.0   0.1   0.1 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 0.3    0.0  0.3   0.3   0.3  

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

First question: As far as I understand, the basic concept underlying Bayesian statistics is to assess the chance of success, i.e. whether the outcome is A or not A. In my case the outcome is either A or B, so the success rate of A is the failure rate of B. Am I correct?

Second question: Is the formula given above correct? Again, 11 predictors and 1 outcome with 2 levels, with Native_Language as the random effect over outcomes.

Third question: How should I interpret the results? In a frequentist approach, it is possible to see whether the likely outcome is A or B given the predictors. Is it possible to get the same kind of answer within the Bayesian framework? If so, how? The ShinyStan graphics do not help much :/

I have never taken statistics or any related courses, since I graduated with a degree in Educational Sciences, more specifically Language Teaching (I also have a Ph.D. in the same field and am familiar with several types of statistics through work in Corpus Linguistics). I am a self-learner, so I would really appreciate less technical explanations if possible.

Welcome to Discourse!

yes & yes

The parameter estimates from a Bayesian analysis of a model mean the exact same thing as the parameter estimates from a frequentist analysis of the same model. You can interpret them in exactly the same way. The difference between the Bayesian and frequentist approaches has to do with how the estimates are obtained. There are two main differences:

  1. In Bayesian analysis, there is a prior that influences the computation.
  2. In Bayesian model fitting with MCMC (which is what Stan does by default), you don’t get just one estimate, but rather an entire posterior distribution of estimates. Any one of these posterior samples is interpretable just like the frequentist estimate, but the ensemble of the posterior samples fully captures the uncertainty (according to the model you’ve specified) in what the estimate says or means.
    To interpret the ensemble of posterior samples, we apply our interpretation to each sample and then look at the posterior distribution of those interpretations. For example, if “interpreting” means checking whether a parameter is “large” or “small”, we check this across all posterior draws to figure out how certain we are about our interpretation. If “interpreting” means predicting results at new data values, we make that prediction for each posterior draw and look at the posterior distribution of the predictions, as in the sketch below.
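For example, with your fitted object (assuming it is still called df_model, as in your call), both kinds of interpretation could look roughly like the following sketch; the newdata rows just reuse a few rows of df as stand-in values for illustration:

library(rstanarm)

# Every row of this matrix is one posterior draw of all model parameters.
draws <- as.matrix(df_model)

# Interpreting a coefficient across draws: how sure are we that the
# Recipient_length effect is positive, i.e. pushes the outcome towards the
# level of Alternation that R treats as "success"?
mean(draws[, "Recipient_length"] > 0)

# A posterior interval instead of a single point estimate:
quantile(draws[, "Recipient_length"], probs = c(0.05, 0.5, 0.95))

# Predicting at new data values: one predicted probability per posterior draw.
new_obs <- df[1:3, ]                      # stand-in "new" rows, purely for illustration
pred    <- posterior_epred(df_model, newdata = new_obs)
colMeans(pred)                            # posterior mean Pr("success") for each new row

Built-in helpers like posterior_interval() and posterior_predict() work the same way: everything is computed per posterior draw and then summarised.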

Awesome explanation! All clear now. One more question about priors: when dealing with priors, we state them in terms of what? For instance:

prior           = normal(0, 5),
prior_intercept = student_t(4, 0, 10),
prior_aux       = cauchy(0, 3)

What are these numbers 0, 3, 4, 5, 10 the values of? Means? Medians? Something else? When we define priors, how do we choose them, and more importantly, what do they actually stand for? I get the general idea of setting priors, but I am not sure about the details.
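Written out with the argument names from rstanarm's ?priors help page, I believe the call would look like the sketch below (same formula and data as before), but I am still not sure what those locations and scales actually mean for the coefficients:

df_model_priors <- stan_glmer(
  Alternation ~ (1 | Native_Language) + Agent_Pos + Agent_Animacy +
    Semantic_Class + Theme_Pos + Theme_Animacy + Theme_length +
    Recipient_Pos + Recipient_Animacy + Recipient_length,
  data            = df,
  family          = binomial(link = "logit"),
  prior           = normal(location = 0, scale = 5),              # all regression coefficients
  prior_intercept = student_t(df = 4, location = 0, scale = 10),  # the intercept
  prior_aux       = cauchy(location = 0, scale = 3)               # auxiliary parameter, if the family has one
)

# Shows which priors were actually used for this fit:
prior_summary(df_model_priors)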