Bernoulli/categorical model where responses don't vary within levels of the grouping variable

VRWizard · March 18, 2022, 5:20pm

Hi all,

I am new to brms and Bayesian methods, and am wondering if I could get any insight into how to specify this model (and whether a Bayesian approach would help).

I am testing whether members of female_group (“A”, “N”, “L”) are more likely to reproduce with members of male_group (“A”, “N”, “L”), and I want to take into account variation in the preferences of each individual female and male: (1|female_id) + (1|male_id). Each female_id reproduced multiple times, sometimes with a different male_id.

To simplify, I collapsed the response from 3 possible categories (male group) into 2 (conspecific vs. heterospecific) so I could run it as a Bernoulli (binomial) GLMM, i.e.

male_group ~ female_species + male_species + (1|female_id) + (1|male_id).

However, when I do this, I get unusual results: the variance and standard deviation of the female_id random effect are very large, and the full model's AIC is ~60 units worse than the reduced model's. After searching around, I believe I have the same issue as a user here:

that is, within each level of the female_id grouping variable the responses are usually either all 0 or all 1, so the estimates for those levels are pushed toward infinity. Ben Bolker then suggests collapsing the random effects and fitting a GLM instead, but if I do this I lose the male_id random effect.
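One way to check how common this is would be to tabulate the response within each level of the grouping variable. The sketch below uses an invented toy data frame, not the data from this thread: the column names female_id, female_group and male_group follow the post, while conspecific and all of the values are assumptions made for illustration.

```r
# Toy data, invented for illustration; not the data from this thread.
toy <- data.frame(
  female_id    = c("f1", "f1", "f1", "f2", "f2", "f3", "f3", "f3"),
  female_group = c("A",  "A",  "A",  "N",  "N",  "L",  "L",  "L"),
  male_group   = c("A",  "A",  "A",  "L",  "A",  "L",  "N",  "L"),
  stringsAsFactors = FALSE
)

# Collapse the three-category response to conspecific (1) vs heterospecific (0).
toy$conspecific <- as.integer(toy$male_group == toy$female_group)

# Proportion of conspecific matings per female.
prop_by_female <- tapply(toy$conspecific, toy$female_id, mean)

# A proportion of exactly 0 or 1 means the response never varies for that female.
constant_females <- names(prop_by_female)[prop_by_female %in% c(0, 1)]

print(prop_by_female)
print(constant_females)
```

If most females end up in constant_females, the data carry very little information about how large the female_id standard deviation is, which would be consistent with the inflated estimates described above.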

Is there any way I can fit a model that still includes these factors, or is the issue inherent in the data/modelling structure?

I was attracted to brms because it can fit a model with a categorical response, rather than having to collapse to a binomial as I have done so far. But I assume the problem in the binomial GLMM would still persist here, even given appropriate priors etc.?

mike-lawrence · March 18, 2022, 7:18pm

Can you post some example data? I’m having trouble following from your description.

VRWizard · March 21, 2022, 11:11am

Hi, attached is the data, it is quite sparse in terms of levels within grouping variables, which is probably part of the problem:

brms_dat.csv (6.8 KB)

I tried running a categorical model with brms like so:

library(brms)

fit2 <- brm(
  male_group ~ female_group + (1 | female_id) + (1 | male_id_1),
  family = categorical(link = "logit"),
  data = brms_sym,
  chains = 4, cores = 4
)

I received quite a few warnings:

Warning messages:
1: There were 688 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. 
2: Examine the pairs() plot to diagnose sampling problems
 
3: The largest R-hat is 1.06, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat 
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess

This was the summary:

> summary(fit2)
 Family: categorical 
  Links: muL = logit; muN = logit 
Formula: male_group ~ female_group + (1 | female_id) + (1 | male_id_1) 
   Data: brms_sym (Number of observations: 191) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Group-Level Effects: 
~female_id (Number of levels: 27) 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muL_Intercept)     2.06      2.01     0.07     7.03 1.01      388      260
sd(muN_Intercept)     2.41      2.94     0.05     9.11 1.00     1099     1047

~male_id_1 (Number of levels: 17) 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muL_Intercept)    85.28    195.70     8.98   285.34 1.03      119      124
sd(muN_Intercept)    47.49     57.11     5.52   234.48 1.04      115      104

Population-Level Effects: 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muL_Intercept       -23.54     34.23  -130.04    34.23 1.06       62       25
muN_Intercept       -19.75     23.84   -81.05    10.16 1.01      425      397
muL_female_groupL    74.76     83.42   -25.17   332.09 1.06       58       29
muL_female_groupN    -0.53     30.23   -69.05    66.41 1.04       77       35
muN_female_groupL    41.91     53.86   -32.03   175.52 1.04      117      248
muN_female_groupN    17.88     31.44   -42.92   111.18 1.03      122       58

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

I am just reading through the documentation for brms, so I’m not really sure how to interpret these warnings and output.
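To get a feel for the scale of these numbers: the population-level estimates are on the log-odds scale relative to the reference category (A, since only muL and muN appear). A rough sketch of what the posterior means would imply, assuming the random effects are set to zero and simply plugging in the point estimates from the summary above (which is only indicative, given the Rhat and ESS warnings):

```r
# Numerically stable softmax: converts linear predictors to probabilities.
softmax <- function(eta) {
  z <- exp(eta - max(eta))
  z / sum(z)
}

# Linear predictors for a female in group A (the reference level),
# using the posterior means of muL_Intercept and muN_Intercept.
eta_female_A <- c(A = 0, L = -23.54, N = -19.75)

# Linear predictors for a female in group L: intercepts plus the
# muL_female_groupL and muN_female_groupL estimates.
eta_female_L <- c(A = 0, L = -23.54 + 74.76, N = -19.75 + 41.91)

p_female_A <- softmax(eta_female_A)
p_female_L <- softmax(eta_female_L)

print(signif(p_female_A, 3))
print(signif(p_female_L, 3))
```

Under these assumptions the probability of the matching male group is effectively 1 in both cases. Coefficients of this size mean the model is describing near-deterministic outcomes, which is the same pattern as the all-0 or all-1 responses in the Bernoulli version.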

Thanks,
Mike
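For anyone landing here with similar output: the model above was fitted without a prior argument, and the brms default for population-level coefficients is an improper flat prior, so nothing stops the estimates from drifting to extreme values. One thing that could be tried is refitting with weakly informative priors and a higher adapt_delta, as the divergent-transitions warning page suggests. This is only a sketch: the prior scales below are assumptions chosen for illustration, refit_with_priors is a hypothetical helper, and nothing in this thread confirms that it resolves the warnings for this data set.

```r
library(brms)

# Illustrative, assumed prior scales; they are not taken from this thread.
# In a categorical model each non-reference category has its own linear
# predictor (here muL and muN), so priors are set per dpar.
priors <- c(
  set_prior("normal(0, 2)",   class = "b",         dpar = "muL"),
  set_prior("normal(0, 2)",   class = "b",         dpar = "muN"),
  set_prior("normal(0, 2)",   class = "Intercept", dpar = "muL"),
  set_prior("normal(0, 2)",   class = "Intercept", dpar = "muN"),
  set_prior("exponential(1)", class = "sd",        dpar = "muL"),
  set_prior("exponential(1)", class = "sd",        dpar = "muN")
)

# Hypothetical helper: refits the model from the post with these priors and
# a higher adapt_delta. `dat` is expected to look like brms_sym.
refit_with_priors <- function(dat) {
  brm(
    male_group ~ female_group + (1 | female_id) + (1 | male_id_1),
    family  = categorical(link = "logit"),
    prior   = priors,
    control = list(adapt_delta = 0.95),
    chains = 4, cores = 4,
    data = dat
  )
}
```

Calling refit_with_priors(brms_sym) would run the fit. It is worth checking afterwards how much the posterior for the sd parameters differs from the prior, since with responses that rarely vary within a level the prior may end up doing most of the work.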
