Problems with link functions when fitting a GLM with Beta distribution

Hello, I am running into a difficulty when trying to fit a pair of GLMs with the Beta distribution with parameters mu and phi.
The first (m1.1broms) model has the same explanatory variables and interactions for mu and phi, and the second (m1.brms) only for mu. I then want to use loo_cv to select the most appropriate model. Their formulae are:

formula.m1=bf(Totalcover~Gr.pressure*Prod + patch.size*Prod+inter.patch.dist))+
                   lf(phi~1)
formula.m2=bf(Totalcover~Gr.pressure*Prod + patch.size*Prod+inter.patch.dist)+
                lf(phi~Gr.pressure*Prod + patch.size*Prod+inter.patch.dist)

I am using the same priors:

prior.m.brms = c(set_prior("normal(0,0.5)",class = "b"),
                  set_prior("normal(logit(0.5),(logit(0.8)-logit(0.2))/4", class = "Intercept"))

The model calls are:

m1=brm(formula=formula.m1, family=Beta (link = "logit", link_phi = "log"), 
     warmup =    1000, data=cover.s, prior=prior.m.brms,chains=3, iter=3000, future=T,
     control =    list(adapt_delta =0.9999, max_treedepth=15))

and

m2=brm(formula=formula.m2, family=Beta(link = "logit", link_phi = "log"), 
            warmup = 1000,data=cover.s, prior=prior.m.brms,chains=3, iter=3000, future=T,
             control = list(adapt_delta =0.9999, max_treedepth=15))

Here is the output for m1:

> summary(m1)
 Family: beta 
  Links: mu = logit; **phi = identity** 
Formula: Totalcover ~ Gr.pressure * Prod + patch.size * Prod + inter.patch.dist 
   Data: cover.s (Number of observations: 53) 
Samples: 3 chains, each with iter = 3000; warmup = 1000; thin = 1;
         total post-warmup samples = 6000

Population-Level Effects: 
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                  0.10      0.02     0.06     0.15 1.00     2130     3326
Gr.pressure                0.06      0.04    -0.01     0.13 1.00     1110     2141
ProdLow                    0.09      0.05    -0.01     0.20 1.00     3191     3858
ProdMedium                 0.03      0.03    -0.04     0.09 1.00     2287     2320
patch.size                 0.35      0.03     0.29     0.40 1.00     1579     2849
inter.patch.dist          -0.21      0.02    -0.24    -0.17 1.00     1679     2994
Gr.pressure:ProdLow       -0.06      0.04    -0.14     0.01 1.00     1284     2671
Gr.pressure:ProdMedium    -0.06      0.04    -0.13     0.00 1.00     1324     2621
ProdLow:patch.size         0.13      0.06     0.02     0.24 1.00     1825     2837
ProdMedium:patch.size     -0.06      0.04    -0.14     0.01 1.00     1801     2922

Family Specific Parameters: 
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi   930.73    201.06   581.98  1365.72 1.00     3835     3254 

The posterior mean for phi seems very wrong and the model used the identity link function rather than the log link function for phi that I had specified (or at least I thought so).
I also tried:

brm(formula=Totalcover~Gr.pressure*Prod + patch.size*Prod+inter.patch.dist, family=Beta(link = "logit", link_phi = "log"), warmup = 1000,data=cover.s, prior=prior.m1.1brms,chains=3, iter=3000, future=T,
             control = list(adapt_delta =0.9999, max_treedepth=15))

with identical results.
The output for model m2 appears reasonable and the correct link functions were used.

Before jumping to any model selection, I wanted to make sure that the two models are correctly fitted,

My guess is that I am making a syntax error in the model specification, but I cannot find it.

Thanks in advance for any help or suggestion that you might provide.
Cheers
Pablo

I use

  • Operating System: Linux Ubunty 18.04
  • brms Version: 2.15.0
  • R version 3.6.0
1 Like

Hi, sorry for not getting to you earlier, your question is relevant and well written.

Unfortunately, I am unable to reproduce the problem you have: are you sure that the code you shared matches the output? In particular, there is an extra right parentheses in your code for formula.m1.

The point is that if you don’t provide a formula for phi, it gets estimated with “identity” link, but if you provide a formula, it should have a log link, as an example:

dat <- data.frame(y = rbeta(100, 5, 3))
brm(bf(y ~ 1, phi ~ 1), data = dat, family = Beta) 

gives:

 Family: beta 
  Links: mu = logit; phi = log 
Formula: y ~ 1 
         phi ~ 1
   Data: dat (Number of observations: 100) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         0.45      0.06     0.33     0.58 1.00     2904     2428
phi_Intercept     2.24      0.13     1.97     2.49 1.00     3312     2570

with lf it is the same:

brm(bf(y ~ 1) + lf(phi ~ 1), data = dat, family = Beta) 

gives

 Family: beta 
  Links: mu = logit; phi = log 
Formula: y ~ 1 
         phi ~ 1
   Data: dat (Number of observations: 100) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         0.46      0.06     0.33     0.58 1.00     3574     2490
phi_Intercept     2.24      0.14     1.95     2.49 1.00     3413     2310

but not adding formula for phi at all

brm(bf(y ~ 1), data = dat, family = Beta)

gives:

 Family: beta 
  Links: mu = logit; phi = identity 
Formula: y ~ 1 
   Data: dat (Number of observations: 100) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.45      0.06     0.33     0.58 1.00     3622     2767

Family Specific Parameters: 
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi     9.51      1.26     7.25    12.17 1.00     3447     2859

Does that resolve your issue?

Dear Martin,
Thanks for you concern and for looking at my problem.
I had finally managed to solve it through a series of trials and errors that led me to delete the offending parenthesis to obtain the same output as you. In most R packages, wrong parenthesis quickly lead to errors that are obvious, but my mistake did not which puzzled me for a while.
Cheers
Pablo

1 Like