Translating robust logistic regression from rstan to brms

rstan

#1

I am interested in fitting a “robust” logistic regression model from Doing Bayesian Data Analysis (2nd Edition). This model treats data as a mixture of two sources: (1) guessing and (2) a logistic function of predictors. For guessing, the y values are thought to come from y ~ Bernoulli(m=1/2).

Using some data I found in this vignette, I think I’ve been able to implement the model in a simple case using rstan:

library("rstan")

# Prepare data 
url <- "http://stat.columbia.edu/~gelman/arm/examples/arsenic/wells.dat"
wells <- read.table(url)
X <- model.matrix(~ arsenic, wells)
standata <- list(y = wells$switch, arsenic = wells$arsenic, N = nrow(wells))

# Model
guess.model <- '
data {                          
  int<lower=0> N;                
  int<lower=0,upper=1> y[N];     
  vector[N] arsenic;            
}
parameters {
  real alpha;                   
  real b_arsenic;                
  real<lower=0,upper=1> guess;
}
transformed parameters { 
  vector[N] mu;
  for ( i in 1:N) {
    mu[i] = .5*guess + (1-guess)*inv_logit(alpha + b_arsenic * arsenic[i]);
  }
}
model {
  y ~ bernoulli(mu);
  alpha ~ normal(0,1);         
  b_arsenic ~ normal(0,1);     
  guess ~ beta(1,9);
}
'

# Fit model
fit <- stan(model_code = guess.model, data = standata)
print(fit, pars = c("alpha","b_arsenic","guess"))

However, I’m unsure how to implement the guessing parameter in brms. Could anyone point me in the right direction?

Operating System: Windows 10
Interface Version: 3.2.1


#2

You can learn more about that in vignette("brms_nonlinear")


#3

Many thanks to jrrae for posing this question. I’ve been working through Kruschke’s book and was initially baffled by this model, too (chapter 21.3). In case any others are curious, I was able to follow Paul’s lead and fit the model like so.

fit <-
  brm(data = my_data, 
      family = binomial(link = "identity"),
      bf(male | trials(1) ~ a * .5 + (1 - a) * 1 / (1 + exp(-1 * (b0 + b1 * weight_z))),
         a + b0 + b1  ~ 1,
         nl = TRUE),
      prior = c(prior(normal(0, 2), nlpar = "b0"),
                prior(normal(0, 2), nlpar = "b1"),
                prior(beta(1, 9),   nlpar = "a")))

Much like Paul cautioned in his vignette("brms_nonlinear"), the model yielded a number of divergent transitions which might well be alleviated with better priors. But the results match Kruschke’s pretty well.

Note. My model is based on the data and parameters from Kruschke’s text, not those in the vignette jrrae cited, above.


#4

If you want to fit a robust logistic regression, I recommend that instead of this unnatural (to me) mixture model, you use the robit model, which is like logistic regression but using the Student-t rather than the logistic distribution.


#5

Thank you for the suggestion, @andrewgelman! It appears one can find the details of that model from Gelman and Hill (2007), section 6.6. There we read:

Logistic regression can be conveniently “robustified” by generalizing the latent-data formulation (5.4):

y_i = \bigg \{ \begin{matrix} 1 \text{ if } z_i > 0\\ 0 \text{ if } z_i < 0 \end{matrix}
z_i = X_i \beta + \epsilon_i,

to give the latent errors \epsilon a t distribution:

\epsilon_i \sim t_\nu \bigg ( \frac{\nu - 2}{\nu} \bigg ),

with the degrees-of-freedom parameter \nu > 2 estimated from the data and the t distribution scaled so that its standard deviation equals 1.

That model currently outstrips my brms chops. Bonus Stan Forums points for anyone who can show the relevant brms code.


#6

We can surely write that model in brms. However, I am unsure about why we compute
(\nu - 2) / \nu? If we just wanted the errors to be (standard) t-distributed, wouldn’t we just write

\epsilon \sim t(\nu, 0, 1)

#7

Ah I see, it is to fix the SD to 1. That makes sense: (\nu - 2) / \nu is the adjusted scale parameter as var(\epsilon) = \nu / (\nu - 2). That is, we would write

\epsilon \sim t(\nu, 0, (\nu - 2) / \nu)

@andrewgelman Can you confirm this real quick?

Suppose, for now, that the above is correct. We could translate this to brms as follows (not tested, so please check carefully):

library(brms)

stan_inv_robit <- "
  real inv_robit(real y, real nu) {
    student_cdf(y, nu, 0, (nu - 2) / nu);
  }
"
stanvar_inv_robit <- stanvar(scode = stan_inv_robit, block = "functions")

bform <- bf(
  male | trials(1) ~ inv_robit(eta, nu),
  eta ~ <predictors>,
  nu ~ 1,
  nl = TRUE,
  family = bernoulli("identity")
)

bprior <- prior(normal(0, 5), nlpar = "eta") +
  prior(gamma(2, 0.1), nlpar = "nu", lb = 2)

fit <- brm(
  bform, prior = bprior,
  stanvars = stanvar_inv_robit,
  ...
)

#8

Hi, yes, I guess this would have a problem if nu is less than 2, so you’d want to constrain nu to be at least 2, or even to constrain it to be at least 3. I don’t know how to ping Aki on this so I’ll just cc him, also Jonah and Ben in case this ever came up in rstanarm.
See you

Andrew


#9

Thanks Andrew! Through prior(gamma(2, 0.1), nlpar = "nu", lb = 2) brms ensures that nu has a lower bound (lb) of 2. You can ping people via @“name” for instance @avehtari for Aki.