How to use cauchy distribution in varying intercepts model

I am reading McElreath’s *Statistical Rethinking*, but I am following the code in Solomon Kurz’s e-book https://bookdown.org/content/4857/ as I want to become more proficient with brms.

Kurz has the following code for the multilevel tadpoles example (code source):

b13.2 <- 
  brm(data = d, 
      family = binomial,
      surv | trials(density) ~ 1 + (1 | tank),
      prior = c(prior(normal(0, 1.5), class = Intercept),  # alpha bar
                prior(exponential(1), class = sd)),        # sigma
      iter = 5000, warmup = 1000, chains = 4, cores = 4,
      sample_prior = "yes",
      seed = 13,
      file = "fits/b13.02")

However, this assumes that the population of varying intercepts is normally distributed, with priors of N(0, 1.5^2) on its mean and Exp(1) on its standard deviation, respectively.

In the exercises, McElreath asks us to investigate the consequences when the population of varying intercepts instead has a Cauchy distribution or a Student’s t-distribution. How do I implement this in brms? I can’t figure out which additional argument I have to specify. I assume that the default distribution of the population of varying intercepts is normal.

(Here’s a snippet of the exercise from the book, at least from the first edition:


I’m not asking about the HalfCauchy prior on \sigma but rather, the Cauchy prior on \alpha_{TANK})

Thanks!


g’day @radicalprotenns ,
There’s a lot of detail in the help page for brms::set_prior, a really nice resource. (link)
Side note: I don’t think it’s perfect though; e.g. a link to the Stan home page is rather vague. A confusion I’ve had in the past is knowing which functions are available to you when specifying priors. I might see that a default prior uses student_t, but then be confused that no R function called student_t exists. The trick is that brms can use Stan functions in prior calls, much like in your example. (And do read the Stan docs to know what you’re working with, e.g. does the Gamma distribution use the shape/scale or the shape/rate formulation?) To see those functions, go here (e.g. look under continuous distributions for the Cauchy).

brms also makes a lot of these distributions available to play with within R; just look for functions whose names start with ‘d’, like dstudent_t.
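For example (a quick sketch, assuming brms is installed; dnorm and dcauchy are base R, dstudent_t comes from brms):

```r
library(brms)  # provides dstudent_t; dnorm and dcauchy are in base R's stats

# density far out in the tail, at x = 10
dnorm(10, mean = 0, sd = 1.5)                 # essentially zero
dstudent_t(10, df = 3, mu = 0, sigma = 2.5)   # small but non-negligible
dcauchy(10, location = 0, scale = 1.5)        # the heaviest tail of the three
```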

I suggest getting in the habit of first creating the formula with bf and then inspecting default priors with this:

# recommend creating the brms formula first with bf
bf_13.2 <- bf(
  surv | trials(density) ~ 1 + (1 | tank),
  family = binomial
)
# then you can easily see the default priors (which can change depending on the data)
get_prior(bf_13.2, data = d)
# so the Intercept's default of student_t(3, 0, 2.5) will be overridden

So you see that the default distribution of the pop-level intercept isn’t normal, but a Student t.

Now you can plot your candidates for priors:

# see how the priors differ before fitting
# (dstudent_t comes from brms; tibble and ggplot2 are from the tidyverse)
tibble(x = -20:20) %>% # use a wide range to really see the heavy tails
  ggplot(aes(x)) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1.5)) +
  stat_function(fun = dstudent_t, args = list(df = 3, mu = 0, sigma = 2.5),
                color = "orange") +
  stat_function(fun = dcauchy, args = list(location = 0, scale = 1.5),
                color = "blue")
# the Cauchy and the default Student-t both have much heavier tails than
# the normal, but the Cauchy's carry on much further
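To put rough numbers on those tails (pstudent_t is from brms; pnorm and pcauchy are base R):

```r
library(brms)  # for pstudent_t

# prior mass beyond |x| > 10 under each candidate
2 * pnorm(-10, mean = 0, sd = 1.5)                # normal(0, 1.5): ~0
2 * pstudent_t(-10, df = 3, mu = 0, sigma = 2.5)  # default student_t: ~3%
2 * pcauchy(-10, location = 0, scale = 1.5)       # cauchy(0, 1.5): ~9.5%
```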

And here’s the relevant change in the brm code:

prior = c(
        prior(cauchy(0, 1.5), class = Intercept),  # alpha bar
        prior(exponential(1), class = sd)  # sigma
      ), 

(I saw pretty much identical parameter estimates, but the normal prior seemed to have greater sampling efficiency for the pop-level intercept.)

I’d suggest a workflow that keeps model formulae and priors (as R objects) separate; specifying priors inside brm can work fine, but it gets difficult to keep track of when comparing several fits that differ only in their priors. You also save a lot of space by just passing in the same bf object each time.

hope that helps

Hi @zacho , thanks for your reply. I am not sure whether

prior = c(
        prior(cauchy(0, 1.5), class = Intercept),  # alpha bar
        prior(exponential(1), class = sd)  # sigma
      )

changes the distribution of the population of varying intercepts to Cauchy. If I understand correctly, the above results in the following model:
\alpha_j \sim Normal(\bar{\alpha}, \sigma), \bar{\alpha} \sim Cauchy(0, 1.5), \sigma \sim Exponential(1).
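To illustrate the difference I mean, here’s a base-R sketch (the object names are just illustrative): a Cauchy prior on \bar{\alpha} still leaves the \alpha_j normal around it, whereas the exercise wants the \alpha_j themselves drawn from a Cauchy:

```r
set.seed(13)
n_tanks <- 1e4

# model implied by the prior above: alpha_j still normal around a
# Cauchy-distributed abar
abar  <- rcauchy(1, location = 0, scale = 1.5)
sigma <- rexp(1, rate = 1)
alpha_normal <- rnorm(n_tanks, mean = abar, sd = sigma)

# what the exercise asks for: the varying intercepts themselves are Cauchy
alpha_cauchy <- rcauchy(n_tanks, location = abar, scale = sigma)

# the Cauchy version produces far more extreme intercepts
quantile(abs(alpha_normal - abar), 0.999)
quantile(abs(alpha_cauchy - abar), 0.999)
```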

I’m saying this because in Solomon’s example (link here),
\alpha_j \sim Normal(\bar{\alpha}, \sigma), \bar{\alpha} \sim Normal(0, 1.5), \sigma \sim Exponential(1)
corresponds to this code:

b13.2 <- 
  brm(data = d, 
      family = binomial,
      surv | trials(density) ~ 1 + (1 | tank),
      prior = c(prior(normal(0, 1.5), class = Intercept),  # alpha bar
                prior(exponential(1), class = sd)),        # sigma
      iter = 5000, warmup = 1000, chains = 4, cores = 4,
      sample_prior = "yes",
      seed = 13,
      file = "fits/b13.02")

Am I missing something here?

Hey @radicalprotenns ,
Ahhhh, no you’re right. I didn’t catch that the first time around. Apologies.
I have the 2nd edition of the book and it’s in problem 13M3.
Does your edition have the ‘Rethinking’ text box about ‘Why Gaussian tanks?’ That at least covers why the convention is Gaussian.
I feel like I’ve seen someone hack this in brms before but I’m not finding it yet. (My 1st thought in the Stan code would be to change the line at the end of the model block: target += std_normal_lpdf(z_1[1]);. But I’m not certain if it’s really that simple to just switch out how standardised group-level effects are generated.)
Maybe @Solomon would know?

Actually, think I found it. Solomon has some useful links for his bookdown of the 1st edition text.

The linked GitHub discussion is quite informative. Paul mentions that using the dist argument inside gr() is the way to go… so long as it’s implemented. In brms version 2.21.0, the help page says that only the Gaussian is supported. But! This code will actually run for a Student-t version:

b13.2_test <- 
  brm(data = d, 
      family = binomial,
      surv | trials(density) ~ 1 + (1 | gr(tank, dist = 'student')), # note!
      prior = c(prior(normal(0, 1.5), class = Intercept),  # alpha bar
                prior(exponential(1), class = sd)),        # sigma
      iter = 5000, warmup = 1000, chains = 4, cores = 4,
      backend = 'cmdstanr', # edit: to speed up
      sample_prior = "yes",
      seed = 13)

Indeed, trying anything else gives me: Error in match.arg(dist, c("gaussian", "student")) : 'arg' should be one of "gaussian", "student"
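If you want to see what dist = 'student' actually changes, make_stancode() (a standard brms function) shows the generated Stan code without fitting (exact parameter names may vary by brms version):

```r
library(brms)

# inspect the Stan code brms would generate for the student-t
# group-level version; this needs the tadpole data frame d loaded
make_stancode(
  surv | trials(density) ~ 1 + (1 | gr(tank, dist = "student")),
  data = d, family = binomial,
  prior = c(prior(normal(0, 1.5), class = Intercept),
            prior(exponential(1), class = sd))
)
# look for an extra degrees-of-freedom parameter and a student_t
# distribution on the group-level effects in the output
```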


I haven’t tried fitting a model like that in a long time, but I believe @zacho has it right. Well done, team!


Also, @radicalprotenns, it looks like this is your first time posting here. Welcome to the Stan forums!


Thank you @zacho! So I guess Cauchy is not supported for now. No worries, I’ll use the rethinking package for this problem then!

Thank you for the welcome, @Solomon! I appreciate your having translated McElreath’s code to tidyverse + brms. I was new to both when I first started the book and now, 13 chapters later, I’m still going and learning. Going to finish the entire book (2nd edition) for sure.
