Nu and multivariate Student’s t


#1

Please also provide the following information in addition to your question:

  • Operating System: Mac OS High Sierra
  • brms Version: 2.3.1

I recently read Adrian Baez-Ortega’s great blog on “Bayesian robust correlation with Stan in R” (https://baezortega.github.io/2018/05/28/robust-correlation/), in which he used the Student’s t distribution to handle outliers in correlations. If you use the default family = gaussian, specify multiple response variables (e.g., with the cbind() syntax), and fit an intercept-only model, you can get Bayesian correlations within brms. E.g., this works great:

N <- 1000

set.seed(1)
d <-
  tibble(x = rnorm(N, 0, 1),
         y = rnorm(N, mean = 0 + 1*x, 1))

fit1 <-
  brm(data = d, family = gaussian,
      cbind(x, y) ~ 1)

print(fit1)

However, the method fails when switching to family = student. When I try the following:

fit2 <-
  brm(data = d, family = student,
      cbind(x, y) ~ 1, 
      cores = 4)

I get this error message:

“Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
Exception: modeldcb6526990f7_filedcb64a253b90_namespace::modeldcb6526990f7_filedcb64a253b90: nu_x is 0, but must be greater than or equal to 1 (in ‘modeldcb6526990f7_filedcb64a253b90’ at line 26)”

In the “lb” subsection in the “set_prior” section of the brms reference manual I’m told “Lower bound for parameter restriction. Currently only allowed for classes ‘b’, ‘ar’, ‘ma’, and ‘arr’." And indeed when I attempt lb = 1 for nu:

fit3 <-
  brm(data = d, family = student,
      cbind(x, y) ~ 1,
      prior = c(set_prior("gamma(2, .1)", class = "nu", lb = 1)),
      cores = 4)

I get the error message: “Error: Currently boundaries are only allowed for population-level and autocorrelation parameters.”

And yet it works great if I simply go univariate.

fit4 <-
  brm(data = d, family = student,
      y ~ 1 + x,
      cores = 4)

print(fit4)

Am I missing something?


#2

Fixing a parameter to 0 while requiring it to be >= 1 was probably one of the more stupid things I have done so far ;-)

It should now be fixed in the dev version from github.


#3

Success! fit2 from above

fit2 <-
  brm(data = d, family = student,
      cbind(x, y) ~ 1, 
      cores = 4)

print(fit2)

yields:

 Family: MV(student, student) 
  Links: mu = identity; sigma = identity; nu = identity
         mu = identity; sigma = identity; nu = identity 
Formula: x ~ 1 
         y ~ 1 
   Data: d (Number of observations: 1000) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
            Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
x_Intercept    -0.01      0.03    -0.08     0.05       3370 1.00
y_Intercept    -0.03      0.05    -0.12     0.06       3385 1.00

Family Specific Parameters: 
            Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma_x         1.01      0.02     0.96     1.06       3034 1.00
sigma_y         1.43      0.04     1.36     1.50       2829 1.00
nu             39.59     13.96    19.80    73.23       3669 1.00
rescor(x,y)     0.70      0.02     0.67     0.74       3206 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

This is just great. Thanks Paul.