Non-normal random effects in GLMM

ozgurasar · January 22, 2019, 8:29am

Hi,

In a logistic model say, logit[P(Y = 1|…)] = x \beta + B_0, can one specifiy the distribution of B_0|… as t(0, scale, dof), in brms?

Thanks in advance,

Özgür Asar

paul.buerkner · January 22, 2019, 8:33am

See https://github.com/paul-buerkner/brms/issues/231 for how to model t-distributed random effects.

ozgurasar · January 23, 2019, 7:08am

For t, what parametrisation do you use? Is it \sigma Z/\sqrt{V/\nu}, where Z is standard Normal, and
V is chi-square with d.o.f. of \nu??

ozgurasar · January 24, 2019, 3:58pm

Hi Paul,
when I generate Stan code using {\tt brms} for binary mixed model with t-distributed random intercept:

\begin{equation} \mbox{logit}(P(Y_{ij}= 1|...)) = {\boldsymbol x}_{ij} {\boldsymbol \beta} + U_i; \ i = 1, \ldots, n, \ j = 1, \ldots, m, \end{equation}

where U_i \sim t(\phi, 0, \sigma), below is what I got. It seems you parameterise U_i as U_i = \sigma Z_i \sqrt{V * \phi}, where Z_i \sim N(0, 1), and V is scalar and V \sim {\mbox Inverse} \ \chi^2(\phi).
My question is: shouldn’t V be V_i, as use of V introduce dependency between U_i and U_i^{\prime} terms for i \neq i^{\prime}?

Thanks,

Özgür

// generated with brms 2.7.0
functions { 
} 
data { 
  int<lower=1> N;  // total number of observations 
  int Y[N];  // response variable 
  int<lower=1> K;  // number of population-level effects 
  matrix[N, K] X;  // population-level design matrix 
  // data for group-level effects of ID 1
  int<lower=1> J_1[N];
  int<lower=1> N_1;
  int<lower=1> M_1;
  vector[N] Z_1_1;
  int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
  int Kc = K - 1; 
  matrix[N, K - 1] Xc;  // centered version of X 
  vector[K - 1] means_X;  // column means of X before centering 
  for (i in 2:K) { 
    means_X[i - 1] = mean(X[, i]); 
    Xc[, i - 1] = X[, i] - means_X[i - 1]; 
  } 
} 
parameters { 
  vector[Kc] b;  // population-level effects 
  real temp_Intercept;  // temporary intercept 
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  vector[N_1] z_1[M_1];  // unscaled group-level effects
  // parameters for student-t distributed group-level effects
  real<lower=1> df_1;
  real<lower=0> udf_1;
} 
transformed parameters { 
  // group-level effects 
  vector[N_1] r_1_1 = sqrt(df_1 * udf_1) * sd_1[1] * (z_1[1]);
} 
model { 
  vector[N] mu = temp_Intercept + Xc * b;
  for (n in 1:N) { 
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
  } 
  // priors including all constants 
  target += student_t_lpdf(temp_Intercept | 3, 0, 10); 
  target += student_t_lpdf(sd_1 | 3, 0, 10)
  - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(z_1[1] | 0, 1);
  target += gamma_lpdf(df_1 | 2, 0.1); 
  target += inv_chi_square_lpdf(udf_1 | df_1);
  // likelihood including all constants 
  if (!prior_only) { 
    target += bernoulli_logit_lpmf(Y | mu);
  } 
} 
generated quantities { 
  // actual population-level intercept 
  real b_Intercept = temp_Intercept - dot_product(means_X, b); 
}

paul.buerkner · January 24, 2019, 4:05pm

Here is the definition of (multivariate) student-t that I am using:

ozgurasar · January 24, 2019, 4:11pm

The definition is correct. But the thing with mixed models is that random effects belonging to different subjects are assumed to be independent. Scalar V violates this.

Based on my simulations, scalar V introduces biases to \phi and \sigma.

Özgür

paul.buerkner · January 24, 2019, 4:15pm

Ah I get your point now, thanks! @bgoodri can you confirm that in the above Stan code, udf_1 should be a vector instead of a scalar?

bgoodri · January 24, 2019, 6:28pm

It can be a vector. Whether it can be called multivariate t is a long-standing debate.

paul.buerkner · January 24, 2019, 7:46pm

Thanks! In the GitHub version of brms, udf_* should now be a vector. @ozgurasar would you mind taking a look and confirm that the parameterization now looks correct to you?

ozgurasar · January 25, 2019, 8:19am

Yes, it now expresses the model that is intended to be fit.
One observation is that fitting a mixed model for categorical data with t random effects is quite difficult, probably due to identifying heavy tailedness through categorical data.

An example is the following (sim_data.txt (58.2 KB) is a simulated data):

devtools::install_github("paul-buerkner/brms")
brms_fit <- brm(y ~ x1 + x2 + x3 + (1|gr(id, dist = "student")), 
                data = sim_data,
                cores = 4,
                control = list(adapt_delta = 0.999),
                seed = 123,
                family = "bernoulli")
summary(brms_fit)

paul.buerkner · January 25, 2019, 8:42am

Thanks for checking!

I have made similar observations with categorical responses or other responses not containing a lot of information. I still want to do extensive simulations with t-distributed random effects (and perhaps write a paper about it) at some point, before I “officially” support them.

fcsalgado · April 3, 2023, 12:21am

Thanks for highlighting, I am experiencing this problem right now. Do you have any idea how to solve the heavy tailedness? is it a bad practice if I change the scale of the student_t prior from 2.5 to 0.5? Thanks a lot for your help

Topic		Replies	Views
Altered fixed effect parameters when using different random effect distributions on same data brms	1	356	May 16, 2021
Gaussian vs. t distribution for RE distributions in brms model - discrepancy in fixed effects Modeling	3	538	November 5, 2020
Brms estimating group variances of normal responses brms	1	356	March 21, 2020
Multi-modal distribution for the random effect (infinite mixture model with a Dirichlet process prior)- Multilevel model brms rstan	1	679	March 29, 2020
Understand how `brms` parametrizes random effects brms	7	3130	February 10, 2019

Non-normal random effects in GLMM

Related topics