Hi, I’ve spent the past week going through forum posts, vignettes, Google, etc., and still cannot find an answer to my problem. I’m hoping someone can help clarify.
I am building a mixed-effects multilevel model (MLM); the linear component of the model is already working. My data consist of one response variable (Resp_var, continuous), a subject ID (ID_var, factor), a grouping variable (Group_var, 4-level factor), a time variable (Time_var, continuous), and one more covariate (CoVar, continuous; it is a per-subject baseline value and does not depend on time). My linear MLM syntax is:
Mod <- brm(Resp_var ~ (Time_var | Group_var) +
             CoVar * Group_var + (Group_var | ID_var),
           data = dat)
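For concreteness, fake data with this structure would look roughly like the following (sizes and values are made up, just so the calls in this post have something to run against):

set.seed(1)
n_id <- 20; n_time <- 5
dat <- data.frame(
  ID_var    = factor(rep(seq_len(n_id), each = n_time)),
  Group_var = factor(rep(sample(1:4, n_id, replace = TRUE), each = n_time)),  # 4-level factor, constant within subject
  Time_var  = rep(0:(n_time - 1), times = n_id),
  CoVar     = rep(rnorm(n_id), each = n_time),   # one baseline value per subject
  Resp_var  = rnorm(n_id * n_time)
)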
My two grouping terms represent: (Time_var | Group_var), the Group-level random effects for the time parameter, and (Group_var | ID_var), the subject-level random effects of Group.
Based on prior research and knowledge of the subject we’re studying, we know we need to add a nonlinear piece to accurately capture Resp_var. As in other research on this topic, the nonlinear component is modeled as a Weibull decay curve of the form (t/λ)^k, where t = Time_var, λ = scale parameter, and k = shape parameter. I don’t know the value of λ (although, because of the decay, it should be < 0) or of k (except that 0 < k < 1). So the NL component I wish to add on is (newtime / lambda) ^ kappa.
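In brms non-linear syntax, the overall shape I’m after is roughly the sketch below, where b stands in for the linear part of the model and lambda and kappa are the Weibull parameters (the per-parameter formulas are just placeholders; attaching the real grouping terms is exactly what I can’t get to work):

bf(Resp_var ~ b - (Time_var / lambda) ^ kappa,
   b ~ 1,        # placeholder for the linear predictor described above
   lambda ~ 1,   # Weibull scale
   kappa ~ 1,    # Weibull shape
   nl = TRUE)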
I’ve searched the forums and in particular these GitHub links: 46, 47, where @paul.buerkner explains how to incorporate a NL component into a linear model. I followed the examples, but none of the syntaxes I’ve tried have worked. Because my grouping variable is a 4-level factor, I know I can only put it into the nonlinear part of the formula, but I keep getting errors.
Here is one syntax I tried, going off of the information in those two links (where b1 = lambda and b2 = kappa).
I might be missing something, but this is different syntax than what you have in the model call. What happens if you use exactly the same call with brm, i.e.:
Hi Martin, thanks for the reply. I know it can be very hard to debug these things without data. I found your suggestion really interesting because it immediately threw an error for me:
Error: The following priors do not correspond to any model parameter:
b_b_baseline ~ (flat)
b_b_Gr2 ~ (flat)
b_b_Gr2:baseline ~ (flat)
b_b_Gr3 ~ (flat)
b_b_Gr3:baseline ~ (flat)
b_b_Gr4 ~ (flat)
b_b_Gr4:baseline ~ (flat)
I was actually able to get my model to run using a different syntax:
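It was roughly of this form (treat the details as approximate, in particular the grouping term on b1; the make_stancode output further down is the authoritative version):

nlform <- bf(
  Resp_var ~ b - (Time_var / b1) ^ b2,
  b  ~ CoVar * Group_var + (Time_var | Group_var) + (Group_var | ID_var),
  b1 ~ 1 + (1 | Group_var),   # a single group-level intercept on b1; the grouping factor here is a guess
  b2 ~ 1,
  nl = TRUE
)

Mod_nl <- brm(
  nlform, data = dat,
  prior = c(prior(normal(0, 20), nlpar = "b"),
            prior(cauchy(0, 10), nlpar = "b1", ub = 0),           # lambda forced to be negative, as described above
            prior(uniform(0, 1), nlpar = "b2", lb = 0, ub = 1))   # kappa constrained to (0, 1)
)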
Now this worked, but it brought up several errors while compiling, all of which are linked to those parameters, such as:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[7] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
I read through the forum posts on possible solutions to this error, such as setting inits = 0 or init_r = 0.1, and trying different priors. What was odd was that when I tried better priors (the ones I used here are pretty lousy), the model wouldn’t run at all.
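For reference, this is roughly how I passed those options (as far as I understand the arguments, inits is a brm() argument in this version of brms and init_r is forwarded on to rstan; Mod_nl is the fit from the call above):

# refit with all parameters initialized at zero
Mod_nl_zero <- update(Mod_nl, inits = "0")
# refit with random initial values drawn from a narrower range than the default
Mod_nl_narrow <- update(Mod_nl, inits = "random", init_r = 0.1)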
Unfortunately, I don’t know Stan (I’m trying to learn enough to debug this), but I did turn my model into Stan code using the make_stancode function, and the result was:
// generated with brms 2.14.4
functions {
/* turn a vector into a matrix of defined dimension
* stores elements in row major order
* Args:
* X: a vector
* N: first dimension of the desired matrix
* K: second dimension of the desired matrix
* Returns:
* a matrix of dimension N x K
*/
matrix as_matrix(vector X, int N, int K) {
matrix[N, K] Y;
for (i in 1:N) {
Y[i] = to_row_vector(X[((i - 1) * K + 1):(i * K)]);
}
return Y;
}
/* compute correlated group-level effects
* Args:
* z: matrix of unscaled group-level effects
* SD: vector of standard deviation parameters
* L: cholesky factor correlation matrix
* Returns:
* matrix of scaled group-level effects
*/
matrix scale_r_cor(matrix z, vector SD, matrix L) {
// r is stored in another dimension order than z
return transpose(diag_pre_multiply(SD, L) * z);
}
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K_b; // number of population-level effects
matrix[N, K_b] X_b; // population-level design matrix
int<lower=1> K_b1; // number of population-level effects
matrix[N, K_b1] X_b1; // population-level design matrix
int<lower=1> K_b2; // number of population-level effects
matrix[N, K_b2] X_b2; // population-level design matrix
// covariate vectors for non-linear functions
vector[N] C_1;
// data for group-level effects of ID 1
int<lower=1> N_1; // number of grouping levels
int<lower=1> M_1; // number of coefficients per level
int<lower=1> J_1[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_b_1;
vector[N] Z_1_b_2;
int<lower=1> NC_1; // number of group-level correlations
// data for group-level effects of ID 2
int<lower=1> N_2; // number of grouping levels
int<lower=1> M_2; // number of coefficients per level
int<lower=1> J_2[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_2_b_1;
vector[N] Z_2_b_2;
vector[N] Z_2_b_3;
vector[N] Z_2_b_4;
int<lower=1> NC_2; // number of group-level correlations
// data for group-level effects of ID 3
int<lower=1> N_3; // number of grouping levels
int<lower=1> M_3; // number of coefficients per level
int<lower=1> J_3[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_3_b1_1;
int prior_only; // should the likelihood be ignored?
}
transformed data {
}
parameters {
vector[K_b] b_b; // population-level effects
vector<upper=0>[K_b1] b_b1; // population-level effects
vector<lower=0,upper=1>[K_b2] b_b2; // population-level effects
real<lower=0> sigma; // residual SD
vector<lower=0>[M_1] sd_1; // group-level standard deviations
matrix[M_1, N_1] z_1; // standardized group-level effects
cholesky_factor_corr[M_1] L_1; // cholesky factor of correlation matrix
vector<lower=0>[M_2] sd_2; // group-level standard deviations
matrix[M_2, N_2] z_2; // standardized group-level effects
cholesky_factor_corr[M_2] L_2; // cholesky factor of correlation matrix
vector<lower=0>[M_3] sd_3; // group-level standard deviations
vector[N_3] z_3[M_3]; // standardized group-level effects
}
transformed parameters {
matrix[N_1, M_1] r_1; // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_b_1;
vector[N_1] r_1_b_2;
matrix[N_2, M_2] r_2; // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_b_1;
vector[N_2] r_2_b_2;
vector[N_2] r_2_b_3;
vector[N_2] r_2_b_4;
vector[N_3] r_3_b1_1; // actual group-level effects
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_b_1 = r_1[, 1];
r_1_b_2 = r_1[, 2];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_b_1 = r_2[, 1];
r_2_b_2 = r_2[, 2];
r_2_b_3 = r_2[, 3];
r_2_b_4 = r_2[, 4];
r_3_b1_1 = (sd_3[1] * (z_3[1]));
}
model {
// likelihood including all constants
if (!prior_only) {
// initialize linear predictor term
vector[N] nlp_b = X_b * b_b;
// initialize linear predictor term
vector[N] nlp_b1 = X_b1 * b_b1;
// initialize linear predictor term
vector[N] nlp_b2 = X_b2 * b_b2;
// initialize non-linear predictor term
vector[N] mu;
for (n in 1:N) {
// add more terms to the linear predictor
nlp_b[n] += r_1_b_1[J_1[n]] * Z_1_b_1[n] + r_1_b_2[J_1[n]] * Z_1_b_2[n] + r_2_b_1[J_2[n]] * Z_2_b_1[n] + r_2_b_2[J_2[n]] * Z_2_b_2[n] + r_2_b_3[J_2[n]] * Z_2_b_3[n] + r_2_b_4[J_2[n]] * Z_2_b_4[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
nlp_b1[n] += r_3_b1_1[J_3[n]] * Z_3_b1_1[n];
}
for (n in 1:N) {
// compute non-linear predictor values
mu[n] = nlp_b[n] - ((C_1[n] / nlp_b1[n]) ^ nlp_b2[n]);
}
target += normal_lpdf(Y | mu, sigma);
}
// priors including all constants
target += normal_lpdf(b_b | 0,20);
target += cauchy_lpdf(b_b1 | 0,10)
- 1 * cauchy_lcdf(0 | 0,10);
target += uniform_lpdf(b_b2 | 0,1)
- 1 * log_diff_exp(uniform_lcdf(1 | 0,1), uniform_lcdf(0 | 0,1));
target += student_t_lpdf(sigma | 3, 0, 2.5)
- 1 * student_t_lccdf(0 | 3, 0, 2.5);
target += student_t_lpdf(sd_1 | 3, 0, 2.5)
- 2 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(to_vector(z_1));
target += lkj_corr_cholesky_lpdf(L_1 | 1);
target += student_t_lpdf(sd_2 | 3, 0, 2.5)
- 4 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(to_vector(z_2));
target += lkj_corr_cholesky_lpdf(L_2 | 1);
target += student_t_lpdf(sd_3 | 3, 0, 2.5)
- 1 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(z_3[1]);
}
generated quantities {
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
}
I apologize for the lack of line numbers; I finally figured out how to find them, and the offending line is:
target += normal_lpdf(Y | mu, sigma);
I am going to see what I can learn about Stan and how I can debug that, but any insights would be greatly appreciated. Thank you again!
Just for precision: those errors are not from compiling (which encompasses compiling the Stan code to C++ and the C++ code to an executable) but from running the model. If you get a few of them, it means your model is a bit difficult to get started, but if it ends up running and giving you good results with good diagnostics, you may ignore them. They are nevertheless useful for letting you know that something is a bit problematic, and it is good to pay attention to them if you plan to extend the model further, as small problems can get amplified as the model grows.
This is a bit non-specific. What do you mean? Do you get an error? Or does it start sampling but then get stuck and take forever before showing additional progress?
To me, the most likely issue is the exponentiation part of your model, i.e. the term (Time_var / b1) ^ b2.
This value is only well defined when either b2 is an integer (which you almost certainly don’t want) or (Time_var / b1) is positive. Since b1 is constrained to be negative (isn’t that a bug?), you only get valid values when Time_var is negative. And that’s setting aside the complications from the group term on b1.
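For a concrete illustration of the problem in R:

(-2) ^ 0.5        # NaN: negative base, fractional exponent
(5 / -2.5) ^ 0.7  # NaN as well, which is what happens whenever Time_var / b1 < 0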
Assuming Time_var is always positive, then a good way to ensure valid values (and still allow a grouping structure on b1) would be to avoid any bounds on b1 and instead work with exp(b1) inside the formula, so that the scale is positive by construction and b1 itself lives on the log scale.
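In brms code, something along these lines (an untested sketch just to show the shape; the grouping term on b1 and the priors are purely illustrative):

nlform2 <- bf(
  Resp_var ~ b - (Time_var / exp(b1)) ^ b2,
  b  ~ CoVar * Group_var + (Time_var | Group_var) + (Group_var | ID_var),
  b1 ~ 1 + (1 | Group_var),   # b1 is now log(lambda): no sign constraint needed, grouping allowed
  b2 ~ 1,
  nl = TRUE
)

Mod_exp <- brm(
  nlform2, data = dat,
  prior = c(prior(normal(0, 20), nlpar = "b"),
            prior(normal(0, 1),  nlpar = "b1"),                   # prior on log(lambda), purely illustrative
            prior(uniform(0, 1), nlpar = "b2", lb = 0, ub = 1))   # keeps kappa in (0, 1)
)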