Hi Martin, thanks for the reply. I know it can be very hard to debug these things without data. I found your suggested solution really interesting because it immediately threw an error for me:
Error: The following priors do not correspond to any model parameter:
b_b_baseline ~ (flat)
b_b_Gr2 ~ (flat)
b_b_Gr2:baseline ~ (flat)
b_b_Gr3 ~ (flat)
b_b_Gr3:baseline ~ (flat)
b_b_Gr4 ~ (flat)
b_b_Gr4:baseline ~ (flat)
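In case it helps narrow down the mismatch, I also ran get_prior(), which lists every class/coef/nlpar combination a model will actually accept priors for; here is a sketch of that call using the formula from my working attempt below (variable names are my placeholders):

library(brms)
# List the prior "slots" brms recognizes for this formula, so the
# exact nlpar and coef names it expects can be read off directly
get_prior(
  bf(Resp_var ~ b - (Time_var / b1)^(b2),
     b ~ 1 + (Time_var | Gr_var) + Gr_var * baseline + (Gr_var | ID_var),
     b1 ~ 1 + (1 | Gr_var),
     b2 ~ 1,
     nl = TRUE),
  data = dat, family = gaussian())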
I was actually able to get my model to run using a different syntax:
Fit1 <- brm(
  bf(Resp_var ~ b - (Time_var / b1)^(b2),
     b ~ 1 + (Time_var | Gr_var) + Gr_var * baseline + (Gr_var | ID_var),
     b1 ~ 1 + (1 | Gr_var),
     b2 ~ 1,
     nl = TRUE),
  data = dat, family = gaussian(),
  prior = c(set_prior("normal(0,20)", nlpar = "b"),
            set_prior("uniform(0,1)", nlpar = "b2", lb = 0, ub = 1),
            set_prior("cauchy(0,10)", nlpar = "b1", ub = 0)),
  control = list(adapt_delta = 0.9),
  file = "myfit")
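As a quick sanity check on which priors the fitted model actually ended up using, prior_summary() prints them as applied:

# Show the priors as brms actually applied them to Fit1
prior_summary(Fit1)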
Now this worked, but it printed several errors while the chains were initializing, all of which seem linked to those parameters, such as:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[7] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
I read through all the forum posts on possible solutions to this error, which suggest things like setting inits = 0 or init_r = 0.1 and trying different priors (a sketch of what I tried is below). What was odd was that when I tried better priors (the ones I used here are pretty lousy), the model wouldn't run at all.
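For reference, this is roughly how I passed those options (assuming the rstan backend, where init_r is forwarded to rstan::sampling(), and the inits argument that brms of this vintage still uses; update() just refits the same model with new arguments):

# Start every chain at zero instead of random values in (-2, 2)
Fit1_zero <- update(Fit1, inits = "0", recompile = FALSE)
# Or keep random inits but draw them from a narrower interval;
# init_r is passed through to rstan::sampling()
Fit1_narrow <- update(Fit1, init_r = 0.1, recompile = FALSE)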
Unfortunately, I don't know Stan yet (I'm trying to learn enough to debug this), but I did translate my model into Stan code using the make_stancode function, and the result was:
// generated with brms 2.14.4
functions {
  /* turn a vector into a matrix of defined dimension
   * stores elements in row major order
   * Args:
   *   X: a vector
   *   N: first dimension of the desired matrix
   *   K: second dimension of the desired matrix
   * Returns:
   *   a matrix of dimension N x K
   */
  matrix as_matrix(vector X, int N, int K) {
    matrix[N, K] Y;
    for (i in 1:N) {
      Y[i] = to_row_vector(X[((i - 1) * K + 1):(i * K)]);
    }
    return Y;
  }
  /* compute correlated group-level effects
   * Args:
   *   z: matrix of unscaled group-level effects
   *   SD: vector of standard deviation parameters
   *   L: cholesky factor correlation matrix
   * Returns:
   *   matrix of scaled group-level effects
   */
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K_b;  // number of population-level effects
  matrix[N, K_b] X_b;  // population-level design matrix
  int<lower=1> K_b1;  // number of population-level effects
  matrix[N, K_b1] X_b1;  // population-level design matrix
  int<lower=1> K_b2;  // number of population-level effects
  matrix[N, K_b2] X_b2;  // population-level design matrix
  // covariate vectors for non-linear functions
  vector[N] C_1;
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_b_1;
  vector[N] Z_1_b_2;
  int<lower=1> NC_1;  // number of group-level correlations
  // data for group-level effects of ID 2
  int<lower=1> N_2;  // number of grouping levels
  int<lower=1> M_2;  // number of coefficients per level
  int<lower=1> J_2[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_2_b_1;
  vector[N] Z_2_b_2;
  vector[N] Z_2_b_3;
  vector[N] Z_2_b_4;
  int<lower=1> NC_2;  // number of group-level correlations
  // data for group-level effects of ID 3
  int<lower=1> N_3;  // number of grouping levels
  int<lower=1> M_3;  // number of coefficients per level
  int<lower=1> J_3[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_3_b1_1;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector[K_b] b_b;  // population-level effects
  vector<upper=0>[K_b1] b_b1;  // population-level effects
  vector<lower=0,upper=1>[K_b2] b_b2;  // population-level effects
  real<lower=0> sigma;  // residual SD
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
  vector<lower=0>[M_2] sd_2;  // group-level standard deviations
  matrix[M_2, N_2] z_2;  // standardized group-level effects
  cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
  vector<lower=0>[M_3] sd_3;  // group-level standard deviations
  vector[N_3] z_3[M_3];  // standardized group-level effects
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_b_1;
  vector[N_1] r_1_b_2;
  matrix[N_2, M_2] r_2;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_2] r_2_b_1;
  vector[N_2] r_2_b_2;
  vector[N_2] r_2_b_3;
  vector[N_2] r_2_b_4;
  vector[N_3] r_3_b1_1;  // actual group-level effects
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_b_1 = r_1[, 1];
  r_1_b_2 = r_1[, 2];
  // compute actual group-level effects
  r_2 = scale_r_cor(z_2, sd_2, L_2);
  r_2_b_1 = r_2[, 1];
  r_2_b_2 = r_2[, 2];
  r_2_b_3 = r_2[, 3];
  r_2_b_4 = r_2[, 4];
  r_3_b1_1 = (sd_3[1] * (z_3[1]));
}
model {
  // likelihood including all constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] nlp_b = X_b * b_b;
    // initialize linear predictor term
    vector[N] nlp_b1 = X_b1 * b_b1;
    // initialize linear predictor term
    vector[N] nlp_b2 = X_b2 * b_b2;
    // initialize non-linear predictor term
    vector[N] mu;
    for (n in 1:N) {
      // add more terms to the linear predictor
      nlp_b[n] += r_1_b_1[J_1[n]] * Z_1_b_1[n] + r_1_b_2[J_1[n]] * Z_1_b_2[n] + r_2_b_1[J_2[n]] * Z_2_b_1[n] + r_2_b_2[J_2[n]] * Z_2_b_2[n] + r_2_b_3[J_2[n]] * Z_2_b_3[n] + r_2_b_4[J_2[n]] * Z_2_b_4[n];
    }
    for (n in 1:N) {
      // add more terms to the linear predictor
      nlp_b1[n] += r_3_b1_1[J_3[n]] * Z_3_b1_1[n];
    }
    for (n in 1:N) {
      // compute non-linear predictor values
      mu[n] = nlp_b[n] - ((C_1[n] / nlp_b1[n]) ^ nlp_b2[n]);
    }
    target += normal_lpdf(Y | mu, sigma);
  }
  // priors including all constants
  target += normal_lpdf(b_b | 0, 20);
  target += cauchy_lpdf(b_b1 | 0, 10)
    - 1 * cauchy_lcdf(0 | 0, 10);
  target += uniform_lpdf(b_b2 | 0, 1)
    - 1 * log_diff_exp(uniform_lcdf(1 | 0, 1), uniform_lcdf(0 | 0, 1));
  target += student_t_lpdf(sigma | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += student_t_lpdf(sd_1 | 3, 0, 2.5)
    - 2 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(to_vector(z_1));
  target += lkj_corr_cholesky_lpdf(L_1 | 1);
  target += student_t_lpdf(sd_2 | 3, 0, 2.5)
    - 4 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(to_vector(z_2));
  target += lkj_corr_cholesky_lpdf(L_2 | 1);
  target += student_t_lpdf(sd_3 | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3[1]);
}
generated quantities {
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // compute group-level correlations
  corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
  vector<lower=-1,upper=1>[NC_2] cor_2;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
  // extract upper diagonal of correlation matrix
  for (k in 1:M_2) {
    for (j in 1:(k - 1)) {
      cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
    }
  }
}
I apologize for the lack of line numbers in the code above; I finally figured out how to match them up, and the offending line (132 in the generated model) is:
target += normal_lpdf(Y | mu, sigma);
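While I try to learn enough Stan to dig into that line, one thing I can already do from R is inspect the exact data brms passes to Stan via make_standata(); in particular, I want to check whether the Time_var covariate (C_1 in the generated code) contains zeros, negatives, or NAs that could make (C_1 / b1)^b2 undefined at some initial values. A sketch with my placeholder variable names:

# Build the data list brms would hand to Stan (same formula as Fit1)
sdata <- make_standata(
  bf(Resp_var ~ b - (Time_var / b1)^(b2),
     b ~ 1 + (Time_var | Gr_var) + Gr_var * baseline + (Gr_var | ID_var),
     b1 ~ 1 + (1 | Gr_var),
     b2 ~ 1,
     nl = TRUE),
  data = dat, family = gaussian())
summary(sdata$C_1)  # any zeros, negatives, or NAs in the time covariate?
summary(sdata$Y)    # and in the response?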
Beyond that, I am going to see what I can learn about Stan so I can debug that line properly, but any insights would be greatly appreciated! Thank you again!