Hi, I’ve spent the past week going through forum posts, vignettes, Google, etc., and still cannot find an answer to my problem. I’m hoping someone can help clarify.
I am building a mixed-effects multilevel model (MLM); the linear component of the model is already working. My data consist of one response variable (Resp_var, continuous), a subject ID (ID_var, factor), a grouping variable (Group_var, 4-level factor), a time variable (Time_var, continuous), and one more covariate (CoVar, continuous; it is a per-subject baseline value and does not depend on time). My linear MLM syntax is:
Mod <- brm(Resp_var ~ (Time_var | Group_var) +
             CoVar * Group_var + (Group_var | ID_var),
           data = dat)
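For concreteness, fake data with this structure would look roughly like the following (sizes and values are made up, just so the calls in this post have something to run against):

set.seed(1)
n_id <- 20; n_time <- 5
dat <- data.frame(
  ID_var    = factor(rep(seq_len(n_id), each = n_time)),
  Group_var = factor(rep(sample(1:4, n_id, replace = TRUE), each = n_time)),  # 4-level factor, constant within subject
  Time_var  = rep(0:(n_time - 1), times = n_id),
  CoVar     = rep(rnorm(n_id), each = n_time),   # one baseline value per subject
  Resp_var  = rnorm(n_id * n_time)
)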
My two grouping terms represent: (Time_var | Group_var), the Group-level random effects for the time parameter, and (Group_var | ID_var), the subject-level random effects of Group.
Based on prior research and knowledge of the subject we’re studying, we know we need to add a nonlinear piece to accurately capture Resp_var. As in other research on this topic, the nonlinear component is modeled as a Weibull decay curve of the form (t/λ)^k, where t = Time_var, λ = scale parameter, and k = shape parameter. I don’t know the value of λ (although, because of the decay, it should be < 0) or of k (except that 0 < k < 1). So the NL component I wish to add on is (newtime / lambda) ^ kappa.
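In brms non-linear syntax, the overall shape I’m after is roughly the sketch below, where b stands in for the linear part of the model and lambda and kappa are the Weibull parameters (the per-parameter formulas are just placeholders; attaching the real grouping terms is exactly what I can’t get to work):

bf(Resp_var ~ b - (Time_var / lambda) ^ kappa,
   b ~ 1,        # placeholder for the linear predictor described above
   lambda ~ 1,   # Weibull scale
   kappa ~ 1,    # Weibull shape
   nl = TRUE)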
I’ve searched the forums and in particular these GitHub links: 46, 47, where @paul.buerkner explains how to incorporate a NL component into a linear model. I followed the examples, but none of the syntaxes I’ve tried have worked. Because my grouping variable is a 4-level factor, I know I can only put it into the nonlinear part of the formula, but I keep getting errors.
Here is one syntax I tried, going off of the information in those two links (where b1 = lambda and b2 = kappa).
I might be missing something, but this is different syntax than what you have in the model call. What happens if you use exactly the same call with brm, i.e.:
Hi Martin, thanks for the reply. I know it can be very hard to debug these things without data. I found your suggestion really interesting because it immediately threw an error for me:
Error: The following priors do not correspond to any model parameter:
b_b_baseline ~ (flat)
b_b_Gr2 ~ (flat)
b_b_Gr2:baseline ~ (flat)
b_b_Gr3 ~ (flat)
b_b_Gr3:baseline ~ (flat)
b_b_Gr4 ~ (flat)
b_b_Gr4:baseline ~ (flat)
I was actually able to get my model to run using a different syntax:
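It was roughly of this form (treat the details as approximate, in particular the grouping term on b1; the make_stancode output further down is the authoritative version):

nlform <- bf(
  Resp_var ~ b - (Time_var / b1) ^ b2,
  b  ~ CoVar * Group_var + (Time_var | Group_var) + (Group_var | ID_var),
  b1 ~ 1 + (1 | Group_var),   # a single group-level intercept on b1; the grouping factor here is a guess
  b2 ~ 1,
  nl = TRUE
)

Mod_nl <- brm(
  nlform, data = dat,
  prior = c(prior(normal(0, 20), nlpar = "b"),
            prior(cauchy(0, 10), nlpar = "b1", ub = 0),           # lambda forced to be negative, as described above
            prior(uniform(0, 1), nlpar = "b2", lb = 0, ub = 1))   # kappa constrained to (0, 1)
)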
Now this worked, but it brought up several errors while compiling, all of which are linked to those parameters, such as:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[7] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[2] is nan, but must be finite! (in 'model2e6c38437900_ae98191816eed43593c275b856b260d7' at line 132)
I read through the forum posts on possible solutions to this error, such as setting inits = 0 or init_r = 0.1, and trying different priors. What was odd was that when I tried better priors (the ones I used here are pretty lousy), the model wouldn’t run at all.
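For reference, this is roughly how I passed those options (as far as I understand the arguments, inits is a brm() argument in this version of brms and init_r is forwarded on to rstan; Mod_nl is the fit from the call above):

# refit with all parameters initialized at zero
Mod_nl_zero <- update(Mod_nl, inits = "0")
# refit with random initial values drawn from a narrower range than the default
Mod_nl_narrow <- update(Mod_nl, inits = "random", init_r = 0.1)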
Unfortunately, I don’t know Stan (I’m trying to learn enough to debug this), but I did turn my model into Stan code using the make_stancode function, and the result was:
// generated with brms 2.14.4
functions {
/* turn a vector into a matrix of defined dimension
* stores elements in row major order
* Args:
* X: a vector
* N: first dimension of the desired matrix
* K: second dimension of the desired matrix
* Returns:
* a matrix of dimension N x K
*/
matrix as_matrix(vector X, int N, int K) {
matrix[N, K] Y;
for (i in 1:N) {
Y[i] = to_row_vector(X[((i - 1) * K + 1):(i * K)]);
}
return Y;
}
/* compute correlated group-level effects
* Args:
* z: matrix of unscaled group-level effects
* SD: vector of standard deviation parameters
* L: cholesky factor correlation matrix
* Returns:
* matrix of scaled group-level effects
*/
matrix scale_r_cor(matrix z, vector SD, matrix L) {
// r is stored in another dimension order than z
return transpose(diag_pre_multiply(SD, L) * z);
}
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K_b; // number of population-level effects
matrix[N, K_b] X_b; // population-level design matrix
int<lower=1> K_b1; // number of population-level effects
matrix[N, K_b1] X_b1; // population-level design matrix
int<lower=1> K_b2; // number of population-level effects
matrix[N, K_b2] X_b2; // population-level design matrix
// covariate vectors for non-linear functions
vector[N] C_1;
// data for group-level effects of ID 1
int<lower=1> N_1; // number of grouping levels
int<lower=1> M_1; // number of coefficients per level
int<lower=1> J_1[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_b_1;
vector[N] Z_1_b_2;
int<lower=1> NC_1; // number of group-level correlations
// data for group-level effects of ID 2
int<lower=1> N_2; // number of grouping levels
int<lower=1> M_2; // number of coefficients per level
int<lower=1> J_2[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_2_b_1;
vector[N] Z_2_b_2;
vector[N] Z_2_b_3;
vector[N] Z_2_b_4;
int<lower=1> NC_2; // number of group-level correlations
// data for group-level effects of ID 3
int<lower=1> N_3; // number of grouping levels
int<lower=1> M_3; // number of coefficients per level
int<lower=1> J_3[N]; // grouping indicator per observation
// group-level predictor values
vector[N] Z_3_b1_1;
int prior_only; // should the likelihood be ignored?
}
transformed data {
}
parameters {
vector[K_b] b_b; // population-level effects
vector<upper=0>[K_b1] b_b1; // population-level effects
vector<lower=0,upper=1>[K_b2] b_b2; // population-level effects
real<lower=0> sigma; // residual SD
vector<lower=0>[M_1] sd_1; // group-level standard deviations
matrix[M_1, N_1] z_1; // standardized group-level effects
cholesky_factor_corr[M_1] L_1; // cholesky factor of correlation matrix
vector<lower=0>[M_2] sd_2; // group-level standard deviations
matrix[M_2, N_2] z_2; // standardized group-level effects
cholesky_factor_corr[M_2] L_2; // cholesky factor of correlation matrix
vector<lower=0>[M_3] sd_3; // group-level standard deviations
vector[N_3] z_3[M_3]; // standardized group-level effects
}
transformed parameters {
matrix[N_1, M_1] r_1; // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_b_1;
vector[N_1] r_1_b_2;
matrix[N_2, M_2] r_2; // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_b_1;
vector[N_2] r_2_b_2;
vector[N_2] r_2_b_3;
vector[N_2] r_2_b_4;
vector[N_3] r_3_b1_1; // actual group-level effects
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_b_1 = r_1[, 1];
r_1_b_2 = r_1[, 2];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_b_1 = r_2[, 1];
r_2_b_2 = r_2[, 2];
r_2_b_3 = r_2[, 3];
r_2_b_4 = r_2[, 4];
r_3_b1_1 = (sd_3[1] * (z_3[1]));
}
model {
// likelihood including all constants
if (!prior_only) {
// initialize linear predictor term
vector[N] nlp_b = X_b * b_b;
// initialize linear predictor term
vector[N] nlp_b1 = X_b1 * b_b1;
// initialize linear predictor term
vector[N] nlp_b2 = X_b2 * b_b2;
// initialize non-linear predictor term
vector[N] mu;
for (n in 1:N) {
// add more terms to the linear predictor
nlp_b[n] += r_1_b_1[J_1[n]] * Z_1_b_1[n] + r_1_b_2[J_1[n]] * Z_1_b_2[n] + r_2_b_1[J_2[n]] * Z_2_b_1[n] + r_2_b_2[J_2[n]] * Z_2_b_2[n] + r_2_b_3[J_2[n]] * Z_2_b_3[n] + r_2_b_4[J_2[n]] * Z_2_b_4[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
nlp_b1[n] += r_3_b1_1[J_3[n]] * Z_3_b1_1[n];
}
for (n in 1:N) {
// compute non-linear predictor values
mu[n] = nlp_b[n] - ((C_1[n] / nlp_b1[n]) ^ nlp_b2[n]);
}
target += normal_lpdf(Y | mu, sigma);
}
// priors including all constants
target += normal_lpdf(b_b | 0,20);
target += cauchy_lpdf(b_b1 | 0,10)
- 1 * cauchy_lcdf(0 | 0,10);
target += uniform_lpdf(b_b2 | 0,1)
- 1 * log_diff_exp(uniform_lcdf(1 | 0,1), uniform_lcdf(0 | 0,1));
target += student_t_lpdf(sigma | 3, 0, 2.5)
- 1 * student_t_lccdf(0 | 3, 0, 2.5);
target += student_t_lpdf(sd_1 | 3, 0, 2.5)
- 2 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(to_vector(z_1));
target += lkj_corr_cholesky_lpdf(L_1 | 1);
target += student_t_lpdf(sd_2 | 3, 0, 2.5)
- 4 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(to_vector(z_2));
target += lkj_corr_cholesky_lpdf(L_2 | 1);
target += student_t_lpdf(sd_3 | 3, 0, 2.5)
- 1 * student_t_lccdf(0 | 3, 0, 2.5);
target += std_normal_lpdf(z_3[1]);
}
generated quantities {
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
}
I apologize for the lack of line numbers; I finally figured out how to find them, and the offending line is:
target += normal_lpdf(Y | mu, sigma);
I am going to see what I can learn about Stan and how I can debug that, but any insights would be greatly appreciated. Thank you again!
Just for precision: those errors are not from compiling (which encompasses compiling the Stan code to C++ and the C++ code to an executable) but from running the model. If you get a few of them, it means your model is a bit difficult to get started, but if it ends up running and giving you good results with good diagnostics, you may ignore them. They are nevertheless useful for letting you know that something is a bit problematic, and it is good to pay attention to them if you plan to extend the model further, as small problems can get amplified as the model grows.
This is a bit non-specific. What do you mean? Do you get an error? Or does it start sampling but then get stuck and take forever before showing additional progress?
To me, the most likely issue is the exponentiation part of your model, i.e. the term (Time_var / b1) ^ b2.
This value is only well defined when either b2 is an integer (which you almost certainly don’t want) or (Time_var / b1) is positive. Since b1 is constrained to be negative (isn’t that a bug?), you only get valid values when Time_var is negative. And that’s setting aside the complications from the group term on b1.
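For a concrete illustration of the problem in R:

(-2) ^ 0.5        # NaN: negative base, fractional exponent
(5 / -2.5) ^ 0.7  # NaN as well, which is what happens whenever Time_var / b1 < 0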
Assuming Time_var is always positive, then a good way to ensure valid values (and still allow a grouping structure on b1) would be to avoid any bounds on b1 and instead work with exp(b1) inside the formula, so that the scale is positive by construction and b1 itself lives on the log scale.
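In brms code, something along these lines (an untested sketch just to show the shape; the grouping term on b1 and the priors are purely illustrative):

nlform2 <- bf(
  Resp_var ~ b - (Time_var / exp(b1)) ^ b2,
  b  ~ CoVar * Group_var + (Time_var | Group_var) + (Group_var | ID_var),
  b1 ~ 1 + (1 | Group_var),   # b1 is now log(lambda): no sign constraint needed, grouping allowed
  b2 ~ 1,
  nl = TRUE
)

Mod_exp <- brm(
  nlform2, data = dat,
  prior = c(prior(normal(0, 20), nlpar = "b"),
            prior(normal(0, 1),  nlpar = "b1"),                   # prior on log(lambda), purely illustrative
            prior(uniform(0, 1), nlpar = "b2", lb = 0, ub = 1))   # keeps kappa in (0, 1)
)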