Help with notation for hierarchical gam in BRMS

sea83 · March 30, 2023, 10:13pm

I have been asked to include the notation for a model in the supplement of a paper we are submitting. I have come up with my best attempt at notating what we fit in brms, but I would greatly appreciate if anyone is able to either verify that what I have is correct or suggest how I can improve what I have.

The brms syntax to fit the model is

dn15_result_PREV_KCAL<-brm(dn15_result~s(fai) + class_focal + s(prev_day_total_kcal_using_ap_low_fermentation) + s(isotope_time_collected) + (1 +  fai + prev_day_total_kcal_using_ap_low_fermentation + isotope_time_collected  | name_focal), 
                           data = data[complete.cases(data[,c("dn15_result","isotope_time_collected","fai","prev_day_total_kcal_using_ap_low_fermentation")]),],
                           family=gaussian(),iter = 2000, 
                           prior=c(prior(normal(0,10),class="Intercept"),
                                   prior(normal(0,10),class="b"),
                                   prior(normal(0,10),class="sigma"),
                                   prior(normal(0,10),class="sd")),
                           control = list(max_treedepth = 11, adapt_delta = .999))

The stan code from this model is:

// generated with brms 2.16.3
functions {
 /* compute correlated group-level effects
  * Args: 
  *   z: matrix of unscaled group-level effects
  *   SD: vector of standard deviation parameters
  *   L: cholesky factor correlation matrix
  * Returns: 
  *   matrix of scaled group-level effects
  */ 
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  // data for splines
  int Ks;  // number of linear effects
  matrix[N, Ks] Xs;  // design matrix for the linear effects
  // data for spline s(fai)
  int nb_1;  // number of bases
  int knots_1[nb_1];  // number of knots
  // basis function matrices
  matrix[N, knots_1[1]] Zs_1_1;
  // data for spline s(prev_day_total_kcal_using_ap_low_fermentation)
  int nb_2;  // number of bases
  int knots_2[nb_2];  // number of knots
  // basis function matrices
  matrix[N, knots_2[1]] Zs_2_1;
  // data for spline s(isotope_time_collected)
  int nb_3;  // number of bases
  int knots_3[nb_3];  // number of knots
  // basis function matrices
  matrix[N, knots_3[1]] Zs_3_1;
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  vector[N] Z_1_2;
  vector[N] Z_1_3;
  vector[N] Z_1_4;
  int<lower=1> NC_1;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  vector[Ks] bs;  // spline coefficients
  // parameters for spline s(fai)
  // standarized spline coefficients
  vector[knots_1[1]] zs_1_1;
  real<lower=0> sds_1_1;  // standard deviations of spline coefficients
  // parameters for spline s(prev_day_total_kcal_using_ap_low_fermentation)
  // standarized spline coefficients
  vector[knots_2[1]] zs_2_1;
  real<lower=0> sds_2_1;  // standard deviations of spline coefficients
  // parameters for spline s(isotope_time_collected)
  // standarized spline coefficients
  vector[knots_3[1]] zs_3_1;
  real<lower=0> sds_3_1;  // standard deviations of spline coefficients
  real<lower=0> sigma;  // dispersion parameter
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
}
transformed parameters {
  // actual spline coefficients
  vector[knots_1[1]] s_1_1;
  // actual spline coefficients
  vector[knots_2[1]] s_2_1;
  // actual spline coefficients
  vector[knots_3[1]] s_3_1;
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  vector[N_1] r_1_4;
  // compute actual spline coefficients
  s_1_1 = sds_1_1 * zs_1_1;
  // compute actual spline coefficients
  s_2_1 = sds_2_1 * zs_2_1;
  // compute actual spline coefficients
  s_3_1 = sds_3_1 * zs_3_1;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  r_1_4 = r_1[, 4];
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + rep_vector(0.0, N) + Xs * bs + Zs_1_1 * s_1_1 + Zs_2_1 * s_2_1 + Zs_3_1 * s_3_1;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n];
    }
    target += normal_id_glm_lpdf(Y | Xc, mu, b, sigma);
  }
  // priors including constants
  target += normal_lpdf(b | 0, 10);
  target += normal_lpdf(Intercept | 0, 10);
  target += normal_lpdf(bs | 0, 10)
    - 3 * normal_lccdf(0 | 0, 10);
  target += student_t_lpdf(sds_1_1 | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(zs_1_1);
  target += student_t_lpdf(sds_2_1 | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(zs_2_1);
  target += student_t_lpdf(sds_3_1 | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(zs_3_1);
  target += normal_lpdf(sigma | 0, 10)
    - 1 * normal_lccdf(0 | 0, 10);
  target += normal_lpdf(sd_1 | 0, 10)
    - 4 * normal_lccdf(0 | 0, 10);
  target += std_normal_lpdf(to_vector(z_1));
  target += lkj_corr_cholesky_lpdf(L_1 | 1);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
}

And this is my attempt to notate the model:

{\delta^{15}N\ }_{IJ} \sim Normal\left({mu}_{ij},sigma\right)\\ {mu}_{ij}=\alpha_0+\alpha_{ji}+s\left({Nutrient}_i,\ \tau\ \right)+\beta_{1j}{Nutrient}_i+s\left({FAI}_i,\ \tau\right)+\beta_{2j}{FAI}_i+\\s\left({sampletime\ }_i,\ \tau\right)+β3_jsampletime_i+\boldsymbol{\beta}AgeClass_i\\ \boldsymbol{\beta}{AgeClass}_i=\sum_{k=1}^{K}{\boldsymbol{\beta}_{ki}{AgeClass}_{Ki}}\\ \begin{bmatrix} \alpha_j\\β1j\\β2j\\β3j\end{bmatrix} \sim MVNormal\begin{pmatrix}\begin{bmatrix} 0\\0\\0\\0\end{bmatrix} ,S\end{pmatrix},for\ individual\ j=1,\dots J\\ S=\left(\begin{matrix}\sigma_{\alpha_j}^2&\rho_{\beta_0\beta_{1j}}&\rho_{\beta_0\beta_{2j}}&\rho_{\beta_0\beta_{3j}}\\\rho_{\beta_{1j}\alpha_j}&\sigma_{\beta_1}^2&\rho_{\beta_{1j}\beta_{2j}}&\rho_{\beta_{1j}\beta_{3j}}\\\rho_{\beta_{2j}\alpha_j}&\rho_{\beta_{2j}\beta_{1j}}&\sigma_{\beta_2}^2&\rho_{\beta_{2j}\beta_{3j}}\\\rho_{\beta_{3j}\alpha_j}&\rho_{\beta_{3j}\beta_{1j}}&\rho_{\beta_{3j}\beta_{2j}}&\sigma_{\beta_3}^2\\\end{matrix}\right)\\ \alpha_0,\alpha_j~Normal(0,10)\\ \beta_{1j},\beta_{2j},\beta_{3j},\boldsymbol{\beta} \sim Normal(0,10)\\ \sigma_{\alpha_j},\sigma_{\beta_{1j}},\sigma_{\beta_{2j}},\sigma_{\beta_{3j}},\sigma_{\boldsymbol{\beta}_j} \sim Normal(0,10)\\ \tau \sim StudentT(3,0,2.5)\\ sigma \sim Normal(0,10)\\ \boldsymbol{R} \sim {LKJ}_{corr}\left(1\right)

My questions are

Did I manage to get this right
and
Is this a sufficient minimum level of detail?

Any feedback will be much appreciated!

sea83 · March 31, 2023, 10:28am

Seems like not only was I quite off, but there was an easier-to-read way to notate this. This is notation generated by ChatGPT based on the brms and stan code. I think it makes sense to me, but if anyone can confirm that seems fine that would be very helpful!

Let\ \textit{i} \ index \ the \ observations, \\ \textit{j} \ index \ the \ groups \ (name\_focal), \\ and \ \textit{k} \ index \ the \ spline \ basis \ functions. \\ The \ model \ can \ be \ written \ as: \\ \begin{align} y_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + X_{i}\boldsymbol{b} + Xs_{i}\boldsymbol{b_s} + Zs_{1_{i}}\boldsymbol{s_{1}} + Zs_{2_{i}}\boldsymbol{s_{2}} + Zs_{3_{i}}\boldsymbol{s_{3}} + \sum_{m=1}^{M} r_{m_{j}} Z_{m_{i}} \\ \boldsymbol{b} &\sim \text{Normal}(0, 10) \\ \alpha &\sim \text{Normal}(0, 10) \\ \boldsymbol{b_s} &\sim \text{Normal}(0, 10) \\ \boldsymbol{s_{1}} &= sds_{1_{1}} \cdot \boldsymbol{z_{s_{1}}} \\ \boldsymbol{s_{2}} &= sds_{1_{2}} \cdot \boldsymbol{z_{s_{2}}} \\ \boldsymbol{s_{3}} &= sds_{1_{3}} \cdot \boldsymbol{z_{s_{3}}} \\ sds_{1_{1}} &\sim \text{Student-t}(3, 0, 2.5) \\ sds_{1_{2}} &\sim \text{Student-t}(3, 0, 2.5) \\ sds_{1_{3}} &\sim \text{Student-t}(3, 0, 2.5) \\ \boldsymbol{z_{s_{1}}} &\sim \text{Normal}(0, 1) \\ \boldsymbol{z_{s_{2}}} &\sim \text{Normal}(0, 1) \\ \boldsymbol{z_{s_{3}}} &\sim \text{Normal}(0, 1) \\ \sigma &\sim \text{Normal}(0, 10) \\ \boldsymbol{r}_{j} &\sim \text{Multivariate Normal}(\boldsymbol{0}, \boldsymbol{\Sigma}) \\ \boldsymbol{\Sigma} &= \text{diag}(\boldsymbol{sd}) \cdot \boldsymbol{R} \cdot \text{diag}(\boldsymbol{sd}) \\ \boldsymbol{sd} &\sim \text{Normal}(0, 10) \\ \boldsymbol{R} &\sim \text{LKJ}(1) \\ \end{align} \\ where \ \alpha \ is \ the \ global \ intercept, \ \textbf{b} \ are \ the \ population-level \ effects, \ \\textbf{b_s} \ are \ the \ spline \ coefficients \ for \ the \ smooth \ functions, \\ \textbf{s_{1}}, \ \textbf{s_{2}}, \ and \ \textbf{s_{3}} \ are \ the \ actual \ spline coefficients \ for \ fai, \\ prev\_day\_total\_kcal\_using\_ap\_low\_fermentation, \\ and \ isotope\_time\_collected, \ \textbf{r}_{j} \ are \ the \ actual \ group-level \ effects, \\ \textbf{Sigma} \ is \ the \ group-level \ covariance \ matrix, \\ \textbf{sd} \ are \ the \ group-level \ standard \ deviations, \ and \\ \textbf{R} \ is \ the \ group-level \ z \ correlation \ matrix.

Topic		Replies	Views
Mathematical notation for a brms model brms notation	8	1653	February 19, 2021
Help with formula notation for multivariate model in brms vignette brms	1	923	February 20, 2020
Mathemathical notation for a brms model with a smooth term brms notation	8	1196	July 25, 2023
Bayesian regression model equation formulation brms	11	1903	February 25, 2020
Understanding the Mathematical Model brms	10	2821	July 27, 2018

Help with notation for hierarchical gam in BRMS

Related topics