Joint model specification, simple question

I want to include longitudinal information from a secondary endpoint into a primary binary endpoint.
To do this, we have the following model

image

As suggested in “Bayesian Workflow” (Gelman et. al 2020) I have of course fitted the model on simulated data. All is fine, except for the fact that the model does NOT recover the effect \beta_{lmm} of the random intercept from the linear mixed model in the logistic model.

Any help is greatly appreciated.

Best Laus

data {
  // LONGITUDIANL MODEL
  int<lower=1> N; // total number of observations
  vector[N] Y; // response variable
  int<lower=1> K; // number of population-level effects
  matrix[N, K] X; // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1; // number of grouping levels
  int<lower=1> M_1; // number of coefficients per level
  array[N] int<lower=1> J_1; // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  int prior_only; // should the likelihood be ignored?

  // LOGISTIC MODEL
  int<lower=1> N_prime; // total nubmer of observations. This is N_1
  array[N_prime] int Y_prime; // Outcome 0/1 of upcrossing or not. 
  int<lower=1> K_prime; // Number of population-level effects.
  matrix[N_prime,K_prime] X_prime; // model.matrix for primary outcome.
}
transformed data {
  // LONGITUDIANL MODEL
  int Kc = K - 1;
  matrix[N, Kc] Xc; // centered version of X without an intercept
  vector[Kc] means_X; // column means of X before centering
  for (i in 2 : K) {
    means_X[i - 1] = mean(X[ : , i]);
    Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
  }

  // LOGISTIC MODEL 
  int Kc_prime = K_prime - 1;
  int Kp_prime = K_prime + 1;
  matrix[N_prime, Kc_prime] Xc_prime; // centered version of X without an intercept
  vector[Kc_prime] means_X_prime; // column means of X before centering
  for (i in 2 : K_prime) {
    means_X_prime[i - 1] = mean(X_prime[ : , i]);
    Xc_prime[ : , i - 1] = X_prime[ : , i] - means_X_prime[i - 1];
  }

}
parameters {
   // LONGITUDINAL MODEL
  vector[Kc] b; // population-level effects
  real Intercept; // temporary intercept for centered predictors
  real<lower=0> sigma; // dispersion parameter
  vector<lower=0>[M_1] sd_1; // group-level standard deviations
  array[M_1] vector[N_1] z_1; // standardized group-level effects

   // LOGISTIC MODEL 
  vector[K_prime] b_prime; 
  // population-level effects, still without intercept, but now with the
  // parameter from the model.
  real Intercept_prime; // temporary intercept for centered predictors
}
transformed parameters {
  // LONGITUDINAL MODEL
  vector[N_1] r_1_1; // actual group-level effects
  real lprior = 0; // prior contributions to the log posterior
  r_1_1 = sd_1[1] * z_1[1];
  lprior += student_t_lpdf(Intercept | 3, 28.7, 6.1);
  lprior += student_t_lpdf(sigma | 3, 0, 6.1)
            - 1 * student_t_lccdf(0 | 3, 0, 6.1);
  lprior += student_t_lpdf(sd_1 | 3, 0, 6.1)
            - 1 * student_t_lccdf(0 | 3, 0, 6.1);

  // LOGISTIC MODEL
  lprior += student_t_lpdf(Intercept_prime | 3, 0, 2.5);
}
model {
  // LONGITUDIANL MODEL

  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = rep_vector(0.0, N);
    mu += Intercept;
    for (n in 1 : N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
    }
    target += normal_id_glm_lpdf(Y | Xc, mu, b, sigma);
  }
  // priors including constants
  target += lprior;
  target += std_normal_lpdf(z_1[1]);

  // LOGISTIC MODEL

  //print(dims(b_prime));
  //print(dims(append_col(Xc_prime, r_1_1)));

  target += bernoulli_logit_glm_lpmf(Y_prime | append_col(Xc_prime, r_1_1), Intercept_prime, b_prime);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  vector[K_prime-1] b_prime_truncated;
  b_prime_truncated = b_prime[1:(K_prime-1)];
  real b_Intercept_prime = Intercept_prime - dot_product(means_X_prime,b_prime_truncated);
}

It looks like you may be combining your varying intercepts (actually varying offsets from the Intercept) from the longitudinal model into your model matrix for the logistic model…? I’m wondering if this shouldn’t be done, and maybe you should use bernoulli_logit_lpmf instead of bernoulli_logit_glm_lpmf, and write out your formula for the logistic model including the parameters from the longitudinal model with appropriate indexing for the group-level from the longitudinal model into the logistic model. Also, it seems that you are only including the varying offsets from the intercept in the longitudinal model in the logistic model, but it seems like you might want the actual intercept + offsets?

1 Like

Thanks. I will try to implement the full likelihood using bernoulli_logit_lpmf instead. I am fairly sure, that it only makes sense to carry forward the random intercepts, as they hold information regarding the individual BMI level.
It is also fairly easy to fit a two-step model where I just carry forward the EBLUP’s (predicted random effects) to the logistic model. However, this does not take the uncertainty of the random effects into account. That’s why I want to go with a bayesian model.