# Joint model specification, simple question

I want to include longitudinal information from a secondary endpoint into a primary binary endpoint.
To do this, we have the following model

As suggested in “Bayesian Workflow” (Gelman et. al 2020) I have of course fitted the model on simulated data. All is fine, except for the fact that the model does NOT recover the effect \beta_{lmm} of the random intercept from the linear mixed model in the logistic model.

Any help is greatly appreciated.

Best Laus

data {
// LONGITUDIANL MODEL
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> N_1; // number of grouping levels
int<lower=1> M_1; // number of coefficients per level
array[N] int<lower=1> J_1; // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_1;
int prior_only; // should the likelihood be ignored?

// LOGISTIC MODEL
int<lower=1> N_prime; // total nubmer of observations. This is N_1
array[N_prime] int Y_prime; // Outcome 0/1 of upcrossing or not.
int<lower=1> K_prime; // Number of population-level effects.
matrix[N_prime,K_prime] X_prime; // model.matrix for primary outcome.
}
transformed data {
// LONGITUDIANL MODEL
int Kc = K - 1;
matrix[N, Kc] Xc; // centered version of X without an intercept
vector[Kc] means_X; // column means of X before centering
for (i in 2 : K) {
means_X[i - 1] = mean(X[ : , i]);
Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
}

// LOGISTIC MODEL
int Kc_prime = K_prime - 1;
int Kp_prime = K_prime + 1;
matrix[N_prime, Kc_prime] Xc_prime; // centered version of X without an intercept
vector[Kc_prime] means_X_prime; // column means of X before centering
for (i in 2 : K_prime) {
means_X_prime[i - 1] = mean(X_prime[ : , i]);
Xc_prime[ : , i - 1] = X_prime[ : , i] - means_X_prime[i - 1];
}

}
parameters {
// LONGITUDINAL MODEL
vector[Kc] b; // population-level effects
real Intercept; // temporary intercept for centered predictors
real<lower=0> sigma; // dispersion parameter
vector<lower=0>[M_1] sd_1; // group-level standard deviations
array[M_1] vector[N_1] z_1; // standardized group-level effects

// LOGISTIC MODEL
vector[K_prime] b_prime;
// population-level effects, still without intercept, but now with the
// parameter from the model.
real Intercept_prime; // temporary intercept for centered predictors
}
transformed parameters {
// LONGITUDINAL MODEL
vector[N_1] r_1_1; // actual group-level effects
real lprior = 0; // prior contributions to the log posterior
r_1_1 = sd_1[1] * z_1[1];
lprior += student_t_lpdf(Intercept | 3, 28.7, 6.1);
lprior += student_t_lpdf(sigma | 3, 0, 6.1)
- 1 * student_t_lccdf(0 | 3, 0, 6.1);
lprior += student_t_lpdf(sd_1 | 3, 0, 6.1)
- 1 * student_t_lccdf(0 | 3, 0, 6.1);

// LOGISTIC MODEL
lprior += student_t_lpdf(Intercept_prime | 3, 0, 2.5);
}
model {
// LONGITUDIANL MODEL

if (!prior_only) {
// initialize linear predictor term
vector[N] mu = rep_vector(0.0, N);
mu += Intercept;
for (n in 1 : N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
}
target += normal_id_glm_lpdf(Y | Xc, mu, b, sigma);
}
// priors including constants
target += lprior;
target += std_normal_lpdf(z_1[1]);

// LOGISTIC MODEL

//print(dims(b_prime));
//print(dims(append_col(Xc_prime, r_1_1)));

target += bernoulli_logit_glm_lpmf(Y_prime | append_col(Xc_prime, r_1_1), Intercept_prime, b_prime);
}
generated quantities {
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
vector[K_prime-1] b_prime_truncated;
b_prime_truncated = b_prime[1:(K_prime-1)];
real b_Intercept_prime = Intercept_prime - dot_product(means_X_prime,b_prime_truncated);
}


It looks like you may be combining your varying intercepts (actually varying offsets from the Intercept) from the longitudinal model into your model matrix for the logistic model…? I’m wondering if this shouldn’t be done, and maybe you should use bernoulli_logit_lpmf instead of bernoulli_logit_glm_lpmf, and write out your formula for the logistic model including the parameters from the longitudinal model with appropriate indexing for the group-level from the longitudinal model into the logistic model. Also, it seems that you are only including the varying offsets from the intercept in the longitudinal model in the logistic model, but it seems like you might want the actual intercept + offsets?

1 Like

Thanks. I will try to implement the full likelihood using bernoulli_logit_lpmf instead. I am fairly sure, that it only makes sense to carry forward the random intercepts, as they hold information regarding the individual BMI level.
It is also fairly easy to fit a two-step model where I just carry forward the EBLUP’s (predicted random effects) to the logistic model. However, this does not take the uncertainty of the random effects into account. That’s why I want to go with a bayesian model.