I would like to model residual correlations in a multivariate logistic regression (discussed here). brms offers the option to do this for gaussian models only, which leaves me with no option other than to crack open the black box and try to fiddle around with the wires.
It has been a long time since I coded manually in stan and my skills have atrophied. Also even when I was at my best the ninja stuff going on under the hood in brms is mostly above my head. So my approach has been to (1) use the brms stancode() and make_standata() functions to extract the stan code and data list from a gaussian brm model with setrescor(TRUE) and compare it to a gaussian model with setrescor(FALSE), try to see what the differences are, and attempt to transpose that to my logistic model. Far from ideal I know but needs must.
And of course it didn’t work. I know going in that there are things that simply do not translate from gaussian models to logit models, for example the sigma() parameter. But frankly I need the community’s help with this one.
I have real data for a very simple model, estimating the association between two binary outcomes, binary1 and binary2 and a single binary predictor `relstatus’. Here are the data
exDF.RData (9.6 KB)
And here is the model in brms
# create multivariate formula, regressing the two outcomes on the single predictor
bform_mvr <- bf(mvbind(binary1, binary2) ~ pred)
# run model in brms
mvrMod_bin <- brm(formula = bform_mvr,
family = bernoulli,
data = exDF,
seed = 1234)
Now what I have done is extracted the stan code and data list
mvrDat_bin <- make_standata(mvrMod_bin)
mvrCode_bin <- stancode(mvrMod_bin)
Adding two data that are necessary to model residual correlations to the datalist
mvrDat_bin$nrescor <- 1
mvrDat_bin$nresp <- 2
And then made alterations to the stan code in mvrCode_bin based on what I observed was added to the code in the gaussian version when setrescor() was changed from FALSE to TRUE. Here is the code. Now I should point out again that I know this code is wrong - it is for a numeric model but don’t know how to tweak it for the purposes of a logit model. I’ve spread it out a bit for readability and where I think the extra code should go I have added notes saying as much (mostly, the models section is almost all new so there was little point.
mvrCode_corr <- "
data {
// outcome 1
int<lower=1> N; // total number of observations
int<lower=1> N_binary1; // number of observations
array[N_binary1] int Y_binary1; // response variable
int<lower=1> K_binary1; // number of population-level effects
matrix[N_binary1, K_binary1] X_binary1; // population-level design matrix
int<lower=1> Kc_binary1; // number of population-level effects after centering
// outcome 2
int<lower=1> N_binary2; // number of observations
array[N_binary2] int Y_binary2; // response variable
int<lower=1> K_binary2; // number of population-level effects
matrix[N_binary2, K_binary2] X_binary2; // population-level design matrix
int<lower=1> Kc_binary2; // number of population-level effects after centering
// potential code for modelling resdiaul correlations
int<lower=1> nresp; // number of responses
int nrescor; // number of residual correlations
int prior_only; // should the likelihood be ignored?
}
transformed data {
// outcome 1
matrix[N_binary1, Kc_binary1] Xc_binary1; // centered version of X_binary1 without an intercept
vector[Kc_binary1] means_X_binary1; // column means of X_binary1 before centering
// outcome 2
matrix[N_binary2, Kc_binary2] Xc_binary2; // centered version of X_binary2 without an intercept
vector[Kc_binary2] means_X_binary2; // column means of X_binary2 before centering
// potential code for residual correlations
array[N] vector[nresp] Y; // response array
// outcome 1
for (i in 2:K_binary1) {
means_X_binary1[i - 1] = mean(X_binary1[, i]);
Xc_binary1[, i - 1] = X_binary1[, i] - means_X_binary1[i - 1];
}
// outcome 2
for (i in 2:K_binary2) {
means_X_binary2[i - 1] = mean(X_binary2[, i]);
Xc_binary2[, i - 1] = X_binary2[, i] - means_X_binary2[i - 1];
}
// potential code for resid correlations
for (n in 1:N) {
Y[n] = transpose([Y_binary1[n], Y_binary2[n]]);
}
}
parameters {
vector[Kc_binary1] b_binary1; // regression coefficients
real Intercept_binary1; // temporary intercept for centered predictors
vector[Kc_binary2] b_binary2; // regression coefficients
real Intercept_binary2; // temporary intercept for centered predictors
// potential code for residual correlations
cholesky_factor_corr[nresp] Lrescor; // parameters for multivariate linear models
}
transformed parameters {
// prior contributions to the log posterior
real lprior = 0;
// outcome 1
lprior += student_t_lpdf(Intercept_binary1 | 3, 0, 2.5);
// outcome 2
lprior += student_t_lpdf(Intercept_binary2 | 3, 0, 2.5);
// potential code for resid corr
lprior += lkj_corr_cholesky_lpdf(Lrescor | 1);
}
model {
// likelihood including constants
if (!prior_only) {
// initialise linear predictor terms, outcome 1
vector[N_binary1] mu_binary1 = rep_vector(0.0, N_binary1);
// initialise linear predictor terms, outcome 2
vector[N_binary2] mu_binary2 = rep_vector(0.0, N_binary2);
// multivariate predictor array
array[N] vector[nresp] Mu;
vector[nresp] sigma = transpose([sigma_binary1, sigma_binary2]);
// Cholesky factor of residual
matrix[nresp, nresp] LSigma = diag_pre_multiply(sigma, Lrescor);
mu_binary1 += Intercept_binary1 + Xc_binary1 * b_binary1;
mu_binary2 += Intercept_binary2 + Xc_binary2 * b_binary2;
// combine univariate parameters
for (n in 1:N) {
Mu[n] = transpose([mu_binary1[n], mu_binary2[n]]);
}
target += multi_normal_cholesky_lpdf(Y | Mu, LSigma);
}
// priors including constants
target += lprior;
}
generated quantities {
// actual population-level intercept
real b_binary1_Intercept = Intercept_binary1 - dot_product(means_X_binary1, b_binary1);
// actual population-level intercept
real b_binary2_Intercept = Intercept_binary2 - dot_product(means_X_binary2, b_binary2);
// residual correlations
corr_matrix[nresp] Rescor = multiply_lower_tri_self_transpose(Lrescor);
vector<lower=-1,upper=1>[nrescor] rescor;
// extract upper diagonal of correlation matrix
for (k in 1:nresp) {
for (j in 1:(k - 1)) {
rescor[choose(k - 1, 2) + j] = Rescor[j, k];
}
}
}
"
And unsurprisingly, when I went to run the model
mvrMod_stan_bin_corr <- stan(model_code = mvrCode_corr,
data = mvrDat_bin,
seed = 1234)
I got an error
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
0
Semantic error in 'string', line 67, column 35 to column 42:
-------------------------------------------------
65: lprior += student_t_lpdf(Intercept_binary2 | 3, 0, 2.5);
66: // potential code for resid corr
67: lprior += lkj_corr_cholesky_lpdf(Lrescor | 1);
^
68: }
69: model {
-------------------------------------------------
Identifier 'Lrescor' not in scope. Did you mean 'nrescor'?
Can anyone help me tweak my model code to get those residual correlations estimates to work? More than happy to get schooled in any other elements along the way. I would love to understand what’s going on in this code, but I think that would take a whole semester of fulltime study of nothing else.