Hi there,
For my thesis I have to perform multiple imputation during model fitting, and compare the results to a model without multiple imputation during model fitting. Unfortunately, I cannot manage to linearly transform the mi() variable in the main outcome model. In short, I want to do the following:
- Estimate the missing Troponin_Second_Measure values in a ‘sub-model’
- Calculate (not estimate) the Troponin_Difference based on the Troponin_First_Measure (observed, this variable does not contain missing values) and Troponin_Second_Measure (observed + imputed) values.
- Estimate the main outcome model component
Since multiple imputation during model fitting is a ‘one-step approach’, all above mentioned steps have to be performed at once. Currently, the two models (with and without multiple imputation during model fit) in R look as follows:
model1 <-
quiet(
brm(
data = df,
formula = Cardiac_Outcome ~ log(Troponin_First_Measure) + I(Troponin_Second_Measure-Troponin_First_Measure) + Physiological_Platec + Angiogram + Gender + Physiological_Creatine + (Troponin_First_Measure|ECG_Ischaemia),
prior = c(set_prior("student_t(3,0,5)"),
set_prior("lkj(2)", class = "cor")),
family = categorical(refcat = "Normal"),
iter = 500, chains = 4, cores = 4,
control = list(adapt_delta = 0.96, max_treedepth = 12)))
&
model2 <-
quiet(
brm(
data = df,
formula = bf(Cardiac_Outcome ~ log(Troponin_First_Measure) + I(mi(Troponin_Second_Measure) - Troponin_First_Measure) + Physiological_Platec + Angiogram + Gender + Physiological_Creatine + (Troponin_First_Measure|ECG_Ischaemia)) +
bf(Troponin_Second_Measure| mi() ~ Troponin_First_Measure + Physiological_Creatine + Physiological_Platec + Gender) + set_rescor(FALSE),
prior = c(set_prior("student_t(3,0,5)"),
set_prior("lkj(2)", class = "cor")),
family = list(categorical(refcat = "Normal"),
gaussian()),
iter = 500, chains = 4, cores = 4,
control = list(adapt_delta = 0.96, max_treedepth = 12)))
Unfortunately, I get the following error when running the second model:
Error in stanc(file = file, model_code = model_code, model_name = model_name, : failed to parse Stan model ‘2f963b3c6056c1414f07e7c5ec2b7f85’ due to the above error
The subtraction
I(mi(Troponin_Second_Measure)-Troponin_First_Measure)
in the second model is not possible whereas the subtraction
I(Troponin_Second_Measure)-Troponin_First_Measure)
in the first model is possible. Hence, it is most likely that the error is caused by trying to perform a linear transformation on the mi() variable. Here it says that the formula doesn’t currently allow imputed variables to have non-linear effects in the main outcome model. However, is there any way of performing the subtraction mentioned above in either R or STAN? In other words, are linear transformations of the imputed variables possible?
Note: Calculating Troponin_Difference before fitting the model and imputing this variable leads to suboptimal results since fitting Troponin_Second_Measure in the ‘sub-model’ is more suitable because of the nature of the data. Namely, Troponin_Difference is harder to model based on domain knowledge/literature and it contains many zeroes.
Note 2: The I() function allows arithmetic operators in a formula in R.
Note 3: Unfortunately, I cannot share the data since it is confidential patient data. However, I have added the STAN code from a model where Troponin_Difference is calculated before fitting the model and then imputed (this model runs without any errors). See the R code and STAN code (see uploaded file) below. My guess is that in the model{} block something has to be adjusted in the lines under
// vector combining observed and missing responses
(see text highlighted in red). However, I’m not familiar with writing STAN code and I have no idea on how to proceed.
Thanks in advance!
Marty
model3 <-
quiet(
brm(
data = df,
formula = bf(Cardiac_Outcome ~ log(Troponin_First_Measure) + mi(Troponin_Difference) + Physiological_Platec + Angiogram + Gender + Physiological_Creatine + (Troponin_First_Measure|ECG_Ischaemia)) +
bf(Troponin_Difference| mi() ~ Troponin_First_Measure + Physiological_Creatine + Physiological_Platec + Gender) + set_rescor(FALSE),
prior = c(set_prior("student_t(3,0,5)"),
set_prior("lkj(2)", class = "cor")),
family = list(categorical(refcat = "Normal"),
gaussian()),
iter = 500, chains = 4, cores = 4,
control = list(adapt_delta = 0.96, max_treedepth = 12)))
model3.stan.txt (17.6 KB)