Hello!
I’m trying to model symptom trajectories in a dataset of people who have decided to discontinue psychiatric medication. There are 3-5 irregularly spaced observations per person. People discontinue medication at different times, but everyone discontinues eventually.
Exploratory plotting suggests that symptom trajectories (both before and after discontinuation) are often very non-linear, so a dummy variable capturing level change is probably not enough.
My first thought was to represent change over time using splines like this:
brm(symptoms ~ s(time, off_meds, bs = "fs", k = 10) + (1|clinic) + (1|patient))
However, my fit is difficult to take seriously.
Here’s the model fit for an imaginary person who stays “on meds” for the duration of the study alongside an imaginary person who stays “off meds” for the duration:
There are (by definition) no observations for anyone off meds at admission, and yet the model is very confident (these are 90 percent credible intervals) that a person off meds at baseline has lower symptoms than a person on meds at baseline. That seems wrong.
Perhaps the solution is to ignore predictions for people off meds during the first few months of treatment, just as one might ignore model predictions for human heights over 10 feet? So I plotted out what the same model implies about a person who stays on meds until 7 months and then goes off medication, comparing that to a counterfactual person who stays on meds for the duration of the study, like this:
One interpretation of this second graph is that people going off meds tend to get initially worse but then gradually bring themselves under control.
Another interpretation is that the model is fitting separate curves for the “on meds” and “off meds” data points, has zero observations of people who are “off meds” at the beginning of the study, and very few observations of people who are “on meds” at the end of the study, and so the whole fitting process is getting thrown off.
I’m curious if others have been in a similar situation before, and if there’s a way to better model this process. I’m worried the solution is some kind of weird discontinuous spline, e.g. letting one spline fit to all the data up until medication discontinuation, and letting the other spline fit to all the data starting from medication discontinuation. I suppose I could hard-code this in Stan, but am hoping there’s some solution within a package like brms
or rstanarm
.
I’d appreciate any insight people may have.