Hi everyone,
This is my first post here and I hope I am doing it right.
So, I am trying to model longitudinal patient data with a random effects model using B-splines. My dataset contains biomarker measurements for patients, where each patient had several measurements over a period of time. My starting point was this great blog post, and my current model looks like this:
data {
int<lower=0> N_rows; // number of rows
int<lower=0> N_patients; // number of patients
vector[N_rows] y; // outcome
vector[N_rows] t; // time
int<lower=1> id[N_rows]; // patient IDs
int num_basis;
matrix[num_basis, N_rows] B; // B-splines matrix
}
parameters {
row_vector[num_basis] a_raw;
real theta0; // intercept
vector[N_patients] eta0;
vector[N_patients] eta1;
real<lower=0> sigma0;
real<lower=0> sigma1;
real<lower=0> sigma;
real<lower=0> tau;
}
transformed parameters {
row_vector[num_basis] a; // spline coefficients
a = a_raw*tau;
}
model {
vector[N_rows] mu_n;
// Priors
a_raw ~ normal(0, 1);
theta0 ~ normal(0, 1);
sigma0 ~ normal(0, 1);
sigma1 ~ normal(0, 1);
eta0 ~ normal(0, sigma0);
eta1 ~ normal(0, sigma1);
tau ~ normal(0, 1);
sigma ~ normal(0, 2.5);
//Likelihood
for(n in 1:N_rows) {
mu_n[n] = theta0 + eta0[id[n]] + dot_product(a, col(B, n)) + eta1[id[n]] * t[n];
y[n] ~ normal(mu_n[n], sigma);
}
}
generated quantities {
vector[N_rows] predictions;
vector[N_rows] pred_mu;
for(n in 1:N_rows) {
pred_mu[n] = (theta0 + eta0[id[n]]) + dot_product(a, col(B, n)) + eta1[id[n]] * t[n];
predictions[n] = normal_rng(pred_mu[n], sigma);
}
}
Or, to try to put it into mathematical notation:
\mu_i = \theta_0 + \eta_{i0} + a^T B_{i:} + \eta_{i1} t_i + \epsilon_i,
y \sim \mathcal{N}(\mu_i, \sigma),
where B represents a natural cubic spline matrix of t.
In generated quantities I am able to get predictions for the train data, however, I am wondering how would I get predictions for new patients under the assumption that I already have some of their biomarker measurements available (for example: let’s say I have 5 measurements from first 2 years and I want to predict what happens next 2 years). Since the random effect parameters \eta are patient-specific I am having difficulties to understand how to generate those for completely new patients.
I know there are some libraries offering those functionalities (JMbayes
if I am not mistaken), however, I would like to understand how to do it myself.
Any help will be appreciated!