Dear all,
I’m trying to model a dataset that looks like this:
The data is structured into 5 series from two groups. The thick lines in the figure are just empirical smooths of the groups.
I have theoretical reasons to assume that the series from the control group are just random (autocorrelated) variations around some constant value, to be estimated. Let’s call it the reference value. The quantity of interest is the mean curve for the test group, which must be at or below the reference value.
Data attached into a R file that can be sourced: model_data.r (22.5 KB)
Indeed, I’d like to assume that the test curve is at the reference value by default (kind of null hypothesis) unless there is sufficient evidence to pull it towards smaller values.
The question is how to encode this in a Stan program.
For reference, I started with a brms
model with the following formula: value ~ 1 + s(location, by = x)
, where x
is a binary variable indicative of the test group. This gives the following posterior estimates:
What I need is to prevent the posterior distribution of the red curve to climb beyond the reference level. The probabilistic model is crucial, since my goal is to estimate the posterior probability of the target being below the (as opposed to exactly at) reference level.
My first try was to use a non-linear model with brms
as follows:
bf(
value ~ mu0 - (exp(trt)-1)*(trt>0),
mu0 ~ 1,
trt ~ 0 + s(location, by = x),
nl = TRUE
)
Essentially, using a spline as a latent curve that is collapsed to 0 all the negative values. The transformation exp(trt)-1
is not essential. It could have been simply trt
.
This kind of works. Here are the posterior estimates of the group curves:
Nevertheless, there are a couple of things with which I’m not completely satisfied:
- I’d like to shrink the test curve a bit more towards the reference value. But I don’t have any parameter that controls that.
- The uncertainty band around the curve looks so tight. I would have expected much more uncertainty. With these results I’m getting almost 100% probability of departure from the reference level at every location.
- Looks like it is over-smoothing a bit. But I just used the default basis dimension (8). I guess I can simply adjust that by increasing this number.
But what worries me more is what happens when I try to account for the autocorrelation at the series level, which makes me think that collapsing the spline to zero is perhaps not a great idea.
Here is my second model that adds another term srs
which fits a smooth curve to each series:
bf(
value ~ mu0 - (exp(trt)-1)*(trt>=0) + srs,
mu0 ~ 1,
trt ~ 0 + s(location, by = x),
srs ~ 0 + s(location, by = series, id = 1),
nl = TRUE
)
Side question: notice that I specified id=1
in order to use a single standard-deviation parameter for all the series. However, the model still estimates a specific parameter for each series. Not sure what I’m doing wrong here, but ultimately I could fix it in the stan code.
In any case, the results are quite unstable. I get high Rhats and the estimates depend a lot on the starting values, sometimes giving wild results like a crazy estimate for the treatment curve compensated with highly varying series. Also giving bi-modal posteriors for the treatment group (some probability mass at zero and the rest centred at some positive value).
Perhaps that will be fixed by collapsing the standard deviations into a single parameter. Yet, it makes me think that the multiplication by 0 in the linear predictor simply breaks the flow of information from the data to the segments of the treatment curve below zero. In these segments, the curve can do whatever it wants without any impact in the likelihood, which surely isn’t a good idea.
Sorry for the long post. But I’d love to hear your thoughts about this modelling problem.
Thanks in advance.
By the way, for priors and other details, find below the full call to brms
and the generated stan code for the second model.
priors <- c(
prior(normal(150, 100), nlpar = "mu0"),
prior(normal(0, 5), nlpar = "trt"),
prior(normal(0, 5), nlpar = "srs"),
prior(normal(0, 3), class = "sds", nlpar = "trt"),
prior(normal(0, 5), class = "sds", nlpar = "srs"),
prior(normal(0, 3), class = "sigma")
)
formula <- bf(
value ~ mu0 - (exp(trt)-1)*(trt>=0) + srs,
mu0 ~ 1,
trt ~ 0 + s(location, by = x),
srs ~ 0 + s(location, by = series, id = 1),
nl = TRUE
)
brm(
formula = formula,
data = model_data, # attached above
family = "gaussian",
prior = priors,
control = list(
adapt_delta = .999,
max_treedepth = 15
),
seed = 20210331,
cores = 4,
iter = 400
)
// generated with brms 2.13.0
functions {
}
data {
int<lower=1> N; // number of observations
vector[N] Y; // response variable
int<lower=1> K_mu0; // number of population-level effects
matrix[N, K_mu0] X_mu0; // population-level design matrix
// data for splines
int Ks_trt; // number of linear effects
matrix[N, Ks_trt] Xs_trt; // design matrix for the linear effects
// data for spline s(location,by=is_test)
int nb_trt_1; // number of bases
int knots_trt_1[nb_trt_1]; // number of knots
// basis function matrices
matrix[N, knots_trt_1[1]] Zs_trt_1_1;
// data for splines
int Ks_srs; // number of linear effects
matrix[N, Ks_srs] Xs_srs; // design matrix for the linear effects
// data for spline s(location,by=series,id=1)1
int nb_srs_1; // number of bases
int knots_srs_1[nb_srs_1]; // number of knots
// basis function matrices
matrix[N, knots_srs_1[1]] Zs_srs_1_1;
// data for spline s(location,by=series,id=1)6
int nb_srs_2; // number of bases
int knots_srs_2[nb_srs_2]; // number of knots
// basis function matrices
matrix[N, knots_srs_2[1]] Zs_srs_2_1;
// data for spline s(location,by=series,id=1)3
int nb_srs_3; // number of bases
int knots_srs_3[nb_srs_3]; // number of knots
// basis function matrices
matrix[N, knots_srs_3[1]] Zs_srs_3_1;
// data for spline s(location,by=series,id=1)4
int nb_srs_4; // number of bases
int knots_srs_4[nb_srs_4]; // number of knots
// basis function matrices
matrix[N, knots_srs_4[1]] Zs_srs_4_1;
// data for spline s(location,by=series,id=1)5
int nb_srs_5; // number of bases
int knots_srs_5[nb_srs_5]; // number of knots
// basis function matrices
matrix[N, knots_srs_5[1]] Zs_srs_5_1;
int prior_only; // should the likelihood be ignored?
}
transformed data {
}
parameters {
vector[K_mu0] b_mu0; // population-level effects
vector[Ks_trt] bs_trt; // spline coefficients
// parameters for spline s(location,by=is_test)
// standarized spline coefficients
vector[knots_trt_1[1]] zs_trt_1_1;
real<lower=0> sds_trt_1_1; // standard deviations of spline coefficients
vector[Ks_srs] bs_srs; // spline coefficients
// parameters for spline s(location,by=series,id=1)1
// standarized spline coefficients
vector[knots_srs_1[1]] zs_srs_1_1;
real<lower=0> sds_srs_1_1; // standard deviations of spline coefficients
// parameters for spline s(location,by=series,id=1)6
// standarized spline coefficients
vector[knots_srs_2[1]] zs_srs_2_1;
real<lower=0> sds_srs_2_1; // standard deviations of spline coefficients
// parameters for spline s(location,by=series,id=1)3
// standarized spline coefficients
vector[knots_srs_3[1]] zs_srs_3_1;
real<lower=0> sds_srs_3_1; // standard deviations of spline coefficients
// parameters for spline s(location,by=series,id=1)4
// standarized spline coefficients
vector[knots_srs_4[1]] zs_srs_4_1;
real<lower=0> sds_srs_4_1; // standard deviations of spline coefficients
// parameters for spline s(location,by=series,id=1)5
// standarized spline coefficients
vector[knots_srs_5[1]] zs_srs_5_1;
real<lower=0> sds_srs_5_1; // standard deviations of spline coefficients
real<lower=0> sigma; // residual SD
}
transformed parameters {
// actual spline coefficients
vector[knots_trt_1[1]] s_trt_1_1;
// actual spline coefficients
vector[knots_srs_1[1]] s_srs_1_1;
// actual spline coefficients
vector[knots_srs_2[1]] s_srs_2_1;
// actual spline coefficients
vector[knots_srs_3[1]] s_srs_3_1;
// actual spline coefficients
vector[knots_srs_4[1]] s_srs_4_1;
// actual spline coefficients
vector[knots_srs_5[1]] s_srs_5_1;
// compute actual spline coefficients
s_trt_1_1 = sds_trt_1_1 * zs_trt_1_1;
// compute actual spline coefficients
s_srs_1_1 = sds_srs_1_1 * zs_srs_1_1;
// compute actual spline coefficients
s_srs_2_1 = sds_srs_2_1 * zs_srs_2_1;
// compute actual spline coefficients
s_srs_3_1 = sds_srs_3_1 * zs_srs_3_1;
// compute actual spline coefficients
s_srs_4_1 = sds_srs_4_1 * zs_srs_4_1;
// compute actual spline coefficients
s_srs_5_1 = sds_srs_5_1 * zs_srs_5_1;
}
model {
// initialize linear predictor term
vector[N] nlp_mu0 = X_mu0 * b_mu0;
// initialize linear predictor term
vector[N] nlp_trt = rep_vector(0, N) + Xs_trt * bs_trt + Zs_trt_1_1 * s_trt_1_1;
// initialize linear predictor term
vector[N] nlp_srs = rep_vector(0, N) + Xs_srs * bs_srs + Zs_srs_1_1 * s_srs_1_1 + Zs_srs_2_1 * s_srs_2_1 + Zs_srs_3_1 * s_srs_3_1 + Zs_srs_4_1 * s_srs_4_1 + Zs_srs_5_1 * s_srs_5_1;
// initialize non-linear predictor term
vector[N] mu;
for (n in 1:N) {
// compute non-linear predictor values
mu[n] = nlp_mu0[n] - (exp(nlp_trt[n]) - 1) * (nlp_trt[n] >= 0) + nlp_srs[n];
}
// priors including all constants
target += normal_lpdf(b_mu0 | 150, 100);
target += normal_lpdf(bs_trt | 0, 5)
- 2 * normal_lccdf(0 | 0, 5);
target += normal_lpdf(sds_trt_1_1 | 0, 3)
- 1 * normal_lccdf(0 | 0, 3);
target += std_normal_lpdf(zs_trt_1_1);
target += normal_lpdf(bs_srs | 0, 5)
- 5 * normal_lccdf(0 | 0, 5);
target += normal_lpdf(sds_srs_1_1 | 0, 5)
- 1 * normal_lccdf(0 | 0, 5);
target += std_normal_lpdf(zs_srs_1_1);
target += normal_lpdf(sds_srs_2_1 | 0, 5)
- 1 * normal_lccdf(0 | 0, 5);
target += std_normal_lpdf(zs_srs_2_1);
target += normal_lpdf(sds_srs_3_1 | 0, 5)
- 1 * normal_lccdf(0 | 0, 5);
target += std_normal_lpdf(zs_srs_3_1);
target += normal_lpdf(sds_srs_4_1 | 0, 5)
- 1 * normal_lccdf(0 | 0, 5);
target += std_normal_lpdf(zs_srs_4_1);
target += normal_lpdf(sds_srs_5_1 | 0, 5)
- 1 * normal_lccdf(0 | 0, 5);
target += std_normal_lpdf(zs_srs_5_1);
target += normal_lpdf(sigma | 0, 3)
- 1 * normal_lccdf(0 | 0, 3);
// likelihood including all constants
if (!prior_only) {
target += normal_lpdf(Y | mu, sigma);
}
}
generated quantities {
}