Model formulation of nested(?) model

torkar · October 29, 2020, 3:02pm

I’ve played around with this in my head now and I have a hard time grasping how to specify a model for this type of data.

Given data such as this,

id	measurement	value	trial	y
1	A	…	1	0
1	A	…	2	0
1	A	…	3	0
1	A	…	4	0
1	B	…	1	0
1	B	…	2	0
1	C	…	1	0
1	C	…	2	0
1	C	…	3	0
2	A	…	1	1

…

I would like to model the outcome y (1/0), i.e. have sickness or not.

id is a human subject, measurement is, e.g., blood pressure, pulse, etc., value depends on which measurement we use, and trial is simply the temporal order in which each measurement (A, …, Z) was taken. The set of measurements can differ among subjects, as can the number of trials for each measurement.

First, I thought about (1 | measurement/trial/value), but that isn’t sane, since I think we’d then treat each unique $\mathbb{R}$ value as a categorical level. Next, I thought that I’d use gp() and treat value as a varying intercept that way, but I don’t think it’ll fly since we’re talking about n > 5e5.

@Guido_Biele or @paul.buerkner should know, but I’d appreciate anyone’s input! :)

Guido_Biele · October 29, 2020, 4:39pm

From the way you describe the data, it seems to me that having different effects from different measurements is more important than nestedness.

Maybe this is too simple, but how about

y ~ value + (0 + value | measurement) + (1 | id)

The main idea here is to just have random slopes of value for each measurement.
(In which case I would make sure that the measurements are on the same scale (same ID) and that the expected direction of the effect is the same for all measurements (flip if necessary) because shrinkage otherwise works against you.)
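A minimal sketch of how this could look in R, assuming "same scale" is achieved by standardizing value within each measurement type. The toy data, the column `value_z`, and the helper `fit_model()` are hypothetical and only for illustration; the Bernoulli family is an assumption based on y being 1/0.

```r
# Hypothetical toy data in the long format from the first post;
# the numbers are made up purely for illustration.
d <- data.frame(
  id          = c(1, 1, 1, 1, 2, 2, 2, 2),
  measurement = c("A", "A", "B", "B", "A", "A", "B", "B"),
  value       = c(120, 130, 60, 70, 140, 150, 80, 90),
  y           = c(0, 0, 0, 0, 1, 1, 1, 1)
)

# Standardize value within each measurement type so that all
# measurements share a common scale (mean 0, SD 1).
d$value_z <- ave(d$value, d$measurement,
                 FUN = function(v) (v - mean(v)) / sd(v))

# Varying slopes of the standardized value by measurement,
# plus a varying intercept per subject.
f <- y ~ value_z + (0 + value_z | measurement) + (1 | id)

# Defined but not called here: fitting needs brms and a Stan toolchain.
fit_model <- function(data) {
  brms::brm(f, data = data, family = brms::bernoulli())
}
```

Flipping the sign of `value_z` for measurements whose expected effect runs the other way would be an extra step on top of this.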

This proposal neglects the trial variable. One could of course just add + trial, but I guess it depends on the specifics of the problem whether it is as easy as that (e.g., if these are repeated measurements over time, I’d rather put in days or weeks since the first measurement, or since another reasonable starting date).

If there is good evidence to think that the effect of measurement depends on time, one could do

y ~ value + (0 + value | measurement:time) + (1 | id)

torkar · October 30, 2020, 7:06am

Thanks Guido, and yes, I guess that is the most straightforward approach.

But I can get timestamps for trial, so I guess I must try the second approach (which was actually more along the lines of what I wanted :)).

Much appreciated!

Guido_Biele · October 30, 2020, 9:16am

I think it is still useful to check whether time can be chunked into coarser bins, to get a trial variable that does not have all too many levels.
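One way the binning could be done, as a rough sketch: compute days since each subject's first measurement and then collapse them into week indices. The column names `when`, `days`, and `week` and the dates are hypothetical; the weekly bin width is an arbitrary choice.

```r
# Hypothetical timestamps per subject; the dates are made up.
d <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  when = as.Date(c("2020-01-01", "2020-01-05", "2020-01-20",
                   "2020-03-10", "2020-03-18"))
)

# Days since each subject's first measurement.
d$days <- as.numeric(d$when) - ave(as.numeric(d$when), d$id, FUN = min)

# Coarser bins: week index, where 0 is the week starting at the
# subject's first measurement.
d$week <- d$days %/% 7
```

A binned variable like `week` could then stand in for time in the grouping term, e.g. `(0 + value | measurement:week)`, which keeps the number of grouping levels manageable.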

Alternatively, one could add additional structure by regressing the effect of measurement on time (OK, it’s just an interaction ;-)), something like

y ~ value*time + (0 + value*time | measurement) + (1 | id)
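To see what `value*time` expands to, base R's `terms()` can be used on the population-level part of the formula. This is only a small illustration; the variable names are taken from the formula above and no data is needed.

```r
# Population-level part: value*time is shorthand for
# value + time + value:time.
f <- y ~ value * time
labs <- attr(terms(f), "term.labels")
print(labs)  # "value", "time", "value:time"

# The same expansion applies inside the group-level term; the leading
# 0 only removes the intercept, so each measurement gets varying
# slopes for value, time, and their interaction.
g <- terms(~ 0 + value * time)
group_labs <- attr(g, "term.labels")
has_intercept <- attr(g, "intercept")
```

So each measurement gets three correlated varying slopes, which may be worth keeping in mind when setting priors on the group-level standard deviations and correlations.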
