Let’s say I have a binary outcome, with numerous rows of data per group at each of several timepoints, and a predictor p_prop that is the proportion of samples (samps) that are positive. I have no information at the individual-sample level, just the number of samples taken for each value of p_prop in a given row of data.
Differing numbers of samples are taken at each timepoint for each group. I am interested in how p_prop and timepoints predict the outcome.
I might run a model in brms like this:
m1 <- brm(outcome ~ 1 + timepoints + p_prop + (1 | group),
          family = bernoulli(), data = data)
However, because the number of samples varies so widely across timepoints and groups, I would think I want some sort of measurement error model: a p_prop value based on 5 samples should carry more error than one based on, say, 400 samples. Since I have both the number of samples and the proportion, I could calculate the sd and use it in a measurement error model like this:
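For concreteness, one way to compute that sd is the usual binomial standard error, sqrt(p(1 - p)/n). A minimal sketch, assuming a hypothetical column n_samps holding the number of samples per row:

```r
# Binomial standard error for each row's observed proportion.
# n_samps is an assumed column name for the per-row sample count.
data$p_prop_sd <- sqrt(data$p_prop * (1 - data$p_prop) / data$n_samps)
```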
m1.me <- brm(outcome ~ 1 + timepoints + me(p_prop, p_prop_sd) + (1 | group),
             family = bernoulli(), data = data)
But I’m not sure this is what I want (and brms throws an error), because many of my p_prop values are exactly 0 or 1, and the sd of a proportion that is 0 or 1 is 0. It still seems that the uncertainty in the p_prop predictor should scale with the number of samples taken…right? How do I go about doing this? Maybe I’m not thinking about this the correct way, but whatever the formula for the sd of a proportion says, in my scenario I trust a value from 5 samples much less than one from 400 samples, regardless of the value of the proportion itself.
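To illustrate the problem numerically: the binomial sd formula collapses to exactly 0 whenever p is 0 or 1, no matter the sample size, so it can’t express the extra trust I have in the larger sample:

```r
# sqrt(p * (1 - p) / n) is 0 at p = 0 or p = 1, regardless of n
sqrt(1 * (1 - 1) / 5)    # 0, even with only 5 samples
sqrt(1 * (1 - 1) / 400)  # 0, same as with 400 samples
sqrt(0.5 * 0.5 / 5)      # ~0.224 away from the boundary
sqrt(0.5 * 0.5 / 400)    # 0.025, shrinking with n as expected
```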
Thanks!