First, I wonder if anyone could verify/correct my understanding of measurement error formulations.
I take the following to be the classical formulation: $X_{measured} \sim N(X_{true}, \sigma_{measured})$
whereas the following is the Berkson formulation: $X_{true} \sim N(X_{measured}, \sigma_{measured})$
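To make the two definitions concrete, here is a minimal simulation sketch in plain Python (not brms or Stan). It assumes, purely for illustration, that whichever variable is generated first is standard normal and that $\sigma_{measured} = 0.5$; the function names are hypothetical. The point it illustrates is that the two formulations generate the data in opposite directions, so the noise inflates the variance of a different variable in each case.

```python
import random
import statistics


def simulate_classical(n, sigma, seed=0):
    """Classical error: draw X_true first, then X_measured ~ N(X_true, sigma)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]   # assumed N(0, 1)
    x_meas = [rng.gauss(x, sigma) for x in x_true]
    return x_true, x_meas


def simulate_berkson(n, sigma, seed=0):
    """Berkson error: draw X_measured first, then X_true ~ N(X_measured, sigma)."""
    rng = random.Random(seed)
    x_meas = [rng.gauss(0.0, 1.0) for _ in range(n)]   # assumed N(0, 1)
    x_true = [rng.gauss(x, sigma) for x in x_meas]
    return x_true, x_meas


n, sigma = 20000, 0.5  # illustrative values, not taken from the thread

x_true_c, x_meas_c = simulate_classical(n, sigma)
x_true_b, x_meas_b = simulate_berkson(n, sigma)

# Classical: the measured values are more spread out than the true values.
var_true_c = statistics.variance(x_true_c)
var_meas_c = statistics.variance(x_meas_c)

# Berkson: the true values are more spread out than the measured values.
var_true_b = statistics.variance(x_true_b)
var_meas_b = statistics.variance(x_meas_b)

print("classical: var(true) =", var_true_c, " var(measured) =", var_meas_c)
print("Berkson:   var(true) =", var_true_b, " var(measured) =", var_meas_b)
```

Under these assumptions, the variance of the second-generated variable should exceed that of the first by roughly $\sigma_{measured}^2$ in both cases.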
If that definition is true, how do I go about defining a Berkson formulation using brms?
Using the me() function in brms, I get a translated Stan model with the following prior statement:
where Xme_1 and Xme_2 are the latent true variables, which have their own hyperparameters that need to be estimated by the model; Xn_1 and Xn_2 are the measured variables; and noise_1 and noise_2 are the standard deviations of the measured variables that I provided. Essentially, these prior statements match the “classical formulation” that I defined earlier.
I think they are mathematically equivalent, since the pdf for both cases is $\frac{1}{\sigma_{measured} \sqrt{2 \pi}} \exp\left(-\frac{1}{2}\left(\frac{X_{true}-X_{measured}}{\sigma_{measured}}\right)^2\right)$, and $(X_{true}-X_{measured})^2 = (X_{measured}-X_{true})^2$. You give brms what you want to condition on, i.e. what information you have. If you have the true values, then you “estimate” $X_{measured}$, and vice versa.
Thanks Staffan! In my case, I only have the measured values. I guess my question should focus more on the definition of $X_{true}$, then. In what I call the classical formulation, $X_{true}$ is a latent variable with hyperparameters that also need to be estimated, which is what X_{true} \sim me(X_{measured}, \sigma_{measured}) would translate to.
But in what I define as the Berkson formulation, $X_{true}$ is sampled directly from $N(X_{measured}, \sigma_{measured})$. So, while I agree that the PDF is the same, wouldn’t the equivalence require the samples of $X_{true}$ to be the same in both cases? Which may not be the case…?
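One way to probe this question is a small simulation in plain Python (again, not brms or Stan). The sketch below assumes a hypothetical linear outcome `y = beta * X_true + noise` with `beta = 2.0`, standard normal first-stage draws, and $\sigma_{measured} = 0.5$; none of these values come from the thread. It regresses `y` on the measured values under each formulation.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n, sigma, beta = 20000, 0.5, 2.0  # illustrative assumptions

# Classical: X_true is drawn first; the measurement adds noise around it.
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
x_meas = [rng.gauss(t, sigma) for t in x_true]
y = [beta * t + rng.gauss(0.0, 0.1) for t in x_true]
slope_classical = ols_slope(x_meas, y)

# Berkson: X_measured is drawn first; the true value scatters around it.
x_meas = [rng.gauss(0.0, 1.0) for _ in range(n)]
x_true = [rng.gauss(m, sigma) for m in x_meas]
y = [beta * t + rng.gauss(0.0, 0.1) for t in x_true]
slope_berkson = ols_slope(x_meas, y)

print("slope on measured X, classical:", slope_classical)
print("slope on measured X, Berkson:  ", slope_berkson)
```

Under these assumptions, the classical slope should be attenuated toward `beta / (1 + sigma**2)`, while the Berkson slope should stay close to `beta`. So although the conditional density has the same symmetric form, the two formulations are different generative models and have different consequences for a naive regression on the measured values.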