I’m wondering if there is a way in brms to specify a measurement error model with errors in both x and y, where x and y are both observed values of the same underlying latent variable.
The practical application is comparing two measurement instruments that measure the same thing, for example the concentration of a solute in water. The true concentration is unknown, and we assume a known measurement error SD for each instrument. Is there a way to infer the true concentration with a brms model, or would this need to be coded in Stan directly?
My current approach is similar to what is done here in Section 15.1.2.
However, this treats x and y as measuring different variables, rather than as two noisy measurements of the same latent quantity.
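To make the intended model explicit, here is a rough sketch in my own notation (the symbols are illustrative assumptions, not brms parameter names). For sample i, theta_i is the unknown true concentration, s_{x,i} and s_{y,i} are the assumed-known measurement error SDs of the two instruments, and alpha and beta allow for a constant and a proportional bias of instrument y relative to instrument x:

```latex
\begin{aligned}
x_i &\sim \mathrm{Normal}(\theta_i,\; s_{x,i}) \\
y_i &\sim \mathrm{Normal}(\alpha + \beta\,\theta_i,\; s_{y,i})
\end{aligned}
```

The goal would be to estimate alpha, beta, and ideally each theta_i from the paired observations.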
Here is a simple reprex with simulated data:
library(data.table)
library(brms)
# Simulate data
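make_data <- function(n, cvx, cvy, slope, intercept) {
  # tibble() lets later columns refer to earlier ones; called via tibble:: since the package is not attached
  tibble::tibble(
    TrueX = runif(n, min = 5, max = 10),
    SdX = cvx * TrueX,
    SdY = cvy * TrueX,
    # Both instruments observe the same latent TrueX, each with its own independent error
    ObsX = TrueX + rnorm(n, mean = 0, sd = SdX),
    ObsY = intercept + slope * TrueX + rnorm(n, mean = 0, sd = SdY)
  ) |>
    as.data.table()
}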
set.seed(1)
n <- 100
cvx <- 0.05
cvy <- 0.05
slope <- 1.1
intercept <- 0.1
dt <- make_data(n, cvx, cvy, slope, intercept)
# Assign assumed measurement error SDs
dt[, SdX := cvx * ObsX]
dt[, SdY := cvy * ObsY]
# --- Fit model ---
mdl_formula <- bf(
  ObsY | mi(SdY) ~ me(ObsX, SdX)
)
priors_list <- c(
  prior(normal(1, 0.5), class = 'b', coef = 'meObsXSdX'),
  prior(normal(7.7, 3), class = 'meanme', coef = 'meObsX')
)
# Prior on 'meObsX' derived empirically from the data as mean(dt$ObsX) and sd(dt$ObsX) to hopefully stabilize the model/reduce divergences
start_time <- Sys.time()
mdl <- brm(
  formula = mdl_formula,
  data = dt,
  prior = priors_list,
  warmup = 4000,
  iter = 12000,
  thin = 2,
  chains = 4,
  cores = 4,
  control = list(adapt_delta = 0.99, max_treedepth = 20)
)
Thanks!