I am working with data where around 30% of observations are measured incorrectly and are 2x larger than they should have been recorded. I want to use a mixture model to identify these observations to recover their actual value.
# simulate data # (MacOS 11.1 and brms 2.14.4) library(tidyverse) set.seed(1234) dat <- tibble(x = 1 + rbinom(300, 1, 0.69)) %>% mutate(y1 = x*rnorm(300, 2.5), y2 = x*rnorm(300, 5.5) )
x is a hidden variable, what I have so far is:
# fit and check brms model library(brms) options(mc.cores = parallel::detectCores()) prior <- c( prior(normal(2, 1), class = Intercept, dpar = mu1, resp = y1), prior(normal(4, 2), class = Intercept, dpar = mu2, resp = y1), prior(normal(5, 1), class = Intercept, dpar = mu1, resp = y2), prior(normal(10, 2), class = Intercept, dpar = mu2, resp = y2) ) fit1 <- brm(mvbind(y1, y2) ~ 1, dat, family = mixture(gaussian, nmix = 2), prior = prior, chains = 8) summary(fit1) pp_check(fit1, resp = "y1") pp_check(fit1, resp = "y2")
From this model, I can get an estimate of each observation being in the incorrectly coded group with
pp_mixture on both
Currently the model creates two independent parameters for the mixture weights
theta2. However, whilst I don’t know the exact mixture weight, I know that the value of
theta2 are the same. I would like to either equate the two (
theta1 = theta2) or model only one
theta parameter. How can I set this in the model?