Hi
I want to model plant mass data for which I have a good estimate of the measurement error. Because plant mass can only be positive, I tried to model it with a lognormal response distribution in brms, using the “mi(sdy = [measurement error])” term to incorporate the measurement error. A toy example is included below. In this simulation, as in my real plant mass data, the Gaussian measurement error produces some negative observed masses.
library(tidyverse)
library(brms)

# known measurement error (sd) of 0.5
m_err <- 0.5

# simulate lognormal real plant mass and noisy observed plant mass
sim1_data <- tibble(n = 1:1000) %>%
  mutate(
    m_err = m_err,
    real_mass = exp(rnorm(n(), -2, 1)),
    obs_mass = rnorm(n(), real_mass, m_err)
  )

# lognormal response with known measurement error sd
bf_sim1 <- bf(obs_mass | mi(sdy = m_err) ~ 1, family = "lognormal")

m_sim1 <- brm(
  bf_sim1,
  data = sim1_data,
  backend = "cmdstanr",
  prior = c(
    set_prior("normal(0,1)", class = "Intercept"),
    set_prior("cauchy(0,1)", class = "sigma")
  ),
  iter = 500, warmup = 100
)
brms does not let me fit this model with negative observed outcomes: “Error: Family ‘lognormal’ requires response greater than 0.” However, when I inspect the Stan code generated by make_stancode(), it appears that brms models the observed outcome as a Gaussian draw whose mean is the positive latent “real” plant mass and whose sd is the given measurement error.
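To make my reading of the generated code concrete, here it is in equation form. This is only my interpretation, not something taken from the brms documentation, and the symbols are my own labels: $y_i^{*}$ is the latent real mass, $y_i$ the observed mass, $s_i$ the known measurement sd (m_err), and $\mu$, $\sigma$ the Intercept and sigma parameters of the lognormal.

$$
\begin{aligned}
y_i^{*} &\sim \operatorname{LogNormal}(\mu, \sigma) \\
y_i \mid y_i^{*} &\sim \operatorname{Normal}(y_i^{*},\, s_i) \\
\Pr(y_i < 0 \mid y_i^{*}) &= \Phi\!\left(-\frac{y_i^{*}}{s_i}\right) > 0
\end{aligned}
$$

Under this reading, a plant at the median simulated mass, $y_i^{*} = e^{-2} \approx 0.135$, with $s_i = 0.5$ has $\Pr(y_i < 0 \mid y_i^{*}) = \Phi(-0.27) \approx 0.39$, so negative observations are common in the toy data rather than rare.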
- Does this not mean that negative outcomes should be allowed? After all, negative outcomes have non-zero probability under any Gaussian distribution, no matter the mean.
- Is there a way to define such a model that circumvents this restriction using the latent variable imputation of brms? I tried fitting a multivariate model with the two formulas below, fixing the coefficients for mi(real_mass) and m_err to 1 so that the first formula models the measurement error process. This model has a very hard time converging.
bf_sim1_me <- bf(obs_mass ~ 0 + mi(real_mass), sigma ~ 0 + m_err, family = 'normal')
bf_sim1_dwr <- bf(real_mass | mi() ~ 1, family = brmsfamily("lognormal", link_sigma = "identity"))
I am returning to the Stan/brms forums with a question similar to a previous post that did not yield a working solution (Model lognormal distributed real values with measurement error that yield negative observed values in brms). With the data set in question I keep running into this same problem, so I thought I would give it another shot.
Thank you for any help.