Hi,
I am trying to define a non-linear model, but I am not sure what form it should take. I have a dataset where items belong to class A or B, and, let's say, one predictor X that determines which class an item belongs to. However, I also know that some proportion P of the items have no relation to X and belong to class A or B by accident. At the same time, I do not know how well X predicts class assignment for the non-randomly assigned items, so I cannot just look at the accuracy of the classifier to determine P.
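To make the structure I have in mind concrete, here is a minimal base-R sketch of the class probability and its likelihood. The parameter names (`p_random` for P, `c_a` for the chance that a random item lands in class A, `b0` and `b1` for the logistic part) are my own illustrative choices, and the logistic link is an assumption rather than something I know to be the right form.

```r
# Hypothetical mixture: with probability p_random an item is labelled "a"
# at a fixed rate c_a regardless of x; otherwise the label follows a
# logistic curve in x. All parameter names are illustrative.
mix_prob_a <- function(x, p_random, c_a, b0, b1) {
  p_random * c_a + (1 - p_random) * plogis(b0 + b1 * x)
}

# Bernoulli log-likelihood of 0/1 labels y (1 = class "a") under the mixture.
mix_loglik <- function(y, x, p_random, c_a, b0, b1) {
  p <- mix_prob_a(x, p_random, c_a, b0, b1)
  sum(dbinom(y, size = 1, prob = p, log = TRUE))
}

# With no random items the curve reduces to plain logistic regression;
# with only random items it is flat at c_a.
mix_prob_a(c(-1, 0, 1), p_random = 0, c_a = 0.5, b0 = 0, b1 = 2)
mix_prob_a(c(-1, 0, 1), p_random = 1, c_a = 0.5, b0 = 0, b1 = 2)
```

Written this way, P enters the probability smoothly, so there is no rounding step anywhere in the likelihood.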
Now, is it possible to define this model in brms (Stan)? I tried the last model described here: https://cran.r-project.org/web/packages/brms/vignettes/brms_nonlinear.html, which is somewhat similar to what I expect to be happening. But when I run it on simulated data where I know the proportion of randomly assigned items, the estimates do not reflect that known proportion.
The data could look something like this (in this case we assume x1 determines the class completely for the non-random items):
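library(tibble)
inv_logit <- plogis  # assumption: base-R inverse logit stands in for the helper used here
n <- 900  # items whose class is determined by x1
k <- 100  # items assigned to a class at random
x1 <- inv_logit(rnorm(n + k, 0, 10))
class_t <- ifelse(head(x1, n) > 0.5, "a", "b")
class_r <- ifelse(round(runif(k, 0, 1)) == 0, "a", "b")
df_t <- tibble(class = c(class_t, class_r),
               x1 = x1)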
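As a sanity check on the simulation itself, the known share of random items can be recovered from the disagreement rate, but only because x1 separates the classes perfectly for the non-random items here. The sketch below is self-contained (it regenerates the data with `plogis` and an arbitrary seed, both my own choices) and would not carry over to real data where X is an imperfect predictor.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
n <- 900     # items whose class is determined by x1
k <- 100     # items assigned to a class at random
x1 <- plogis(rnorm(n + k, 0, 10))
class_t <- ifelse(head(x1, n) > 0.5, "a", "b")
class_r <- ifelse(runif(k) < 0.5, "a", "b")
cls <- c(class_t, class_r)

# Share of items whose class matches the deterministic rule x1 > 0.5.
agree <- mean(cls == ifelse(x1 > 0.5, "a", "b"))

# Random items break the rule about half the time, so the disagreement
# rate is roughly P / 2 and doubling it gives a rough estimate of P.
p_hat <- 2 * (1 - agree)
p_hat
```

The estimate should land near k / (n + k) = 0.1, which is the value I would expect a correctly specified model to recover.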
I also tried something like this:
library(brms)
model <- brm(bf(class ~ round(b2) * inv_logit(b1) + round(1 - b2) * c
, b2 ~ 1
, c ~ 1
, b1 ~ 1 + x1
, nl = TRUE)
, data = df_t
, family = bernoulli("identity")
, prior = c(prior(normal(-4, 3), nlpar = "b1", lb = -10, ub = 0)
, prior(beta(1, 1), nlpar = "b2", lb = 0, ub = 1)
, prior(beta(1, 1), nlpar = "c", lb = 0, ub = 1)
)
, iter = 4000, warmup = 1000, chains = 4, cores = 4
, seed = 123
, control = list(adapt_delta = 0.9)
)
But that also does not seem to do what I want. Any help would be greatly appreciated.
Thanks!