I am trying to define a non-linear model, but I am not sure what form it should take. I have a dataset in which each item belongs to class A or B, and, let's say, one predictor X that determines which class an item belongs to. However, I also know that some proportion P of the items have no relation to X and belong to class A or B by accident. At the same time, I do not know how well X predicts class assignment for the non-randomly assigned items, so I cannot just look at the accuracy of the classifier to determine P.
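To make the structure concrete, this is roughly the form I have in mind. It is only a sketch: the logistic dependence on X and the symbols c (the chance that a randomly assigned item lands in class A), \beta_0 and \beta_1 are my own assumptions, not something I have verified.

```latex
\Pr(\text{class} = A \mid X)
  = P \, c + (1 - P) \, \operatorname{logit}^{-1}(\beta_0 + \beta_1 X),
\qquad 0 \le P \le 1, \quad 0 \le c \le 1
```

Here P is the proportion of randomly assigned items that I want to estimate, and the second term describes the items whose class actually depends on X.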
Now, is it possible to define this model in brms (Stan)? I tried the last model described here: https://cran.r-project.org/web/packages/brms/vignettes/brms_nonlinear.html, which is somewhat similar to what I expect to be happening. However, when I run it on simulated data where I know the proportion of randomly assigned items, the estimates do not reflect that known proportion.
The data could look something like this (in this case we assume x1 completely determines the class of the non-random items):
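library(tibble)

n <- 900  # items whose class is determined by x1
k <- 100  # items assigned to a class at random
x1 <- plogis(rnorm(n + k, 0, 10))  # plogis() is base R's inverse logit
class_t <- ifelse(head(x1, n) > 0.5, "a", "b")
class_r <- ifelse(round(runif(k, 0, 1)) == 0, "a", "b")
df_t <- tibble(class = c(class_t, class_r), x1 = x1)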
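As a sanity check on the simulation, here is a small base R sketch comparing the known proportion of random items with what plain classification accuracy would suggest. The seed and the names `pred`, `p_true`, `acc` and `p_from_acc` are my own additions. The back-calculation `2 * (1 - acc)` assumes that x1 is a perfect predictor for the non-random items and that random items are split 50/50, which I cannot assume for real data.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

n <- 900  # items whose class is determined by x1
k <- 100  # items assigned to a class at random
x1 <- plogis(rnorm(n + k, 0, 10))
class_t <- ifelse(head(x1, n) > 0.5, "a", "b")
class_r <- ifelse(round(runif(k, 0, 1)) == 0, "a", "b")
class <- c(class_t, class_r)

# Class predicted by the deterministic rule alone
pred <- ifelse(x1 > 0.5, "a", "b")

p_true <- k / (n + k)        # known share of randomly assigned items
acc <- mean(pred == class)   # observed agreement between rule and class

# Valid only under the assumptions stated above (perfect rule, 50/50 noise)
p_from_acc <- 2 * (1 - acc)

print(c(p_true = p_true, accuracy = acc, p_from_acc = p_from_acc))
```

In this simulation the accuracy should land near 1 - P/2, so P can be recovered from it. As soon as x1 is an imperfect predictor for the non-random items, the two sources of error are confounded, which is why I want the model to estimate P directly.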
I also tried something like this:
library(brms)

model <- brm(
  bf(class ~ round(b2) * inv_logit(b1) + round(1 - b2) * c
     , b2 ~ 1        # switch between the x1-driven term and the constant c
     , c ~ 1         # class probability for randomly assigned items
     , b1 ~ 1 + x1   # linear predictor for items that depend on x1
     , nl = TRUE)
  , data = df_t
  , family = bernoulli("identity")
  , prior = c(prior(normal(-4, 3), nlpar = "b1", lb = -10, ub = 0)
              , prior(beta(1, 1), nlpar = "b2", lb = 0, ub = 1)
              , prior(beta(1, 1), nlpar = "c", lb = 0, ub = 1)
  )
  , iter = 4000, warmup = 1000, chains = 4, cores = 4
  , seed = 123
  , control = list(adapt_delta = 0.9)
)
But that also does not seem to do what I want. Any help would be greatly appreciated.