Hello!
I am planning to compare two proportions (a test of proportions) using brms, but I am having trouble selecting priors for my regression parameters. The categorical predictor is “type”, which refers to two types of boxes, Protein and Mixed, named for the type of food each contains. For Protein there were 41 successful events out of 88 trials, and for Mixed there were 41 successful events out of 90 trials.
I had carried out a frequentist “prop.test” earlier, which gave estimates of about 0.466 for prop1 (41/88) and 0.456 for prop2 (41/90). Other values are: chi-squared = 7.83e-31, df = 1, p-value = 1 (I suspect a separation problem here). My Bayesian model looks like this:
library(brms)

# Aggregated data: one row per box type
indieat9 <- data.frame(
  type = c("protein", "mixed"),
  yes = c(41, 41),
  total_trials = c(88, 90)
)
prior9 <- set_prior("normal(0,1)", class = "b")
pvsm9 <- brm(
  yes | trials(total_trials) ~ 0 + type,
  data = indieat9, family = binomial(link = "logit"),
  save_all_pars = TRUE, warmup = 10, iter = 5000, chains = 5,
  prior = prior9, sample_prior = TRUE, inits = "random", cores = 4,
  seed = 123
)
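For reference, the frequentist check mentioned above can be reproduced with something like the following. This is a minimal sketch assuming the default `prop.test` settings (including the continuity correction); the object name `freq_test` is only illustrative.

```r
# Counts as given in this post: successes and trials per box type
yes <- c(protein = 41, mixed = 41)
total_trials <- c(protein = 88, mixed = 90)

# Two-sample test of equal proportions (base R, stats package)
freq_test <- prop.test(x = yes, n = total_trials)

# Sample proportions, i.e. 41/88 and 41/90
round(unname(freq_test$estimate), 3)

# Test statistic and p-value
freq_test$statistic
freq_test$p.value
```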
Apart from the normal(0,1) prior used above, I also tried cauchy(0,2.5), as suggested by Gelman for logistic regression parameters, and beta(1,1). I am trying to use weakly informative priors for the categorical predictor, “type”. But whenever I run “pp_check(pvsm9, nsamples = 100)” for a posterior predictive check, the graph I get looks bad (shown in the attached picture). It is all haywire and all over the map, so to speak.
My limited know-how tells me that this is probably because of the priors (I could be wrong). So my questions are:
- Am I correct in diagnosing the problem as a poor choice of priors?
- If yes, what sort of priors would you suggest? (I am pretty sure my parameter estimates won’t exceed 3, which is why I had thought of normal(0,1) earlier.)
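To make the second question more concrete, here is a rough sketch of what a normal(0,1) prior on a logit-scale coefficient implies on the probability scale. It uses only base R, the variable names are illustrative, and it assumes that with the `0 + type` coding each coefficient is the log-odds of success for one box type.

```r
# Central 95% interval of a normal(0, 1) prior on the log-odds scale
logit_interval <- qnorm(c(0.025, 0.975), mean = 0, sd = 1)

# Map the interval to the probability scale with the inverse logit
prob_interval <- plogis(logit_interval)
round(prob_interval, 2)

# Observed proportions on the logit scale, for comparison with the prior
observed_logit <- qlogis(c(protein = 41 / 88, mixed = 41 / 90))
round(observed_logit, 2)
```

If I am reading this correctly, the prior puts about 95% of its mass between success probabilities of roughly 0.12 and 0.88, and both observed log-odds sit close to zero, well inside that range.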
Context for my experiment: I am looking at whether free-ranging dogs in a group eat more from the protein box than from the mixed box. This is the first time such an experiment has been carried out with these dogs (whose behaviour has been shown to differ from that of pet dogs). I had previously done an experiment with the same setup with these dogs, but individually. So the difference between the earlier experiment and the current one is individual dogs vs. dogs in groups, and I want to see whether being in a group affects their eating choices in any way. I could therefore use data from the earlier experiment to form my priors.
Context about me: I am fairly new to Bayesian statistics (I have been studying it for only a month and a half), so my knowledge is pretty limited. I am a biology person with a minimal math background. I am open to learning, but math-heavy jargon does tend to go over my head. This is my first question here, so apologies if I have posted something wrong or incomplete. Do let me know if so.
Thanks.