Assigning priors separately

Hello,
I am very new to Bayesian statistics and I find it really amazing. I ran into the same error and it brought me here.
I don't know if I am allowed to ask my question here or whether I should post it somewhere else.
I am using stan_glmer and I have 31 predictors, both numerical and categorical.

How do I assign priors individually? Let me elaborate.
Say I have a numerical predictor A with prior normal(.5, .1), a categorical variable B with prior B ~ normal(location = c(.5, .5, .5), scale = c(.1, .1, .1)), and a predictor C ~ normal(location = c(0, .2), scale = c(.1, .1)).
Is this possible, and will I get an appropriate result if I write it like this:
a.persisted.no = normal(-3, .3)
b.sex.male = normal(.5, .10)
c.Acad_Prog.art = normal(location = c(.5, .5, .5, .5, .5, .5, .5),
                         scale = c(.02, .02, .02, .02, .02, .02, .02))
my.prior = c(a.persisted.no, b.sex.male, c.Acad_Prog.art)

Bayes.00 = stan_glmer(Retained ~ Persisted + SEX +
                        (1 | Acad_Prog),
                      data = Full[which(Full$Status == "Train"), ],
                      prior = my.prior, prior_intercept = normal(2, .3),
                      family = binomial(link = "logit"), chains = 1)

If I have a categorical variable with 6 levels, do I assign a multinomial distribution with 6 levels or with 5, since one of the levels is my reference? I assume 6.
I have read the rstanarm help but I can't find my answer.
Will you please help me?

The first thing almost works but not quite. You can do

my_prior_mean <- c(-3, rep(.5, times = 8))
my_prior_sd <- c(.3, .1, rep(.02, times = 7))
Bayes_00 <- stan_glmer(..., prior = normal(my_prior_mean, my_prior_sd))

Do you mean the categorical variable is a predictor or the outcome? If it is a predictor, R will make 5 dummy variables relative to the reference category, but you should put a continuous prior on the 5 coefficients. If it is an outcome, the rstanarm package does not currently support any multinomial model. You can do it with brm() in the brms package or do it yourself in Stan code, in which case categorical_logit_lpmf takes a first argument whose levels can be 1 through 6.
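
For the multinomial-outcome case, a brms call along these lines should work. This is only a sketch: d, y (a factor with 6 levels), x1, and x2 are hypothetical names, not from your model.

library(brms)
# Categorical (multinomial logit) outcome: y is a factor with 6 levels
fit_cat <- brm(y ~ x1 + x2,
               data = d,
               family = categorical(link = "logit"),
               chains = 1)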


Oh thank you so much for replying so fast. I appreciate it.

@bgoodri
I have one predictor whose distribution is not normal (shown in the image below):

Is it wrong if I assign a normal prior? Do you have any suggestions?
I do not know much about priors or which ones to assign.

Thank you

The marginal distribution of a predictor has no particular relationship to the prior distribution on the coefficient for that predictor. You are basically just specifying mu = alpha + beta * Persisted, and the prior on beta reflects what you believe about the slope. That said, in rstanarm, if you use normal() for a prior with the default argument autoscale = TRUE, it will in effect reinterpret your prior in terms of standard deviations, in which case the standard deviation of the predictor is relevant to the prior; but you can choose any prior distribution that you want. The normal distribution is fine in simple models, but for models with more predictors you want to look into a prior distribution that is more concentrated at zero, such as hs() and hs_plus(). See their documentation for explanations and references.
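
For example, something along these lines (a sketch only; the formula and data names are the ones quoted earlier in this thread, and the prior scales are arbitrary):

library(rstanarm)

# Normal prior on the coefficients; with autoscale = TRUE the scales are
# rescaled by the standard deviations of the predictors.
fit_normal <- stan_glmer(Retained ~ Persisted + SEX + (1 | Acad_Prog),
                         data = Full[which(Full$Status == "Train"), ],
                         prior = normal(0, 2.5, autoscale = TRUE),
                         prior_intercept = normal(2, .3),
                         family = binomial(link = "logit"), chains = 1)

# Hierarchical shrinkage (horseshoe) prior, more concentrated at zero,
# which can be preferable when there are many predictors.
fit_hs <- update(fit_normal, prior = hs())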

I really appreciate your responses. I studied it and your comment makes total sense to me. I guess I have my result now.
Now I want to predict on my future data set using posterior_linpred, but I get this error:

Error: Invalid grouping factor specification,

Do you know what is happening?

My guess is that the grouping variable in the future dataset is somehow different than in the present dataset, but I would need to see the syntax and output to be sure.
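
One check worth running (a sketch, reusing the Full / Status / Acad_Prog names from earlier in the thread; the fitted model object is a placeholder here):

train_data  <- Full[Full$Status == "Train", ]
future_data <- Full[Full$Status != "Train", ]

# Grouping-factor levels that appear in the future data but not in training
setdiff(unique(future_data$Acad_Prog), unique(train_data$Acad_Prog))

# The newdata passed to posterior_linpred() must contain a column with the
# same name and type as the grouping variable in the formula, e.g.
# preds <- posterior_linpred(fit, newdata = future_data)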


Dear Ben,
Will you please look at my code below?

n = dim(Full[which(Full$Status == "Train"), ])[1]  # number of training observations
p = dim(Full[which(Full$Status == "Train"), ])[2]  # number of columns in the training data
p0 <- 4  # prior guess for the number of relevant variables
tau0 <- p0/(p - p0) * 1/sqrt(n)  # global scale implied by that guess
hs_prior <- hs(df = 1, global_df = 1, global_scale = tau0)
t_prior <- student_t(df = 7, location = 0, scale = 2.5)  # used below as prior_intercept

Then,

model.1 = stan_glmer(…,
                     data = Full[which(Full$Status == "Train"), ],
                     prior = hs_prior, prior_intercept = t_prior,
                     seed = 0, adapt_delta = 0.999,
                     family = binomial(link = "logit"), chains = 3)

This is the model I fit following your suggestion of the hs() prior, and the results make sense for my data. However, I am having a hard time explaining my choice of prior to the people I work with, and I also do not fully understand how the prior choice actually works.

1. Why is adapt_delta = 0.999 the only setting for which my model worked and converged?
2. Does p0 mean the number of collinear variables?
3. Can you explain a little bit about global_df and global_scale? Is there a source I can read more about this, so I can confidently talk about my model?

Best
Sima

We recommend posterior_vs_prior and / or other plots of what the prior implies (set prior_PD = TRUE to draw from the prior without conditioning on the data). Trying to explain what the prior is or does is mostly a lost cause for many audiences.
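
For instance (a sketch, assuming model.1 is the stan_glmer fit shown above):

# Compare posterior draws of the regression coefficients to draws from the prior
posterior_vs_prior(model.1, pars = "beta")

# Refit sampling from the prior only (no conditioning on the data),
# then inspect what the prior alone implies
prior_only <- update(model.1, prior_PD = TRUE)
summary(prior_only)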

  1. The hierarchical shrinkage priors have heavy tails, so you have to take small steps in order not to diverge when you go from the mode toward the tail (see the sketch after this list).
  2. I think p0 is intended to be the number of variables you expect to have a nonzero effect.
  3. It is @avehtari’s paper https://projecteuclid.org/euclid.ejs/1513306866
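
If it helps, a quick way to check whether raising adapt_delta removed the divergences (a sketch, assuming model.1 is the fit above):

library(rstan)
# Report the number of divergent transitions in the underlying stanfit object
check_divergences(model.1$stanfit)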

Thank you. I am reading the paper.

Best
Sima