Specifying Priors for Ordinal (adjacent category) Model

torkar · May 12, 2020, 2:28pm

Does this perhaps happen for some type of priors? If yes that’s an indication of prior combinations that are not good.

JLC · May 12, 2020, 2:30pm

It did, indeed.

I did find a workaround by calling the bayseplot function directly and omitting NAs

ppc_bars(y = Emo_Data$Valence,
                 yrep = posterior_predict(Valence_BRMS_acat_prior_Mod,
                                          nsamples = 100) %>%
                         na.omit()) +
        scale_x_continuous(limits = c(0,10), breaks = seq(0,9,1)) +
        theme_bw() +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor = element_blank(),
              axis.line = element_line(colour = "black")
              )

torkar · May 12, 2020, 2:32pm

Ah, but be careful. The NAs could indicate that your priors are too broad!

torkar · May 12, 2020, 2:33pm

See this post where Aki explains is nicely:

JLC · May 12, 2020, 3:14pm

Thanks!

I did read that post when searching for possible approaches. I’m just filtering the NAs as a hack to try to see what, if any, changes result from modifying the priors.

It seems no matter what I do, there’s a horseshoe:

priors_acat <- c(
        # set_prior("student_t(3, 0, 2.5)",
        #           class = "b"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "condition80"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionCP"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionNFC"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "groupHI"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryunpleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryneutral"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_high"),
        
        set_prior("normal(-2,0.05)",  
                  class = "Intercept",
                  coef = "1"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "2"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "3"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "4"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "5"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "6"),
        
        set_prior("normal(0,0.1)",  
                  class = "Intercept",
                  coef = "7"),
        
        set_prior("normal(2, 0.05)",  
                  class = "Intercept",
                  coef = "8"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "participant",
                  coef = "Intercept"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "group:condition",
                  coef = "Intercept")
        
)

Gives this:

And a large change (increasing the location from 0 to 5 for intercepts 4,5,and 6), still gives the same shape on predictions:

priors_acat <- c(
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "condition80"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionCP"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionNFC"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "groupHI"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryunpleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryneutral"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_high"),
        
        set_prior("normal(2,2)",  
                  class = "Intercept",
                  coef = "1"),
        
        set_prior("normal(2,1)",  
                  class = "Intercept",
                  coef = "2"),
        
        set_prior("normal(4,1)",  
                  class = "Intercept",
                  coef = "3"),
        
        set_prior("normal(5,1)",  
                  class = "Intercept",
                  coef = "4"),
        
        set_prior("normal(5,1)",  
                  class = "Intercept",
                  coef = "5"),
        
        set_prior("normal(1,1)",  
                  class = "Intercept",
                  coef = "6"),
        
        set_prior("normal(1,1)",  
                  class = "Intercept",
                  coef = "7"),
        
        set_prior("normal(-2,2)",  
                  class = "Intercept",
                  coef = "8"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "participant",
                  coef = "Intercept"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "group:condition",
                  coef = "Intercept")        
)

torkar · May 12, 2020, 4:11pm

Hi

looking at your priors I have to say they look a bit strange. Some are negative some positive, some has a mean that differs a lot from others, etc. Additionally you seem to set a prior for each chef when very often one sets (if one doesn’t have prior information!) one type of prior for all, e.g., \beta, \alpha, etc.

Could you put up your model here, with data if possible so one can try this? Also, acat could be an option, but have you looked into cumulative?

JLC · May 12, 2020, 4:58pm

Thanks again!

I found that when setting a global prior for the intercepts, I ended up with the horseshoe pattern in predictions, so I started trying to specify each intercept directly.

The data are attached here:
Emotion_Data.csv (434.6 KB)

My model is specified as:

# List of needed packages
Pkgs <- c("tidyverse", "magrittr", "brms", 
          "strengejacke", "rstan", "bayesplot", "parallel", 
          "bayestestR", "tidybayes", "modelr")

# Load packages
lapply(Pkgs, require, c = T)

## Set computing options
ncores = detectCores()
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

# Read in data
Emo_Data <- read.csv("Emotion_Data.csv") %>%
        mutate(group = factor(group, 
                              levels = c("NH", "HI")),
               condition = factor(condition, 
                                  levels = c("60", "80", "CP", "NFC")),
               new_category = factor(new_category, 
                                     levels = c("unpleasant_high", "unpleasant_low", 
                                                "neutral", "pleasant_low", "pleasant_high"))) %>%
        filter(!is.na(Valence))

# Model formula
Emo_Mod_fmla <- bf(Valence ~ group +
                           condition +
                           new_category +
                            (1|participant) +
                           (1|new_category/token) +
                           (1|group/condition))

# Set priors
priors_acat <- c(
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "condition80"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionCP"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionNFC"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "groupHI"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryunpleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryneutral"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_high"),
        
        set_prior("normal(1, 0.01)",  
                  class = "Intercept",
                  coef = "1"),
        
        set_prior("normal(2,0.01)",  
                  class = "Intercept",
                  coef = "2"),
        
        set_prior("normal(3,0.01)",  
                  class = "Intercept",
                  coef = "3"),
        
        set_prior("normal(4,0.01)",  
                  class = "Intercept",
                  coef = "4"),
        
        set_prior("normal(5,0.01)",  
                  class = "Intercept",
                  coef = "5"),
        
        set_prior("normal(6,0.01)",  
                  class = "Intercept",
                  coef = "6"),
        
        set_prior("normal(7,0.01)",  
                  class = "Intercept",
                  coef = "7"),
        
        set_prior("normal(8,0.01)",  
                  class = "Intercept",
                  coef = "8"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "participant",
                  coef = "Intercept"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "group:condition",
                  coef = "Intercept"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "new_category:token",
                  coef = "Intercept")        
)


Valence_BRMS_acat_Mod <-
        brm(Emo_Mod_fmla,
            data = Emo_Data,
            family = "acat",
            prior = priors_acat,
            inits = 0,
            iter = 2000,
            warmup = 1000,
            chains = 2,
            cores = ncores,
            control = list(adapt_delta = 0.9,
                           max_treedepth = 12)
        )

With the priors for intercepts set fairly tightly to the response category boundaries, I get the following with a prior-only predictive check:

edit to note that the model takes a very long time to run!

abartonicek · May 13, 2020, 12:42am

As @torkar pointed out, you might be better off specifying the cumulative family as opposed to acat. The way you’ve set up your priors on the thresholds would make sense if they were partitioning a latent variable (i.e. what cumulative does). I’ve no idea what the priors mean in relation to the acat model, where afaik you’re modeling the relative probability between category k and k + 1.

Additionally, your priors for standard deviations are cauchy, and the thick tails might be causing problems. Thinking about it, that could be the reason why you get a lot of extreme observations via prior predictive checks - because your model might be putting a lot of density into some really weird areas of the parameter space. Try setting the priors to normal or student-t with 3+ degrees of freedom and see what happens.

JLC · May 13, 2020, 1:43am

Thanks to both you and @torkar

Perplexingly, I get the same results with narrower priors on the sd’s and the cumulative family:

# Set verbose priors
priors <- c(
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "condition80"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionCP"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionNFC"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "groupHI"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryunpleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryneutral"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_low"),

        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_high"),

        set_prior("normal(1, 0.1)",  
                  class = "Intercept",
                  coef = "1"),
        
        set_prior("normal(2,0.1)",  
                  class = "Intercept",
                  coef = "2"),
        
        set_prior("normal(3,0.1)",  
                  class = "Intercept",
                  coef = "3"),
        
        set_prior("normal(4,0.1)",  
                  class = "Intercept",
                  coef = "4"),
        
        set_prior("normal(5,0.1)",  
                  class = "Intercept",
                  coef = "5"),
        
        set_prior("normal(6,0.1)",  
                  class = "Intercept",
                  coef = "6"),
        
        set_prior("normal(7,0.1)",  
                  class = "Intercept",
                  coef = "7"),
        
        set_prior("normal(8,0.1)",  
                  class = "Intercept",
                  coef = "8"),
        
        set_prior("cauchy(1,1)",  
                  class = "sd",
                  group = "participant",
                  coef = "Intercept"),
        
        set_prior("cauchy(0,1)",  
                  class = "sd",
                  group = "group:condition",
                  coef = "Intercept")        
)

Cumulative_Mod <-
        brm(Emo_Mod_fmla,
            data = Emo_Data,
            family = cumulative(link = "logit", 
                                threshold = "flexible"),
            prior = priors
            inits = 0,
            iter = 1000,
            warmup = 500,
            chains = 2,
            cores = ncores,
            sample_prior = "only",
            control = list(adapt_delta = 0.8,
                           max_treedepth = 12)
        )

Cumulative_Narrow

I get the same response is I set global priors for the cumulative family model with:

priors_short <- c(
        set_prior("normal(0,2)",
                  class = "b"),
        
        set_prior("normal(0,10)",
                  class = "Intercept"),
        
        set_prior("normal(0,0.5)",
                  class = "sd")
)

abartonicek · May 13, 2020, 1:51am

Random effects might also be causing trouble. I see you have random intercepts across quite a few variables. Have you tried reducing the random effects structure & identifying where you get the problem?

JLC · May 13, 2020, 4:13am

After much trial and tribulation, I realized I wasn’t paying attention to the log scale of the priors…

I now have something that looks more reasonable, in my opinion:

sorted

abartonicek · May 13, 2020, 5:02am

What priors are you using now? For future reference.

Also, keep in mind that priors shouldn’t be decided on based on the distribution of the observed data (most of the time). Often, without informative theoretical expertise, they should just be chosen to provide sensible constraints on the parameter space & nothing more. So unless you have a prior theoretical reason to assume that picture 5 is going to be chosen more often than others, etc…, I don’t think it’s a good idea to have priors with informed location like the ones in your picture.

JLC · May 13, 2020, 3:56pm

Thanks again! This community is fantastic!

I have landed on these priors, at least for now:

priors <- c(
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "condition80"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionCP"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "conditionNFC"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "groupHI"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryunpleasant_low"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categoryneutral"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_low"),
        
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "new_categorypleasant_high"),
        
        set_prior("normal(-4,0.01)",  
                  class = "Intercept",
                  coef = "1"),
        
        set_prior("normal(-1.75,0.75)",  
                  class = "Intercept",
                  coef = "2"),
        
        set_prior("normal(-0.75, 0.75)",  
                  class = "Intercept",
                  coef = "3"),
        
        set_prior("normal(0, 0.75)",  
                  class = "Intercept",
                  coef = "4"),
        
        set_prior("normal(1,0.75)",  
                  class = "Intercept",
                  coef = "5"),
        
        set_prior("normal(1.75,0.75)",  
                  class = "Intercept",
                  coef = "6"),
        
        set_prior("normal(2.25,0.75)",  
                  class = "Intercept",
                  coef = "7"),
        
        set_prior("normal(4,0.01)",  
                  class = "Intercept",
                  coef = "8"),
        
        set_prior("normal(0,0.5)",  
                  class = "sd",
                  group = "participant",
                  coef = "Intercept"),
        
        set_prior("normal(0,0.5)",  
                  class = "sd",
                  group = "group:condition",
                  coef = "Intercept"),
        
        set_prior("normal(0,0.5)",
                  class = "sd",
                  group = "new_category:token",
                  coef = "Intercept")
        
)

Which looks like this in a prior predictive check:

pp_check_fin

While there is some theory to support the 5th picture, which is neutral, as being the most likely, my goal was just to try to shape the priors so that the probabilities were closer to even across the response choices.

At default, the priors suggest a much higher probability of the extremes, which isn’t supported by theory.

With the log-scale and inter-dependence of intercepts/thresholds, it took some puzzling out to find a set that looks relatively flat.

Any feedback on whether or not what I’ve done is reasonable is very welcome!

Topic		Replies	Views
Priors for mixture model with ordinal indicators brms	5	1588	February 20, 2020
Priors for category specific terms in ordinal logistic regression brms	1	569	June 22, 2021
Question about family = cumulative in brms Modeling ordinal-response , priors , brms	5	1540	April 4, 2023
Repeated measures ordinal model with brms - how to specify correlation structure brms specification , ordinal-response	1	1430	April 19, 2020
Brms: Prior setting for IRT 2PL ordinal models Modeling brms	4	584	May 15, 2021

Specifying Priors for Ordinal (adjacent category) Model

Related topics