Results from monotonic effect versus dummy variables in brms

I used pp_check and both models behaved well; I also checked the traceplots, and they are good as well.
I need to practice simulation. Thanks for your advice.
On a different note, can I get individual-based posterior estimates? I mean a posterior estimate for each person?
Thanks again.

If I understand the model above correctly, then no. If you want to model each individual I believe you need to use a varying intercept, i.e., (1 | individual). Perhaps @martinmodrak can provide some insights.

What I understood about a varying intercept is that it provides an overall “sd” for the model, not one for each individual.
In lme4 it is possible to perform post hoc analyses and store individual prediction values or estimates. Not sure if this is possible with brms or other Bayesian models.

Well, if you have a varying intercept then you will be able to look at ranef(Model) for each individual.
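As a rough sketch of what that would look like (the model formula and the data/variable names here, `outcome`, `predictor`, `my_data`, `individual`, are placeholders, not the poster's actual model):

```r
library(brms)

# Hypothetical model with a varying intercept per individual
fit <- brm(outcome ~ predictor + (1 | individual), data = my_data)

# Posterior summaries of the individual-level deviations from the
# population-level intercept (one row per individual)
ranef(fit)$individual

# coef() combines population-level and individual-level parts,
# giving the per-individual intercepts directly
coef(fit)$individual
```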

Great, thanks. I will run the model, see how it goes, and update here.
Thanks again for all.

Yes (if I understand the question correctly), you can use posterior_predict or posterior_linpred (the former includes the observation noise, the latter only the model uncertainty). You can also predict for new data. One method I’ve found useful in some cases is to make predictions for each person while varying only the factor of interest - see PPC for ordered_logistic? for a bit more detail and ideas from others.

In Bayesian models it is usually beneficial to work with all the posterior draws instead of just the mean estimate, although this can make the post processing a bit more involved. predict and fitted methods of brmsfit also offer setting summary = TRUE to get a single estimate if that’s what you prefer.
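A minimal sketch of these options, assuming `fit` is an already-fitted brmsfit object (the object name is illustrative):

```r
library(brms)

# Raw posterior draws: one matrix of draws x observations
pp <- posterior_predict(fit)  # includes observation noise
lp <- posterior_linpred(fit)  # linear predictor only, i.e. model uncertainty

# Single per-observation summaries (posterior mean + credible interval)
# instead of the full set of draws
fitted(fit, summary = TRUE)   # expected response, no observation noise
predict(fit, summary = TRUE)  # predicted response, with observation noise
```

Working with the full draw matrices keeps all the posterior uncertainty available for any downstream computation, such as summing predictor contributions per person.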

I believe @torkar might have interpreted your query as how to model individual-level variation, instead of obtaining individual-level prediction which might have led to some confusion (unless I am the one interpreting the conversation wrong :-) )

The difference in the model coefficients themselves is slightly surprising, but hard to interpret on its own as the intercepts have also changed noticeably. As you have correctly noted, using prediction is usually better for comparing differences.

Best of luck with your model!


You got it right. I need individual estimates. The reason is that I add up the estimates of various predictors to create a total score for each person. Thanks for the hints; I need to learn them for the first time. Thank you both for your time and help.

Still, @torkar’s suggestion may work: consider each individual as a level! It just needs more chains, as the ids are not repeated!!

Do you have only one measurement/person? If that’s the case I believe it would not work.

Yes, in the models above it’s one measurement per person. The estimates we get are mean estimates; I need a personalized estimate. I will try predict.

While I don’t think you should include a varying intercept per person unless you have good reason to (e.g. from PP checks or theory), I think there is a small confusion about how that would work.

In my experience non-repeating ids are (usually) not a problem for Bayesian models, and I have fitted individual-level models successfully in brms on default settings (I don’t think there is any reason to run more chains), although all of them used different families than cumulative.

The theoretical reason is that having a varying intercept for each observation is actually the same as using a more “overdispersed” distribution for the observation model. AFAIK the only commonly used distribution where such an “overdispersed” alternative does not exist, and the model ceases to be identifiable, is the Bernoulli (logistic regression). Still, it is possible that for your particular dataset and model a varying intercept for each observation will introduce computational issues with the cumulative family, and it will likely take much longer to fit. I however don’t see an a priori reason why it should not be possible. It is also not clear whether it would actually be of any use, in particular whether it would be better than changing the link function to, say, cauchit.
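For comparison, the heavier-tailed link alternative is only a one-argument change (a sketch, reusing the `test_data` and `response_ordinal` names from the simulation code further down the thread):

```r
library(brms)

# Hypothetical alternative: instead of a per-observation varying
# intercept, use a cauchit link, whose heavier tails can absorb
# some extra dispersion in the latent variable
fit_cauchit <- brm(response_ordinal ~ 1, data = test_data,
                   family = cumulative(link = "cauchit"))
```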

But you have motivated me to do a little experiment of my own using an individual-level intercept with a cumulative response, and it works more often than not, recovering the parameters from simulated data, although it needs a slightly increased adapt_delta and not-too-wide priors to avoid computational issues (the effective sample size per iteration is on the low end of acceptable).

Here’s the code:

library(brms)
library(dplyr)

options(mc.cores = parallel::detectCores())

N <- 100
N_levels <- 6
# Cutpoints drawn from a student-t and sorted to be increasing
intercepts <- sort(rt(N_levels - 1, 3))
varying_sd <- abs(rnorm(1, 0, 1))
test_data <- data.frame(id = 1:N, varying_effect = rnorm(N, 0, varying_sd)) %>%
  mutate(response_raw = rlogis(N) + varying_effect,
         # The response could probably be simulated in a more clever way
         response_ordinal = case_when(response_raw < intercepts[1] ~ 1,
                                      response_raw < intercepts[2] ~ 2,
                                      response_raw < intercepts[3] ~ 3,
                                      response_raw < intercepts[4] ~ 4,
                                      response_raw < intercepts[5] ~ 5,
                                      TRUE ~ 6))

# Priors matching how I simulated the data
priors <- c(set_prior("normal(0, 1)", class = "sd"),
            set_prior("student_t(3, 0, 1)", class = "Intercept"))
fit <- brm(response_ordinal ~ (1 | id), data = test_data,
           family = cumulative(), prior = priors,
           control = list(adapt_delta = 0.95))


Thanks for all the explanation and the experiment. I will play with my data accordingly and update here if anything interesting comes up.
Thanks again @martinmodrak and @torkar

Hi @torkar,

I have been trying for a long time now to find some literature on this claim. Usually one speaks of ~2 SE being considered the “threshold”, but I haven’t found a paper claiming that.

Sorry for jumping in here.


Agreed, I’ve seen it being used in different circumstances and it varies between 2 and 6 SE. I should’ve written more explicitly:

[1] -23.576  -1.624

indicates that, relatively speaking, fit2 is better at the z_{95\%} level. When I work with these things myself I often use 4 SE.


Same here.

So far all my knowledge is based on forum entries where people argue about where to draw the threshold!
I was just wondering if you happen to know of any literature explicitly arguing for a certain threshold.


No, but if someone knows of a reference in this case (concerning model selection) it would be @avehtari :)

I did find some older comments of his on this matter (link below). But it was quite some time ago, so maybe there is a more solid idea in 2020?


So, using z_{99\%} to stay on the safe side (and I usually have \gg n in my samples).

SE assumes that the normal approximation works well for the expected difference. If it works, then you can do whatever you would do with a normal distribution. There is no recommended threshold for making a binary decision, as there is no single value that would be good for all cases, and it is better to report the whole uncertainty instead of a thresholded binary value (we don’t want new “p-value almost 0.05, approaching significance” issues). You are free to make your decisions based on the uncertainty, but please report the whole uncertainty. Another issue is that this normal approximation is not perfect. Some of the issues were already known, and we’ll soon have a new paper out with more information about the failure modes, diagnostics, and recommendations, which will hopefully answer many of your questions.


Thanks, @avehtari! This was very helpful!
