Brms forest plot

Toovuu · January 20, 2021, 1:06am

Hi

I have a few questions regarding brms and its applications with tidybayes. The models I used are the following, where the parameter behandling_dummy is binary, 0 or 1:

model_1 <- brm(n|trials(N) ~ behandling_dummy + (1 + behandling_dummy|studie), 
             data = Lancet_data, family = binomial(), 
             prior = prior_sc)

model_2 <- brm(n|trials(N) ~ behandling_dummy*respirator_dummy + 
                       (1 + behandling_dummy*respirator_dummy|studie) + (1|infektion), 
             data = Lancet_data, family = binomial(), prior = prior_sc, 
             control = list(adapt_delta = 0.99, max_treedepth = 20))

I am trying to create forest plots with my models using tidybayes and the code I use to do this is taken from:

Instead of the intercept, I wanted study specific effect for behandling_dummy in odds-ratio so I had to change the code accordingly:

study.draws <- spread_draws(model_1, r_studie[studie,behandling_dummy], b_behandling_dummy)%>% 
        mutate(b_behandling_dummy = r_studie + b_behandling_dummy)

pooled.effect.draws <- spread_draws(model_1, b_behandling_dummy)%>% 
        mutate(studie = "Pooled Effect")

forest.data <- bind_rows(study.draws, pooled.effect.draws) %>% 
        ungroup() %>%
        mutate(b_behandling_dummy = exp(b_behandling_dummy)) %>% 
        mutate(studie = str_replace_all(studie, "[.]", " "))%>% 
        mutate(studie = reorder(studie, b_behandling_dummy))

forest.data.summary <- group_by(forest.data, studie) %>% 
        mean_qi(b_behandling_dummy)

ggplot(aes(b_behandling_dummy , relevel(studie, "Pooled Effect", after = Inf)), 
       data = forest.data)  + coord_cartesian(xlim = c(-1, 5)) + 
        geom_vline(xintercept = exp(fixef(model_1)[2, 1]), color = "grey", size = 1) +
        geom_vline(xintercept = exp(fixef(model_1)[2, 3:4]), color = "grey", linetype = 2) +
        geom_vline(xintercept = exp(0), color = "black", size = 1) +
        geom_density_ridges(fill = "blue", rel_min_height = 0.01, col = NA, scale = 1,
                            alpha = 0.8) +
        geom_pointinterval(data = forest.data.summary, size = 1) +
        geom_text(data = mutate_if(forest.data.summary, is.numeric, round, 2),
                  aes(label = glue("{b_behandling_dummy} [{.lower}, {.upper}]"), x = Inf), hjust = "inward") +
        labs(x = "Effect size",
             y = element_blank()) +
        theme_minimal()

My questions are:

Did I change the code to simulate a forest plot correctly according to my needs above?
Is it possible to add each studies weight in the plot?
How do I simulate a forest plot for the second model now that I have interaction with respirator_dummy which is also binary?

This is my first time using this so I hope my questions were clear. Any help is appreciated, thanks.

Operating System: MacOS
brms Version: 2.14.4

martinmodrak · January 30, 2021, 5:16pm

Sorry, it looks like your question fell through a bit, despite being relevant and well written.

Hard to say, there is definitely too much code for me to run in my head. Could you share your plot and summaries of the fits (or the underlying data, so that we can run the code for ourselves)?

I am not an expert on metaanalyses, but I don’t think the concept of weight has a direct counterpart in the Bayesian setting, but you might be able to compute something similar - I discussed this a bit at Questions on multilevel meta-analysis with brms - #4 by martinmodrak feel free to ask for clarifications if that part is not clear.

Tough call. I think you could show a separate forest plot for both conditions? Summarising inferences from bigger problems can be tricky, because there is just too much stuff to show…

Best of luck with your model!

P.S. note that you can use triple backticks (```) to mark code blocks - I edited your post for you.

Toovuu · February 2, 2021, 12:07am

Hello martinmodrak!

Thanks for the reply, it’s very appreciated.

Of course I can share my plots and data.

data651×945 20.1 KB

forest

Also my prior is

prior_sc <- set_prior("normal(-0.24,0.054)", class = "b", coef = "behandling_dummy")

What I am trying to do is to get the odds-ratio of face mask using a response variable with a binomial distribution with the logit link function where behandling_dummy is 1 for mask and 0 for no mask. n is the # infected and N is total # tested, so basically behandling_dummy = 0 is our control group.

Regarding weight. Yes, I understand now and will not add weights to my plot.
I tried to do similarly for interaction but I am pretty sure I’m wrong. My attempt was the following:

study.draws <- spread_draws(model_4, r_studie[studie,], b_behandling_dummy,
                            `b_behandling_dummy:respirator_dummy`)%>% 
        mutate(b_behandling_dummy = b_behandling_dummy + `b_behandling_dummy:respirator_dummy` +
                       r_studie)

and then resuse the above code for the rest.

If possible, can I possibly have a meeting with you through zoom/discord or similar?

Thanks again for the help!

Christopher-Peterson · February 2, 2021, 3:58am

It’s a little difficult to tell what exactly is going into your plot; it may be a good idea to define a few of the geom_vline xintercepts as part of the data frame, instead of as transformations in the plotting code (e.g., as part of a mutate() chain).

I can’t really tell where your geom_pointinterval is appearing in the output plot, I think the best way to include a measure of study weight would be by mapping sqrt(N) to its point_size aesthetic.

Toovuu · February 3, 2021, 11:14am

Hello Christopher-Peterson!

Thanks for your input!

[quote=“Christopher-Peterson, post:4, topic:20318”]
it may be a good idea to define a few of the geom_vline xintercepts as part of the data frame, instead of as transformations in the plotting code (e.g., as part of a mutate() chain).
[/quote]
How do I do this? I thought about extracting the values of ‘pooled effect’ but I do know how to extract an entire row based of a string index, similar to programming dataframe[studie = “pooled effect”] if possible. Or do you have a different solution?

2 [quote=“Christopher-Peterson, post:4, topic:20318”]
I can’t really tell where your geom_pointinterval is appearing in the output plot,
[/quote]
Yes, I have noticed that my interval is non-existent. Is there a solution for this?

[quote=“Christopher-Peterson, post:4, topic:20318”]
I think the best way to include a measure of study weight would be by mapping sqrt(N) to its point_size aesthetic
[/quote]
I’ve decided to not use weight anymore, but thank you for your input.

My questions still remain regarding if I did it correctly to get the odds-ratio of the behandling_dummy parameter and the odds ratio of the same parameter for the second model with interaction.

I also have some additional questions.

How do I remove the grid, I don’t even have grid in my code and it’s there?
Is there a way to change the colour of just the pooled effect?
Before transforming this is to odds-ratio from log-odds, some seemed to have bimodal distributions with two peaks. Is there a reason for this?

Thanks again for your reply!

Toovuu · February 8, 2021, 10:38pm

Hello again!

Regarding my bimodal question. This is what my plot looks like before transforming to odds-ratio.

Rplot

As can be seen from the plot, why does it look as such with two peeks giving the impression of bimodal distribution. The code is still the same as the link with changes mentioned above according to my needs.

My other problems have pretty much been solved.

martinmodrak · February 11, 2021, 4:07pm

This might be an issue with your model, but might also be harmless variability.

There always will be some sampling variability and thus - especiallly if you didn’t draw a lot of samples from the posterior - you would expect some wiggles in the kernel density estimates that should not be taken seriously. Plotting histograms and focusing only on a smaller number of the variables could make it easier to see what is happening (Kernel estimates can very easily be quite misleading).

Toovuu · February 12, 2021, 7:50pm

Hello martinmodrak!

Thanks again for answering.

I tried to create a model for a study which has a bimodal looking density.

Pei <- filter(Lancet_data, studie =="Pei")

model_0 <- brm(n|trials(N) ~ behandling_dummy, 
               data = Pei, family = binomial,
               prior = prior_sc)
plot(model_0, 'b_behandling_dummy')

And this is the output from the plot

Is it possible that since I want the studie-specific parameter behandling_dummy instead of the studie-specific Intercept I have to do the following instead

study.draws <- spread_draws(model_1, r_studie[studie,behandling_dummy], r_studie[studie,Intercept], b_behandling_dummy)%>% 
        mutate(b_behandling_dummy = r_studie[studie,Intercept] + r_studie[studie,behandling_dummy] + b_behandling_dummy)

i.e I also have to add the each studie deviation from the Intercept? Because that doesn’t make sense to me.

Could you also explain why combining geom_pointinterval and geom_density_ridges doesnt work for me?

Thanks again!

martinmodrak · February 16, 2021, 8:03am

This to me looks completely like a an artifact of sampling + the density estimate. A simple way to check would be to refit the model again and do the same plot. I would expect it to look similar, but have the wiggles at different locations. As I noted, histograms are often better for visualising this stuff as they (unlike kernel density) don’t introduce weird artifacts.

Also is your Rhat good? (You would get a warning if it wasn’t)

I am not sure I follow, but the question to ask is “what kind of prediction task am I interested in” - including different terms corresponds to different prediction tasks which are relevant to different scientific questions…

I have no idea, I only occasionally use ggdist or ggridges. But it is simply possible that ggdist and ggridges are not compatible for some reason. You should IMHO be able to build the plot using solely ggdist geoms which should play nicely together. I cannot advise on the specifics as I don’t have enough experience with either package.

Best of luck with your model!

Toovuu · February 16, 2021, 11:59pm

I did plot again and the wiggles are around at the same place.

I also filtered one of the studies which has the bimodal appearance using the following code.

study.draws_pei <- filter(study.draws, studie == "Pei")
hist(study.draws_pei$b_behandling_dummy + study.draws_pei$r_studie)

Which gave me the following output

And yes my R-hat is good and I got no warning regarding divergence or not enough samples.

Basically all I want is the value of the parameter behandling_dummy for each study which is in terms of my data is; the odds-ratio of face-mask for each study.

Thanks again for answering

martinmodrak · February 17, 2021, 7:46pm

OK, so this seems to not be an artifact of the kernel density. I can’t run your code myself, so can’t debug directly, but I would also try to double check for possible bugs.

For example, the mutate part in this code you posted earlier is a bit suspicious to me:

study.draws <- spread_draws(model_1, r_studie[studie,behandling_dummy], r_studie[studie,Intercept], b_behandling_dummy)%>% 
        mutate(b_behandling_dummy = r_studie[studie,Intercept] + r_studie[studie,behandling_dummy] + b_behandling_dummy)

. I am not 100% sure it is wrong, just that it is not a construct I can understand - either it is some advanced tidyverse/tidybayes feature or it does something very weird… (in fact, I cannot figure out what it should do beyond throwing an error)…

Do you get the same results if you try to compute the same value using hypothesis?

Toovuu · February 21, 2021, 9:43pm

Hello again martinmodrak!

I apologize it took so long to respond. I found a solution in a blog post written by A. Solomon Kurz. I had to filter out certain values with the use of the filter function with the following code.

study.draws <- spread_draws(model_1, r_studie[studie,term], b_behandling_dummy)%>% 
        filter(term == "behandling_dummy") %>% 
        mutate(b_behandling_dummy = b_behandling_dummy + r_studie)

Now it gives me a normal looking distribution. I even managed to get both a density with a point interval from the same blogpost by A. Solomon Kurz.

Now I just need the same thing but with my second model with the interaction. How do you write this code?

Best regards and thanks again for answering

martinmodrak · February 25, 2021, 9:24am

I don’t think I understand the question - if you need exactly the same thing, than the same thing should IMHO work. If you actually need something different (maybe somehow combining the main and interaction effect to get all 4 possible conditions?), then you need to be a bit more specific on what is the actual question. There are many possible questions that one can investigate with such a model…

Toovuu · February 26, 2021, 8:58pm

So since my other model is

model_2 <- brm(n|trials(N) ~ behandling_dummy*respirator_dummy + 
                       (1 + behandling_dummy*respirator_dummy|studie) + (1|infektion), 
             data = Lancet_data, family = binomial(), prior = prior_sc, 
             control = list(adapt_delta = 0.99, max_treedepth = 20))

and because I have an interaction with respirator_dummy, I should now have the spread draws like the following

study.draws <- spread_draws(model_2, r_studie[studie,term], b_respirator_dummy, b_behandling_dummy)%>% 
        filter(term != "Intercept") %>% 
        mutate(effect = b_behandling_dummy +  b_respirator_dummy + b_behandling_dummy*b_respirator_dummy + r_studie)

Because I want the respirator_dummy as well which is also binary. So basically the model I want

behandling_dummy + respirator_dummy + behandling_dummy*respirator_dummy

for each study and the overall effect.

martinmodrak · March 11, 2021, 10:39am

Oh, I had a half-written respond and forgot about this. Sorry for the delay.

I see what you mean.

I think you are actually trying to replicate the functionality of posterior_linpred or posterior_epred (or, in tidybayes world add_fitted_draws). What you are trying to compute - if I understood you correctly - is:

for the actual studies you just need posterior predictions for both the response without including the (1|infektion) term
for the overall effect you also want the response without including the (1 + behandling_dummy*respirator_dummy|studie) terms

Both of which should be achievable by using the re_formula argument to include only the group-level terms you need.

Best of luck

Topic		Replies	Views
Plotting logistic regressions with brms Modeling	3	578	December 9, 2023
Brms iformative prior for bionomial regression brms	21	1283	October 8, 2024
Results from monotonic effect versus dummy variables in brms Modeling	27	1533	March 14, 2020
Help with bayesian regression modeling (beginner) Modeling rstan , specification , brms	24	729	February 19, 2024
Binomial-normal (meta-analysis) model in brms Modeling brms , meta-analysis	5	1641	February 11, 2022

Brms forest plot

Related topics