Plotting estimate + raw data, zero inflated poisson

Hi all,

To be able to modify the plot at will I want reproduce the plot obtained with:
plot(marginal_effects(MODEL_NAME), points=T)

As an example I used the data from
https://stats.idre.ucla.edu/r/dae/zip/
https://stats.idre.ucla.edu/stat/data/fish.csv

deleting a couple of extreme points for plot purposes:
fish<-read.csv(“https://stats.idre.ucla.edu/stat/data/fish.csv”)
fish$camper<-as.factor(fish$camper)
fish<-subset(fish, count!=149 & count!=65)

I ran the following ZERO-INFLATED model

mod_pois_zero = brms::brm(count ~ child + camper +(1| persons),
control = list(adapt_delta = 0.999, max_treedepth = 12),
iter = 6000,
cores = 4,
data = fish,
family = “zero_inflated_poisson”,
file=“mod_pois”)

I wrote the following script to plot estimate and raw data as for marginal_effects:

#extract posterior samples
d<-posterior_samples(mod_pois)

#select the variable(s) you want to plot
d_sub<-d[,c("b_Intercept", "b_child")]
#transform it to a matrix
d_sub1<-as.matrix(d_sub)

# to plot my simulation I have to extract the effect and add them to my original dataset
newdat<-data.frame(b_child = seq(-0,3, by = 0.025)) 
#if you choose the same length of your database you can add it easily
#put the entire range of your data

#create the matrix you are going to fill
Xmat <-model.matrix(~ b_child, data=newdat)
fitmat<-matrix(ncol=nrow(d), nrow=nrow(newdat))
#Fill the matrix
for(i in 1:nrow(d)) fitmat[,i] <- exp(Xmat%*%d_sub1[i,]) #poisson model
#for(i in 1:nrow(d)) fitmat[,i] <- Xmat%*%d_sub1[i,]      #gaussian model

#add the credible interval and fit
newdat$lower<- apply(fitmat,1,quantile, prob=0.025)
newdat$upper<-apply(fitmat, 1, quantile, prob=0.975)
newdat$fit<-apply(fitmat, 1, quantile, prob=0.5)

#newdat$fit<-exp(Xmat%*% fixef(mod_pois)) #fixef does not work in brms


#now you can either merge newdat with your original database or plot them directly

plot(fish$count ~fish$child,
     xlab =("b"), ylab = ("a"))
lines(newdat$b_child, newdat$fit, lwd=2)
lines(newdat$b_child, newdat$lower)
lines(newdat$b_child, newdat$upper)

the script works fine for norm and poisson models but the estimates with zero_inflated are wrong.

plot_brms

my_plot

How can I back-transform zero inflated models? clearly exp() is not enough.
Where can I find this information for the rest of the link functions ( e.g. skew normal, etc.)?

On a different note, the default plot plot(marginal_effects(MODEL), points=T) always cut the extreme points, any idea why?

Also if someone has a better strategy to plot raw data and estimates from brms I would be glad to know. Online I could not find any example that I was able to reproduce.

Thanks and best regards

This seems like a brms-related question. I would add the brms tag to your post so that the right people see it (looks like a plot someone will have produced before).

Thanks for the suggestion! I do not have the possibility to insert the brms tag, therefore I moved the question to a different section, is it what you meant?
Cheers and thanks

You can use the fitted method to create mean predictions without having to hand code anything based on fixef or whatever (which is a suboptimal approach in many ways anyway).

1 Like

Thank you for your reply Paul.
I will try to used fitted as suggested and give up on my approach.
Can you suggest on top of your mind blog and/or code to kick-start plotting raw data and linear models?

For categorical I used:
https://mjskay.github.io/tidybayes/articles/tidy-brms.html
For linear I found the following but I could not follow it:
http://rpubs.com/jwesner/gamma_glm

Much appreciated

I wrote the following to plot from fitted as suggested, however, I do not obtain the same values of plot(marginal_effects(MODEL), points=T)

This is likely due to 2 reasons:

  1. I cannot include the contribute of random factors re_formula = NA (I could for this simple example but for a complicated model with multiple random factor I cannot achieve a single length for “newdata”)
  1. I have to set my variable “camper” as having a single value, if I set it as a factor the other variables will take one level at the time and the result is a plot not possible to visualize

Anyone experienced the same difficulties?
Thanks again for any reply

newdata = data.frame(child=seq(from=0, to=3,length.out=248),
                         camper=seq(from=0, to=0,length.out=1))
    posterior_fitted2 <- fitted(mod_pois_zero, newdata=newdata, re_formula = NA,summary=T)
    posterior_fitted_plot2 <- as_tibble(posterior_fitted2) %>% 
      mutate(child = newdata$child) #adds the predictor values from newdata


ggplot()+
  geom_line(data = posterior_fitted_plot2, aes(x=child,y=Estimate, ymax=Q97.5, ymin=Q2.5),size=1.2, color="dodgerblue")+
  geom_ribbon(data = posterior_fitted_plot2, aes(x=child,y=Estimate, ymax=Q97.5, ymin=Q2.5),fill="grey50",alpha=0.2)+
  geom_point(data = fish, aes(x=child,y=count))

fitted

You may need to also set robust = TRUE in fitted.

1 Like

Thank you Paul much appreciated.

I need also to add that I found that in most cases it is not even necessary to use fitted.
Thanks to the fantastic versatility of brms it is possible to extract plotting values from marginal_effects (independently of the model link function).
fitted can be useful when some more customization is necessary.
Thanks again, very neat package.

for example:

marg<-marginal_effects(YOUR_MODEL)

NEWDAT<-marg$YOUR_VARIABLE_OF_INTEREST

ggplot()+
  geom_line(data =  NEWDAT  , aes(x= YOUR_VARIABLE_OF_INTEREST  ,y=estimate__, ymax=upper__ , ymin=lower__),size=1.3, color="blue")+
  geom_ribbon(data =  NEWDAT  , aes(x= YOUR_VARIABLE_OF_INTEREST ,y=estimate__, ymax=upper__, ymin=lower__),fill="skyblue4",alpha=0.3)+
  geom_point(data = ORIGINAL_DATASET, aes(x=ORIGINAL_X.z,y=ORIGINAL_Y), 
             color="blue",alpha=0.08) + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_blank(), axis.line = element_line(colour = "black"))