Crash Data Analysis with Fully Bayes on R

Hi,
I am fairly new to using R. Currently, I am trying to estimate posterior crash counts based on prior crash data using Fully Bayes on R. In my data, I have before and after periods crash counts, and additional information on related independent variables for both the before and after periods. I have previously predicted the prior crash frequency based on random effect negative binomial and a screenshot of the sample code is also attached here. As you will see, the first chunk of the code is working fine within “#ORT Random effects for Full Bayes”. However, the posterior estimation is where I am having trouble implementing.

Please help me with the code to estimate the posterior crash counts. Thank you so much for all your help.

Best
Meghna

I think the last line should be

post <- stan_glm.nb(RENB_Total ~ LN_AADT + offset(Ln_Miles) + TollRoad_After2, 
                    data = data)

but it does not let the intercepts vary. It also works to do

post <- stan_glmer(RENB_Total ~ LN_AADT + offset(Ln_Miles) + TollRoad_After2 + 
                   (1 | OBJECTID), family = neg_binomial2, data = data)
2 Likes

Thank you so much! This was extremely helpful. However, I was a little confused about which data to consider in the second part. I mean, I used the before period crash count (Total_Before), and toll road information (TollRoad_Before2) in the before period to develop the random effect negative binomial model (RENB_Total). Now in the posterior model, should I use that, and toll road information (TollRoad_After2) in the after period? Or should I use the observed crash count in the after period ((Total_After) as the dependent variable in the posterior model?

I would just reformat the data so that there is an indicator variable indicating before vs. after and then give all the data to stan_glmer after adding that indicator to the formula. There is no need to do it in two steps, and that actually makes it more confusing and difficult when you use glmer.nb.

Pefect! Thank you. So, I get the below output.

As the mean of posterior distribution is 11.7, and my sample size is 168, does that mean the total estimated posterior crash count will be 11.7*168? Am I understanding it right?

That is probably close to right, but it is not particularly Bayesian. To get the posterior (predictive) distribution of the total, you should do

PPD <- posterior_predict(post)
PPD_Total <- rowSums(PPD)
summary(PPD_Total)
2 Likes

Understood, thank you very much. This was indeed extremely helpful. I really appreciate it.