Interpreting posterior predictive checks on a longitudinal beta-binomial model of amphetamine use

I am applying the workflow laid out in Aki Vehtari’s case-study here: https://users.aalto.fi/~ave/casestudies/Nabiximols/nabiximols.html

The form of the outcome variable is the same as in the case-study - a bounded count indicating days of substance use in the previous 28 days - but the sample is different (people receiving opioid agonist treatment, i.e. methadone or buprenorphine, at a public clinic) and the substance is different (amphetamines instead of cannabis). I am estimating the effect of frequency of amphetamine use at the start of treatment - a time-invariant bounded count of days of amphetamine use in the previous 28 days - on frequency of amphetamine use during treatment, using a hierarchical model. Clients were monitored for a year and measured irregularly (these are routinely collected data, i.e. not from a clinical trial), so they could have anywhere between 1 and 11 follow-up measurements of past-28-day substance use, in addition to the measurement at the start of treatment (baseline).

The data are quite different from those in the case-study. The participants in the Sativex trial were all using cannabis daily or near-daily, whereas the clients I am examining, being mostly people with opioid dependence issues, mostly use very few amphetamines. Also, the time measure in this study is continuous, whereas it was discrete in the case-study data. The scatter plot looks like this. Dots are coloured based on days of use at baseline and, for intuition only, I include loess lines based on groupings of frequency of use at baseline.

Following the model-comparison approach in the case-study, I ended up with a beta-binomial model. I include the brms model syntax below. outcome is the outcome variable, amphetamine use in the previous 28 days; set is the number of trials, i.e. the possible number of days (always 28); ats_baseline is days of use at baseline (a time-invariant predictor); and yearsFromStart is how many years from the start of treatment each measurement was made (a continuous predictor between 0 and 1).

fit_betabinomial2 <- brm(formula = outcome | trials(set) ~ ats_baseline*yearsFromStart + (1 | encID),
                         data = workingDF,
                         family = beta_binomial(),
                         prior = c(prior(normal(0, 3), class = Intercept),
                                   prior(normal(0, 3), class = b),
                                   prior(cauchy(0, 3), class = sd)),
                         save_pars = save_pars(all = TRUE),
                         seed = 1234,
                         refresh = 0,
                         cores = 4)

I cannot share the dataset here because it is protected and quite large, but I would like some advice about the posterior predictive checks. This is the histogram of posterior predictive replicates. It is pretty good: it captures the probability mass at the lower end of the scale, around 0 days of use, but it seems a bit deficient at the upper end, around 28 days.
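For reference, a minimal sketch of how this kind of check can be produced with brms (assuming the fitted object above; the exact calls in the case-study may differ):

# Posterior predictive checks on the count scale.
# "hist" draws histograms of individual replicate datasets next to the data;
# "bars" overlays predictive counts on the observed distribution, which
# suits a bounded count like days of use out of 28.
pp_check(fit_betabinomial2, type = "hist", ndraws = 11)
pp_check(fit_betabinomial2, type = "bars", ndraws = 1000)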

And this is the PIT-ECDF calibration plot.

Once again not bad, but it strays outside the envelope in the upper region.
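For reference, a sketch of how a PIT-ECDF plot with a simultaneous confidence envelope can be drawn with bayesplot (a sketch only; the case-study uses the LOO-PIT variant for a discrete outcome, which needs a bit more care):

library(bayesplot)
y    <- workingDF$outcome
yrep <- posterior_predict(fit_betabinomial2)
ppc_pit_ecdf(y, yrep, prob = 0.95)                    # PIT-ECDF with envelope
ppc_pit_ecdf(y, yrep, prob = 0.95, plot_diff = TRUE)  # ECDF-difference version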

Finally, a calibration check with a reliability diagram (once again copied from Aki Vehtari’s workflow) for 0 days of use vs. the rest.
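A sketch of how such a diagram can be computed with the reliabilitydiag package used in the case-study (here for the predicted probability of any use, i.e. non-zero days; the case-study’s exact code may differ):

library(ggplot2)
library(reliabilitydiag)
yrep <- posterior_predict(fit_betabinomial2)
p_nonzero   <- colMeans(yrep > 0)                  # predicted P(outcome > 0)
obs_nonzero <- as.numeric(workingDF$outcome > 0)   # observed indicator
autoplot(reliabilitydiag(p = p_nonzero, y = obs_nonzero))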

I could use some help interpreting these graphs (especially the final one), diagnosing problems, and with suggestions for how to improve estimation (a negative-binomial model, for instance?).

Regarding the posterior predictive checks, I agree with your interpretation of the histograms. The predicted counts at the higher end of the distribution look a bit low.

The “bump” at the top right of the LOO-PIT, and the slight U-shape (more visible if you plot the ECDF-difference version of the plot), indicate that the posterior is a bit light in the centre (the U-shape), but also that it doesn’t quite predict high enough values for a considerable number of cases (the bump). The latter could plausibly be the higher number of high counts we see in the observations but not in the predictive draws.

A piece-by-piece interpretation of the reliability diagram:

  1. A slight dip under the bounds at the bottom left:
    • Slight underconfidence in some zero cases.
  2. Above the bounds in the middle:
    • The predicted probability of non-zero use is too low for these cases.
  3. A long, almost vertical part, then back below the bounds at the end:
    • I don’t know how many observations there are in this region; the histogram bars look quite low. The vertical parts indicate that even though the predicted non-zero probability is increasing, the observations don’t support that.

I hope this helps with the interpretations.


In your workflow, did you try a zero-inflated beta-binomial model?
The Roaches case study might give ideas for model selection:
https://users.aalto.fi/~ave/modelselection/roaches.html


Thank you for your response @TeemuSailynoja. Yes, I have considered the negative binomial. I am concerned that I may not be able to compare the negative binomial to the beta-binomial using loo or k-fold CV, since (I assume) the outcome data take different forms.

I meant using zero_inflated_beta_binomial() instead of beta_binomial(). You can use loo for the negative binomial but, as you say, it doesn’t really make sense as the observation model.


Oh right, apologies. Is that a separate family in brms?

Yes. In short, a zero-inflated version of the beta-binomial distribution would estimate an additional probability for the zero counts, thus giving some additional freedom for the other parameters to explain the tail of the distribution.
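To make that concrete, writing the zero-inflation probability as $\pi$ (zi in brms), the zero-inflated beta-binomial is the mixture

$$p(y = 0) = \pi + (1 - \pi)\,\text{BetaBinomial}(0 \mid n, \mu, \phi), \qquad p(y = k) = (1 - \pi)\,\text{BetaBinomial}(k \mid n, \mu, \phi), \quad k = 1, \dots, n,$$

so the excess zeros are absorbed by $\pi$ while the beta-binomial component is free to place more mass near 28.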

I believe you could try it by first just changing the family from beta_binomial() to zero_inflated_beta_binomial(). Then you can add a prior on the zero inflation, and most likely you would like to use the covariate to predict the zero-inflation probability separately for each observation. This can be done by specifying a separate formula for inferring the zero-inflation parameter zi:

brm(
  formula = bf(outcome | trials(set) ~ ..., zi ~ ...),
  data = ...)

Paul Bürkner has a vignette on this: Estimating Distributional Models with brms. If you look at fit_zinb2 under the Zero-Inflated Models section, you will find an example.
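For instance, a sketch building on the model you posted (the zi formula and the priors on the zero-inflation part are placeholders to adapt, not a recommendation):

fit_zibb <- brm(formula = bf(outcome | trials(set) ~ ats_baseline*yearsFromStart + (1 | encID),
                             zi ~ ats_baseline),
                data = workingDF,
                family = zero_inflated_beta_binomial(),
                prior = c(prior(normal(0, 3), class = Intercept),
                          prior(normal(0, 3), class = b),
                          prior(cauchy(0, 3), class = sd),
                          # priors for the zero-inflation part (logit scale)
                          prior(normal(0, 3), class = Intercept, dpar = "zi"),
                          prior(normal(0, 3), class = b, dpar = "zi")),
                save_pars = save_pars(all = TRUE),
                seed = 1234,
                cores = 4)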


@TeemuSailynoja I have noticed people, including yourself, using the term ‘observation model’. What does it mean? And why do you say the negative binomial doesn’t make sense?

The observation model (or data model) is a model for the observations, $p(y \mid x, \theta)$. The whole model also includes the (hierarchical) prior terms $p(\theta \mid \phi)\,p(\phi)$.

Thank you @avehtari. So what, then, is the alternative to an observation model? What other types of model are there?

Priors are models for parameters and latent functions. “Observation model” is a useful term for distinguishing that part from the complete model, which is a combination of the observation model and the models for parameters and latent functions (and other similar non-observed quantities).
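For the model in this thread, for example, the complete model factors as

$$p(y, \theta, \phi \mid x) = \underbrace{\prod_i p(y_i \mid x_i, \theta)}_{\text{observation model}} \, \underbrace{p(\theta \mid \phi)\,p(\phi)}_{\text{priors}},$$

where the observation model is the beta-binomial likelihood for the counts and the prior terms include the hierarchical prior on the client-specific intercepts.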

What I meant by saying that the negative binomial “doesn’t make sense” was just what I thought you meant when you said that the “outcome data takes different forms”. That is, my assumption would be that, as the negative binomial is not limited to the 0-28 interval, it would be surprising if it gave better predictive performance.
