Summarize then back-transform vs back-transform then summarize?

Hi all,

When working with posterior draws from a model fitted on a transformed scale (e.g., log or log10), what’s the correct way to summarize predictions on the original scale?

Two common approaches:

  1. Summarize → back‑transform: Compute the median and 95% HDI on the transformed scale, then back‑transform the summary (e.g., using exp() or 10^).
  2. Back‑transform → summarize: Back‑transform all posterior draws first, then compute the median and 95% HDI on the original scale.

These can yield different results when the posterior is skewed. Which approach is generally recommended, especially for reporting estimates and uncertainty intervals? Why?
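To make the difference concrete, here is a minimal sketch of the two approaches. It assumes simulated normal draws as a stand-in for posterior draws on the log scale, and the `hdi` helper is a hypothetical hand-rolled function (shortest interval over sorted draws), not the implementation used by any particular package.

```python
import math
import random


def hdi(draws, mass=0.95):
    """Shortest interval containing `mass` of the draws (assumes a unimodal posterior)."""
    xs = sorted(draws)
    n = len(xs)
    k = math.ceil(mass * n)  # number of draws that must fall inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    widths = [xs[i + k - 1] - xs[i] for i in range(n - k + 1)]
    start = widths.index(min(widths))
    return xs[start], xs[start + k - 1]


random.seed(1)
# Stand-in for posterior draws on the log scale (simulated, not from a fitted model).
log_draws = [random.gauss(1.0, 0.8) for _ in range(4000)]

# Approach 1: summarize on the log scale, then back-transform the endpoints.
lo1, hi1 = hdi(log_draws)
approach1 = (math.exp(lo1), math.exp(hi1))

# Approach 2: back-transform every draw, then summarize on the original scale.
approach2 = hdi([math.exp(d) for d in log_draws])

print("summarize -> back-transform:", approach1)
print("back-transform -> summarize:", approach2)
```

Both intervals cover the same number of draws, but only the second is the shortest such interval on the original scale, so the two sets of endpoints differ.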

In particular, when using emmeans on Bayesian models (with type = "response" or regrid = "response" and epred = TRUE), it appears to always use the second approach. This differs from the frequentist behavior, where type = "response" and regrid = "response" yield different results. Is there a way to obtain the first approach in the Bayesian setting, aside from manually back-transforming the estimates?

See also the emmeans transformation documentation:
https://rvlenth.github.io/emmeans/articles/transformations.html#regrid

Thanks in advance!

I think you almost always want to default to doing the back-transformation as a final step. This also reflects how you model, where you can flexibly model a linear predictor before back-transforming it to the desired scale (e.g., positive, probability, etc.). Paul Bürkner also writes about this in one of his draft chapters of the brms book.

It might be worth checking out the posterior and tidybayes R packages where you can really nicely work with the rvar datatype using fitted objects.

Back‑transform → summarize is the appropriate approach in most cases, since summary statistics are generally not invariant under transformation (see, for instance, Wang et al. 2018).

Wang X, Yue YR, Faraway JJ (2018) Bayesian Regression Modeling with INLA. CRC Press, USA
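As a small illustration of a summary that is not invariant, here is a sketch using the posterior mean. The draws are simulated normal values standing in for log-scale posterior draws (an assumption for illustration, not output from a real model).

```python
import math
import random
import statistics

random.seed(2)
# Stand-in for posterior draws on the log scale (simulated for illustration).
log_draws = [random.gauss(0.0, 1.0) for _ in range(4000)]

# Summarize on the log scale, then back-transform the mean.
mean_then_exp = math.exp(statistics.fmean(log_draws))

# Back-transform every draw, then take the mean on the original scale.
exp_then_mean = statistics.fmean(math.exp(d) for d in log_draws)

print("exp(mean of log draws):", mean_then_exp)
print("mean of exp(draws):    ", exp_then_mean)
```

Because exp() is convex, the mean of the back-transformed draws is never smaller than the back-transformed mean (Jensen's inequality), and the gap grows with the posterior spread on the log scale.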

Chiming in since we have two answers saying opposite things.

If what you want is a summary of the back-transformed output, then you have to back-transform first and summarize last, as @MilaniC says. The universal rule of thumb in analysis of MCMC draws is to summarize at the very end. Every computation that you can perform iteration-wise (i.e. draw-wise) properly preserves and propagates the posterior through the computation. This lets us propagate posterior uncertainty exactly (up to the MCMC error in the draws themselves) and is unique to the MCMC setting, which might be why frequentist software sometimes does something different.
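A minimal sketch of what "draw-wise" means in practice, assuming a log-link model with an intercept and one effect. The names `intercept` and `effect` and their simulated values are hypothetical; in a real fit both would be columns of the same posterior draws, so any correlation between them would carry through automatically.

```python
import math
import random
import statistics

random.seed(3)
n_draws = 4000
# Hypothetical log-scale parameter draws (simulated independently here for simplicity).
intercept = [random.gauss(1.0, 0.3) for _ in range(n_draws)]
effect = [random.gauss(0.5, 0.4) for _ in range(n_draws)]

# Draw-wise: compute the derived quantity once per draw, on the original scale.
diff_draws = [math.exp(a + b) - math.exp(a) for a, b in zip(intercept, effect)]

# Summarize only at the very end.
cuts = statistics.quantiles(diff_draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
summary = {
    "median": statistics.median(diff_draws),
    "lower": cuts[0],    # 2.5% quantile
    "upper": cuts[-1],   # 97.5% quantile
}

# For contrast: plugging posterior means into the same formula yields a single
# number with no uncertainty attached, and it need not match any draw-wise summary.
a_bar = statistics.fmean(intercept)
b_bar = statistics.fmean(effect)
plug_in = math.exp(a_bar + b_bar) - math.exp(a_bar)

print(summary)
print("plug-in value:", plug_in)
```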

Computations that are not performed iteration-wise, but rather are performed on summary statistics, do not achieve this in general. One notable exception is when the transformation is monotonic and the summary is a quantile. Then the transform of the summary is identical to the summary of the transforms, up to variation in how the estimate is placed between the two relevant draws: the median of a back-transformed pair of draws won’t be the same as the back-transformed median of the pair, but as long as there are enough draws that the MCMC error in the estimates is small, this won’t be a practical concern. For example, if you want to give the central 95% interval based on the 2.5% and 97.5% quantiles, it doesn’t matter whether you summarize or back-transform first. But if you want to give the 95% HDI it does matter.
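Here is a quick sketch of the quantile exception, using simulated draws as a stand-in for log-scale posterior draws. The `order_stat_quantile` helper is a hypothetical nearest-rank quantile written for this example; it always returns one of the draws, so no interpolation is involved.

```python
import math
import random
import statistics

random.seed(4)
# Stand-in for posterior draws on the log scale (simulated for illustration).
log_draws = [random.gauss(1.0, 0.8) for _ in range(4000)]
exp_draws = [math.exp(d) for d in log_draws]


def order_stat_quantile(draws, p):
    """Nearest-rank quantile: returns an actual draw, with no interpolation."""
    xs = sorted(draws)
    idx = min(len(xs) - 1, max(0, math.ceil(p * len(xs)) - 1))
    return xs[idx]


probs = (0.025, 0.5, 0.975)
summarize_first = [math.exp(order_stat_quantile(log_draws, p)) for p in probs]
transform_first = [order_stat_quantile(exp_draws, p) for p in probs]
print(summarize_first)
print(transform_first)  # same draws are selected, so the values agree

# With an even number of draws, the usual median averages the two middle draws,
# so the two orders differ slightly (this is the "pair of draws" caveat above).
median_gap = statistics.median(exp_draws) - math.exp(statistics.median(log_draws))
print("gap from averaging the middle pair:", median_gap)
```

The nearest-rank quantiles agree because a monotonic transformation preserves the ordering of the draws, whereas the small median gap comes purely from how the estimate is interpolated between two neighbouring draws.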

What are some examples for when you’d want summaries of back-transformed output? I’d imagine the vast majority of use cases are going to want to make any computations with the posterior draws as @jsocolar nicely described and summarise at the very end. Again, I think it makes sense to think about doing your model in reverse to get to posterior predictions, which has the link functions as a final step before plugging into the distribution.

I think there’s some kind of semantic confusion here. If you have some output on the link scale and want to understand what’s going on on the back-transformed data scale, you back-transform first and then summarize. Always summarize last, after doing whatever (back) transformations you are interested in.
