Convergence fails for (every) truncated Gaussian model

Operating System: Windows 7, brms Version: 2.9.0
We are trying to analyse an effect that develops in traumatized rats over time (saturating around 120 min). The problem is that the effect can grow to be negative (in some rats) or positive (in other rats). While we analyse this direction as a separate (Bernoulli) model, we want to assert the effect’s presence by modelling the absolute value of the effect (the mpa variable). And this is where our problems started:

  1. The Gaussian model mpa ~ Trt*Time converges nicely, but some of the 95% HPD intervals span negative numbers. Sure, one could cover this ugliness with “bar + only upper part of SEM” plots, but I really hope someone here can advise a better way.

  2. The truncated Gaussian model mpa | trunc(lb=0) ~ Trt*Time simply NEVER converges. Tightening max_treedepth=13, adapt_delta=0.99 and iter=1e4 does not help (Rhat >= 8830, Eff.Sample = 2). Below is a reproducible example, and EVERY truncated Gaussian model I faked by playing with the number of rats, effect sizes, etc. suffers from these convergence problems.

  3. I tried to use “inherently positive” distribution families, like LogNormal. Unfortunately, our data are measured by rounding to the nearest 0.5 mm and then averaging over 6 replicates, creating many “strict zeros” and breaking the positive distribution.

This is my fake example:

require("tidyverse")
require("tidybayes")
require("brms")
require("emmeans")
set.seed(1)
nRats <- 20
rats <- data.frame(ratNo=1:nRats, mu=rnorm(nRats)/3, sigma=rlnorm(nRats),
   TrtEffect=rnorm(nRats,mean=1)) # each rat is special
trueParams <- expand.grid(replicateNo=1:6, ratNo=1:nRats, Time=c(0,30,60)) %>%
  mutate(Trt = ifelse(ratNo <= nRats/2, "Sham", "TBI"),
         mu=rats$mu[ratNo]+ifelse(Trt=="Sham", 0, rats$TrtEffect[ratNo]*Time/10), 
         sigma = rats$sigma[ratNo] + 1 + mu/3)
data.measured <- trueParams %>% 
  mutate( pa = rnorm(n(), mean=mu, sd=sigma), # the "real" effect
          mpa = abs(round(pa*2)/2) )  # and its rounded measurement, producing many zeroes
data.recorded <- data.measured %>% ungroup %>%
  mutate(Trt = as.factor(Trt), Time = as.factor(Time), ratNo = as.factor(ratNo) ) %>%
  group_by(Trt, Time, ratNo) %>% summarize(mpa = mean(mpa))

m.mpa <- brm( data = data.recorded, family = gaussian,
   # mpa ~ Trt*Time, # converges, but fails to produce strictly positive 95% HPDI
   mpa | trunc(lb=0) ~ Trt*Time, # convergence fails horribly
   control=list(max_treedepth=13, adapt_delta=0.99),
   seed = 1, chains=3, cores=3, iter=1e4)

Of course, I have many other questions, like (4) how do I set reasonable priors if I know that untreated rats should have an effect <1 mm ( -1 <= pa <= 1), or (5) whether it is worth making Time a continuous predictor, which would force us into non-linear models to account for saturation.

Just a note:

Why not use the original data to estimate the (latent) “true” value and model that?

BTW, ISTR that brms allows the use of distributions with non-negative support such as Rayleigh and Wiener. But can you justify their use?

Alternatively, it should be possible to derive the density of the absolute value of a normal and use that to create a custom family for brms, or use it directly in Stan; the relevant brms vignette is relatively straightforward to follow.
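
To make that concrete, here is a minimal (untested) sketch of such a custom family: the density of |X| for X ~ Normal(mu, sigma) is the folded normal, i.e. the sum of the normal densities at y and at -y. The name folded_normal and the exact call below are only my illustration, not something shipped with brms, and they reuse data.recorded from the reproducible example above.

# Sketch only: folded-normal custom family for brms (the name and this call
# are illustrative, not part of brms); builds on the example above, which
# loads brms and defines data.recorded
folded_normal <- custom_family(
  "folded_normal",
  dpars = c("mu", "sigma"),
  links = c("identity", "log"),
  lb = c(NA, 0),   # sigma must stay positive
  type = "real"
)
stan_funs <- "
  real folded_normal_lpdf(real y, real mu, real sigma) {
    return log_sum_exp(normal_lpdf(y | mu, sigma),
                       normal_lpdf(y | -mu, sigma));
  }
"
m.folded <- brm(
  mpa ~ Trt*Time, data = data.recorded,
  family = folded_normal,
  stanvars = stanvar(scode = stan_funs, block = "functions"),
  chains = 3, cores = 3, seed = 1
)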

Your problem is likely that some observations are exactly at the truncation boundary (i.e., zero), so your truncated normal model with lb = 0 is not actually a proper solution, I think.

I would indeed also suggest not using the absolute values but the original ones with the signs retained. If you insist on using the absolute values, you may try out the zero_inflated_lognormal family to handle the zeros in some way.

Dear Paul, dear Emmanuel, thank you for the speedy replies.

I installed the latest brms (version 2.9.6) from GitHub and tried the distributions you suggested:

  1. family = “zero_inflated_lognormal” (suggested by Paul), family = “raleygh”, and family = “rayleigh” all give me the same error: “Error: zero_inflated_lognormal (or raleygh, or rayleigh) is not a supported family. Supported families are:…” Funny that https://github.com/paul-buerkner/brms/blob/master/R/priors.R has the keyword “rayleigh”, but family = “rayleigh” is still unrecognized.

  2. family = “wiener” gives a different error: “Error: Addition argument ‘dec’ is required for family ‘wiener’.” Funny that the brmsformula help does not mention “dec” in any way, saying wiener(link = “identity”, link_bs = “log”, link_ndt = “log”, link_bias = “logit”).

I am somewhat horrified by the prospect of messing with custom distributions to define my very own “zero_inflated_normal” or “zero_inflated_lognormal”, but I will try it tomorrow.

I would love to model the latent “pa” if I could, but the multi-model and non-linear vignettes are hard to follow once I step away from their examples. I tried both brms 2.9.0 and 2.9.6 and got this:

try1 <- brm(
  bf( mpa ~ fabs(round(paLat*2)/2), nl=TRUE ) +
    lf( paLat ~ Trt*Time ) + gaussian(),
  data = data.recorded,
  prior = c(prior(normal(0, 3), nlpar="paLat")),
  # control=list(max_treedepth=13, adapt_delta=0.999),
  seed = 1, chains=3, cores=3, iter=4e4)

I am NOT sure what I am doing here and will appreciate suggestions/help with the following:

  1. With iter=4e4 the model diverged, with Rhat <= 1.64 and Bulk_ESS from 5 to 377. summary(try1) advises “We recommend running more iterations and/or setting stronger priors.” Funny that uncommenting control=… makes the divergence worse (Rhat <= 13.30).

  2. Our subject knowledge boils down to this: for untreated rats pa is usually between -1 and 1, while after treatment we saw pa reach between -4 and 4. Could I set anything stronger than normal(0,1) to follow the advice of summary(try1)? Also: normal(0,1) makes the divergence slightly worse (Rhat <= 1.67 and lower Bulk_ESS).

  3. What happened to the Intercept in the linear part of the model? Why does brm() complain “Error: The following priors do not correspond to any model parameter: Intercept_paLat ~ normal(0, 3)” for prior = c(prior(normal(0, 3), class = “Intercept”), prior(normal(0, 1), class = “b”)), while get_prior() clearly shows “Intercept” on line 3 (see the sketch after this list for my current guess):

                    prior class          coef group resp dpar nlpar bound
    1 student_t(3, 0, 10) sigma                                          
    2                         b                               paLat      
    3                         b     Intercept                 paLat      
    4                         b        Time30                 paLat 
    ... lines showing Time60, TrtTBI, TrtTBI:Time30, TrtTBI:Time60 as expected
    
  4. How do I inform brm() that mpa is the result of averaging over 6 replicates? Should I aggravate convergence issues by defining 6 similar models paLat1, …, paLat6 and then setting mpa ~ ( fabs(round(paLat1 * 2)/2) + … + fabs(round(paLat6 * 2)/2) )/6 ?
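
For reference, my current (untested) guess for question 3, reading the get_prior() output literally: in a non-linear model the intercept of paLat seems to be an ordinary class “b” coefficient named “Intercept” inside nlpar = “paLat”, so perhaps it has to be addressed like this:

# Untested guess based on the get_prior() table above: address the intercept
# as a class "b" coefficient within nlpar = "paLat", not as class "Intercept"
priors.guess <- c(
  prior(normal(0, 3), class = "b", coef = "Intercept", nlpar = "paLat"),
  prior(normal(0, 1), class = "b", nlpar = "paLat")
)
# then pass prior = priors.guess to the brm() call above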

Sorry, it should be hurdle_lognormal not zero_inflated_lognormal.
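
A minimal sketch of what I mean, reusing data.recorded from your example (untested on your data); hu is the hurdle part, i.e. the probability of an exact zero:

# hurdle_lognormal: a lognormal for the positive mpa values plus a separate
# probability (hu) of observing an exact zero
m.hurdle <- brm(
  bf(mpa ~ Trt*Time,  # location of the lognormal part
     hu ~ Trt*Time),  # probability of a zero measurement
  data = data.recorded, family = hurdle_lognormal(),
  chains = 3, cores = 3, seed = 1
)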

The rayleigh distribution does indeed not exist in brms, and I think Emmanuel meant it more as a general comment than one specific to brms.

This strongly suggests computing some estimate of this variability and modelling that (possibly also modelling some estimate of the central tendency in a multivariate model: the vignette strongly hints at how to do this…). If I follow you, this variability is the essential feature of your results (it might be the case that your treatment doesn’t change the mean of mpa, but increases its variability).
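
For example, something along these lines (a sketch only, using the simulated per-replicate data from your example; the summary column names mpa_mean and mpa_sd are made up here):

# Sketch: collapse each rat x time cell to a mean and an SD over the 6
# replicates, then model both summaries jointly in a multivariate brms model
summ <- data.measured %>%
  group_by(Trt, Time, ratNo) %>%
  summarize(mpa_mean = mean(mpa), mpa_sd = sd(mpa)) %>%
  ungroup()
m.mv <- brm(
  bf(mpa_mean ~ Trt*Time) + bf(mpa_sd ~ Trt*Time) + set_rescor(TRUE),
  data = summ, chains = 3, cores = 3, seed = 1
)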

Another way would be to explicitly model the variability of your original data, in what Paul Bürkner calls a distributional model. Again, the vignette is clear enough…
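
For instance (again only an illustration, fitted on the signed per-replicate values pa from your simulation; I drop the true-parameter columns first so the data column sigma cannot be confused with the model’s sigma):

# Distributional model: the residual SD gets its own linear predictor, so the
# treatment can change the variability of the signed effect, not only its mean
d.signed <- data.measured %>% select(-mu, -sigma)
m.distr <- brm(
  bf(pa ~ Trt*Time + (1 | ratNo),
     sigma ~ Trt*Time),
  data = d.signed, family = gaussian(),
  chains = 3, cores = 3, seed = 1
)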

Indeed, I was thinking in rstan terms… although a Rayleigh distribution might be implemented in a custom family.