Convergence fails for (every) truncated gaussian model

Operating System: Windows 7, brms Version: 2.9.0
We try to analyse an effect developing in traumatized rats over time (saturating around 120 min). The problem is that effect could grow to be negative (in some rats) and positive (in other rats). While we analyse this direction as a separate (bernoulli) model, we want to assert the effect’s presence by modelling the absolute value of the effect (mpa variable). And this is where our problems started:

  1. Gaussian model mpa ~ Trt*Time converges nicely, but some of 95% HPD intervals span negative numbers. Sure one could cover this ugliness with “bar + only upper part of SEM” plots, but I really hope someone here could advise a better way.

  2. Truncated gaussian model mpa | trunc(lb=0) ~ Trt*Time simply NEVER converges. Tightening max_treedepth=13, adapt_delta=0.99 and iter=1e4 do not help (Rhat >= 8830, Eff.Sample = 2). Below is the reproducible example and EVERY truncated gaussian model I faked by playing with number of rats, effect sizes, etc. suffer from these convergence problems.

  3. I tried to use “inherently positive” distribution families, like LogNormal. Unfortunately our data is measured by rounding to the nearest 0.5mm and then averaged over 6 replicates, creating many “strict zeros” and breaking the positive distribution.

This is my fake example:

nRats <- 20
rats <- data.frame(ratNo=1:nRats, mu=rnorm(nRats)/3, sigma=rlnorm(nRats),
   TrtEffect=rnorm(nRats,mean=1)) # each rat is special
trueParams <- expand.grid(replicateNo=1:6, ratNo=1:nRats, Time=c(0,30,60)) %>%
  mutate(Trt = ifelse(ratNo <= nRats/2, "Sham", "TBI"),
         mu=rats$mu[ratNo]+ifelse(Trt=="Sham", 0, rats$TrtEffect[ratNo]*Time/10), 
         sigma = rats$sigma[ratNo] + 1 + mu/3)
data.measured <- trueParams %>% 
  mutate( pa = rnorm(n(), mean=mu, sd=sigma), # the "real" effect
          mpa = abs(round(pa*2)/2) )  # and its rounded measurement, producing many zeroes
data.recorded <- data.measured %>% ungroup %>%
  mutate(Trt = as.factor(Trt), Time = as.factor(Time), ratNo = as.factor(ratNo) ) %>%
  group_by(Trt, Time, ratNo) %>% summarize(mpa = mean(mpa))

m.mpa <- brm( data = data.recorded, family = gaussian,
   # mpa ~ Trt*Time, # converges, but fails to produce strictly positive 95% HPDI
   mpa | trunc(lb=0) ~ Trt*Time, # convergence fails horribly
   control=list(max_treedepth=13, adapt_delta=0.99),
   seed = 1, chains=3, cores=3, iter=1e4)

Of course, I have many other questions, like (4) how do I set reasonable priors if I know that untreated rats should have effect <1mm ( -1 <= pa <= 1), or (5) if it is worth to make Time a continuous factor, which will force us into non-linear models to account for saturation.

Just a note:

Why not use original data to model the (latent) “true” value and model that ?

BTW, ISTR that brms allows the use of distributions with non-negative support such as Raleygh and Wiener. Biut can you justify their use ?

Alternatively, it should be possible to derive the density of the absolute value of a normal, and use that to create a custom family for brms or use it doirectly in stan ; the relevant brms vignette is relatively straightforward to follow.

Your problem is likely that some observations are exactly at the truncation boundary (i.e. zero).
so your truncated normal model with lb = 0 is not actually a proper solution I think.

I would indeed also suggest not to use the absolute values but the original ones with the signs still existing. If you insist on using the absolute values, you may try out the zero_inflated_lognormal family to handle the zeros in some way.

Dear Paul, Dear Emmanuel, thank you for the speedy reply.

I installed the latest brms (version 2.9.6) from GitHub and tried your distribution advices:

  1. family = “zero_inflated_lognormal” (suggested by Paul), family = “raleygh” and family=“rayleigh” all give me the same error: “Error: zero_inflated_lognormal (or raleygh, or rayleigh) is not a supported family. Supported families are:…” Funny that has the keyword “rayleigh”, but the family=“rayleigh” is still unrecognized.

  2. family=“wiener” gives different error “Error: Addition argument ‘dec’ is required for family ‘wiener’.” Funny that brmsformula help do not mention “dec” in any way, saying wiener(link = “identity”, link_bs = “log”, link_ndt = “log”, link_bias = “logit”).

I am somewhat horrified to mess with custom distributions to define my very own “zero_inflated_normal” or “zero_inflated_lognormal”, but will try it tomorrow.

I would love to model the latent “pa” if I could, but multimodel + non-linear vignettes are hard to follow once I step away from their examples. I tried both brms 2.9.0 and 2.9.6 and got this:

try1 <- brm(
  bf( mpa ~ fabs(round(paLat*2)/2), nl=TRUE ) +
    lf( paLat ~ Trt*Time ) + gaussian(),
  data = data.recorded,
  prior = c(prior(normal(0, 3), nlpar="paLat")),
  # control=list(max_treedepth=13, adapt_delta=0.999),
  seed = 1, chains=3, cores=3, iter=4e4)

I am NOT sure what I am doing here and will appreciate suggestions/help with the following:

  1. With iter=4e4 model diverged with Rhat <= 1.64 and Bulk_ESS from 5 to 377. summary(try1) advises “We recommend running more iterations and/or setting stronger priors.” Funny that uncommenting control=… makes the divergence worse (Rhat <= 13.30).

  2. Our subject knowledge boils down to this: for untreated rats pa is usually between -1 and 1, while after treatment we saw pa to get to between -4 and 4. Could I set anything stronger that normal(0,1) to follow the advice of summary(try1) ? Also: normal(0,1) makes the divergence slightly worse (Rhat <= 1.67 and lower Bulk_ESS).

  3. What happened to Intercept in the linear part of the model? Why brm() complains “Error: The following priors do not correspond to any model parameter: Intercept_paLat ~ normal(0, 3)” for prior = c(prior(normal(0, 3), class = “Intercept”), prior(normal(0, 1), class = “b”)) ,while get_prior() clearly shows “Intercept” on line 3:

                    prior class          coef group resp dpar nlpar bound
    1 student_t(3, 0, 10) sigma                                          
    2                         b                               paLat      
    3                         b     Intercept                 paLat      
    4                         b        Time30                 paLat 
    ... lines showing Time60, TrtTBI, TrtTBI:Time30, TrtTBI:Time60 as expected
  4. How do I inform brm() that mpa is the result of averaging over 6 replicates? Should I aggravate convergence issues by defining 6 similar models paLat1, …, paLat6 and then setting mpa ~ ( fabs(round(paLat1 * 2)/2) + … + fabs(round(paLat6 * 2)/2) )/6 ?

Sorry, it should be hurdle_lognormal not zero_inflated_lognormal.

The rayleigh distribution does indeed not exist in brms and I think Emmanual ment it more like a general comment than a one specific for brms.

This strongly suggests to compute some estimate of this variability, and model that (possibly also modeling some estimate of the central tendency in a multivariate model: the vignette strongly hints at how to do this…). If I follow you, this variability is the essential feature of your results (it might be the case that your treatment doesn’t change the mean of mpa, but increases its variability).

Another way would be to explicitly model the variability of your original data, in what Paul Bueckner calls a distributional model. Again, the vignette is clear enough…

Indeed, I was thinking in rstan terms… although a Rayleigh distribution might be implemented in acustomfamily.