Rejection statement from power operations in non-linear models (brms and JAGS)

Hello,

We are currently working on developing a package that provides a series of non-linear models in brms.

The input data for many of these models can vary in nature, such that we are allowing the user to specify the family distribution which gets passed to the brm call. Some models in particular are provided to model logistic-type decays on responses > 0.

And one of those particular models is giving us some grief because (I think) of a power operation. The data and brms formula are as follows:

library(brms)
library(tidyverse)
options(mc.cores = parallel::detectCores())

data <- "x,y
-2.30258509,0.99900000
-2.30258509,0.98080168
-2.30258509,0.94871489
-0.08773891,0.95349709
-0.08773891,0.93521762
1.01087337,0.79340627
1.01087337,0.76913433
2.21484618,0.33441376
2.21484618,0.33940514
3.31345847,0.06086657
3.31345847,0.02369004
4.51743127,0.01562192
4.51743127,0.00010000
5.61604356,0.00010000
5.61604356,0.00010000
6.82001636,0.00010000
6.82001636,0.00010000
-2.30258509,0.98306964
-2.30258509,0.95910799
-2.30258509,0.98633894
-1.29171172,0.92973615
-1.29171172,0.96875411
-0.59856454,0.95826182
-0.59856454,0.98481198
0.60540827,0.90160660
0.60540827,0.81905617
1.29855545,0.61950167
1.29855545,0.56095059
1.70402055,0.48796345
1.70402055,0.45580739
2.39716774,0.19080767
2.39716774,0.13426336
2.90799336,0.03420403
2.90799336,0.00010000" %>%
  read.csv(text = .)

brms_bf <- brms::bf(y ~ p1 * exp(-p2 * (x - p3)^p4 *
                      step(x - p3)),
                    p1 + p2 + p3 + p4 ~ 1, nl = TRUE)

p1 is the top part of the response, p2 drives the exponential decay past p3 (an x-axis based parameter), and p4 is a simple scaling exponent.

The data is fitted with a brms::Beta distribution using an identity link. So to help constrain the values in y, I set the following priors and initial values—the initial values were automatically obtained from the respective prior distributions:

priors <- brms::prior_string("beta(2, 1)", nlpar = "p1") +
          brms::prior_string("gamma(2, 0.1)", nlpar = "p2") +
          brms::prior_string("normal(1.3, 2.7)", nlpar = "p3") +
          brms::prior_string("normal(1, 10)", nlpar = "p4")

inits <- list(
  list(p1 = 0.5984, p2 = 15.2305, p3 = -2.3984, p4 = -4.9917),
  list(p1 = 0.7891, p2 = 18.8243, p3 = -1.9583, p4 = -2.6368),
  list(p1 = 0.8281, p2 = 22.9021, p3 = -0.6091, p4 = -2.6945),
  list(p1 = 0.5942, p2 = 3.1782, p3 = 3.7938, p4 = 5.8298)
)
fit <- brms:::brm(formula = brms_bf, data = data,
                  family = Beta(link = "identity"),
                  prior = priors, inits = inits)

It throws many initialisation rejection statements for all four chains until it gives up. The two types of error are (example from one chain):

Chain 2: Rejecting initial value:
Chain 2:   Error evaluating the log probability at the initial value.
Chain 2: Exception: beta_lpdf: First shape parameter[1] is nan, but must be > 0!  (in 'model3b187e8d16c5_3418b0f1329ef002320669e7285b7dae' at line 53)

Chain 2: Rejecting initial value:
Chain 2:   Error evaluating the log probability at the initial value.
Chain 2: Exception: beta_lpdf: Random variable[1] is 1.6553, but must be less than or equal to 1  (in 'model3b187e8d16c5_3418b0f1329ef002320669e7285b7dae' at line 46)

So the first rejection statement I think is related to having a negative base of floating numbers with an exponent in (x - p3)^p4 whereas the second is an issue of producing a value outside of the boundaries of the Beta distribution. I can easily live with the second because finding the right initial values will do the trick. However the second will always happen because x is continuous and so the operation x - p3 will always yield some negative base.

Is there a known trick that I could use to avoid this issue? Perhaps some re-parametrisation?

As a side note, just out of curiosity, I also tried fitting it in JAGS, and it seems to eventually work, please see code below:

library(R2jags)

jags_data <- list(x = data$x, y = data$y, N = nrow(data))

sink("jags_model.bug")
cat("
  model {
    for (i in 1:N) {
      y[i] ~ dbeta(shape1[i], shape2[i])
      shape1[i] <- theta[i] * phi
      shape2[i] <- (1 - theta[i]) * phi
      theta[i] <- p1 * exp(-p2 * (x[i] - p3) ^ p4 *
        step(x[i] - p3))
    }

    p1 ~ dunif(0.001, 0.999)
    p2 ~ dgamma(0.0001, 0.0001)
    p3 ~ dnorm(0, 0.0001)
    p4 ~ dnorm(1, 0.0001)
    phi ~ dgamma(0.0001, 0.0001)
  }", fill = TRUE)
sink()

jags_fit <- R2jags::jags(
  data = jags_data,
  parameters = c("p1", "p2", "p3",
                 "p4", "phi"),
  model = "jags_model.bug", n.iter = 5e4) %>%
  R2jags::autojags(n.iter = 5e4)

jags_fit

Inference for Bugs model at "jags_model.bug", fit using jags,
 3 chains, each with 50000 iterations (first 0 discarded)
 n.sims = 150000 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat  n.eff
p1          0.933   0.019    0.890    0.923    0.936    0.946    0.963 1.001   9000
p2          1.017   0.296    0.449    0.802    1.032    1.224    1.575 1.004    790
p3          0.724   0.276    0.102    0.539    0.801    0.913    1.167 1.004    820
p4          0.972   0.202    0.625    0.832    0.951    1.094    1.414 1.003    980
phi        12.013   3.862    5.702    9.231   11.600   14.315   20.765 1.001 150000
deviance -144.176   3.714 -149.478 -146.864 -144.814 -142.198 -135.101 1.002   3300

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.9 and DIC = -137.3
DIC is an estimate of expected predictive error (lower deviance is better).

The effective sample size aren’t great, but at least the chains seemed well mixed based on R2jags::traceplot(jags_fit).

Thanks for your help!

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] R2jags_0.6-1    rjags_4-10      coda_0.19-3     forcats_0.5.0   stringr_1.4.0   dplyr_1.0.0     purrr_0.3.4     readr_1.3.1     tidyr_1.1.0     tibble_3.0.3    ggplot2_3.3.2   tidyverse_1.3.0
[13] brms_2.13.0     Rcpp_1.0.5     

loaded via a namespace (and not attached):
 [1] nlme_3.1-148            fs_1.4.1                matrixStats_0.56.0      xts_0.12-0              lubridate_1.7.9         threejs_0.3.3           httr_1.4.2              rstan_2.19.3           
 [9] tools_4.0.1             backports_1.1.8         R2WinBUGS_2.1-21        R6_2.4.1                DT_0.14                 DBI_1.1.0               colorspace_1.4-1        withr_2.2.0            
[17] tidyselect_1.1.0        gridExtra_2.3           prettyunits_1.1.1       processx_3.4.3          Brobdingnag_1.2-6       emmeans_1.4.8           compiler_4.0.1          rvest_0.3.5            
[25] cli_2.0.2               xml2_1.3.2              shinyjs_1.1             colourpicker_1.0        scales_1.1.1            dygraphs_1.1.1.6        mvtnorm_1.1-1           ggridges_0.5.2         
[33] callr_3.4.3             digest_0.6.25           StanHeaders_2.21.0-5    base64enc_0.1-3         pkgconfig_2.0.3         htmltools_0.5.0         dbplyr_1.4.4            fastmap_1.0.1          
[41] readxl_1.3.1            htmlwidgets_1.5.1       rlang_0.4.7             rstudioapi_0.11         shiny_1.5.0             generics_0.0.2          zoo_1.8-8               jsonlite_1.7.0         
[49] crosstalk_1.1.0.1       gtools_3.8.2            inline_0.3.15           magrittr_1.5            loo_2.2.0               bayesplot_1.7.2         Matrix_1.2-18           munsell_0.5.0          
[57] fansi_0.4.1             abind_1.4-5             lifecycle_0.2.0         stringi_1.4.6           pkgbuild_1.0.8          plyr_1.8.6              grid_4.0.1              blob_1.2.1             
[65] parallel_4.0.1          promises_1.1.1          crayon_1.3.4            miniUI_0.1.1.1          lattice_0.20-41         haven_2.3.1             hms_0.5.3               ps_1.3.3               
[73] pillar_1.4.6            igraph_1.2.5            boot_1.3-25             markdown_1.1            estimability_1.3        shinystan_2.5.0         codetools_0.2-16        reshape2_1.4.4         
[81] stats4_4.0.1            rstantools_2.1.0        reprex_0.3.0            glue_1.4.1              RcppParallel_5.0.2-9000 modelr_0.1.8            vctrs_0.3.1             httpuv_1.5.4           
[89] cellranger_1.1.0        gtable_0.3.0            assertthat_0.2.1        mime_0.9                xtable_1.8-4            broom_0.5.6             later_1.1.0.1           rsconnect_0.8.16       
[97] shinythemes_1.1.2       ellipsis_0.3.1          bridgesampling_1.0-0   
1 Like

Briefely, I think you should ensure that p4 is always positive (for example by using exp(p4) instead of p4) and then constrain your model with an additional function (such as inv_logit) to make sure predictions are always within [0, 1] for beta, or of course build the non-linear model straight away in a form that the [0, 1] are ensured.

1 Like

Hi Paul,

Thanks very much for your feedback. The exp(p4) suggestion has been incorporated already :-)

We deliberately did not code the equation to obey the [0,1] constraint because other data types (e.g. gaussian -Inf < x < Inf) might also be used depending on the research question.

As for the negative base power, I came up with a trick last night that solved the issue:

brms::bf(y ~ p1 * exp(-p2 * (step(x - p3) *
                    (x - p3))^exp(p4) * step(x - p3)),
                  p1 + p2 + p3 + p4 ~ 1, nl = TRUE)

Because the second step function would take the exponential to 0 anyway for the cases when x < p3, I just bypassed having to deal with a negative base by including a second step function.

I wonder though if this is a known issue in stan. For instance, a hypothetical illustration of what I mean in R:

x <- -0.6306855
p3 <- -0.6091697
p4 <- exp(0.9)
(x - p3) ^p4
[1] NaN

# but x - p3 = -0.0215158, so
-0.0215158^p4
-7.929488e-05

Which I think is related floating-point numbers:

(x - p3) == -0.0215158
[1] FALSE

# but
all.equal((x - p3), -0.0215158)
[1] TRUE

Is there a way to stabilise these operations in stan and brms::bf? Perhaps there is someone from the development team we could flag here?

Thanks again!

You cannot exponentiate negative numbers if the exponent is non-integer. Hence the NaN result. Your latter computation works accidentally because exponentiatiin is computed before -. That is you have actually done -(y^z) not not (-y)^z.

So perhaps having negafive p4 might be ok in your case (if it is in accordance with theory). But negatice (x - p3) is not.

1 Like

Oh I see! Thanks for the clarification Paul!

Would you have any idea as to how JAGS is coping or bypassing that operation though?

Unfortunately not.

No worries, thanks anyway. I’ll ask on the JAGS forum and will post any answers here for closure on the topic.
Thanks again for your time and for developing this amazing package!