Setting smoother (sds) priors in brms gives errors in stan code (too many indices)

Dear all,

First off, I would like to thank Paul Buerkner for his excellent work on BRMS, it really makes my journey back into Bayes a lot easier. And of course the fine folks that are building and maintaining STAN!

I am trying to fit a simple non-linear equation of the following form

y = a0 + b2 * step(x - Omega) + s(z)

Which is evidently non-linear and contains a smoother term. Now, I wish to set the sds prior specifically on s(z), but I get an error that says:

Compiling Stan program...
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Too many indexes, expression dimensions=0, indexes found=1
 error in 'model31c85ce57963_54247924b736c726540d123ba2c2f7b4' at line 69, column 37
  -------------------------------------------------
    67:     - 1 * log_diff_exp(uniform_lcdf(1 | 0, 1), uniform_lcdf(0 | 0, 1));
    68:   target += normal_lpdf(b_a0[1] | 0.5, 0.1);
    69:   target += normal_lpdf(sds_a1_1_1[1] | 1, 5)
                                            ^
    70:     - 1 * normal_lccdf(0 | 1, 5);
  -------------------------------------------------

Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model '54247924b736c726540d123ba2c2f7b4' due to the above error.

And a warning message:

In addition: Warning message:
The global prior 'student_t(3, 0, 2.5)' of class 'sds' will not be used in the model as all related coefficients have individual priors already. If you did not set those priors yourself, then maybe brms has assigned default priors. See ?set_prior and ?get_prior for more details. 

The minimal-ish code to reproduce this is:

library(tidyverse)
library(brms)

x <- 1:100 / 100
z <- rnorm(100)
y <- 0.5 + (x > 0.5) * 2 + exp(z)


dat <- tibble(x = x, y = y + rnorm(length(y), 0, 0.2), z = z)

bform <- bf(
  y ~ a0 + b2 * step(x - omega) + a1,
  b2 + omega ~ 1,
  a0 ~ 1,
  a1 ~ 0 + s(z),
  nl = T
)

bprior <- prior(uniform(0, 1), lb = 0, ub = 1, nlpar = "omega") +
  prior(normal(2,0.1), nlpar = "b2") +
  prior(normal(0.5,0.1), nlpar = "a0", coef = "Intercept") + 
  prior(constant(1), nlpar = "a1", coef = "sz_1") + 
  prior(normal(1,5), nlpar = "a1", class = "sds", coef = "s(z)") 

m1 <- brm(bform, prior = bprior, data = dat, control = list(adapt_delta = 0.8),
          iter = 2000)

So, I have three questions. The first and most important is how do I specify the specific sds prior on s(z)? Although here it suffices to set the global sds prior, I will run in to situations where this is no longer the case.

Secondly, the warning message is something I can safely ignore, right? The defaults are only there as placeholders if I do not add in my own priors, if I understand correctly.

And thirdly, am I correct in thinking that the additional terms in non-linear equations should be interpreted as substitutions multiplied by a factor?

Finally, my session info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252    LC_MONETARY=Dutch_Belgium.1252
[4] LC_NUMERIC=C                   LC_TIME=Dutch_Belgium.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] brms_2.14.4     Rcpp_1.0.5      forcats_0.5.0   stringr_1.4.0   dplyr_1.0.2     purrr_0.3.4     readr_1.3.1    
 [8] tidyr_1.1.2     tibble_3.0.3    ggplot2_3.3.2   tidyverse_1.3.0

loaded via a namespace (and not attached):
  [1] readxl_1.3.1         backports_1.1.9      plyr_1.8.6           imputeTS_3.1         igraph_1.2.6        
  [6] sp_1.4-2             splines_4.0.2        crosstalk_1.1.0.1    inline_0.3.16        rstantools_2.1.1    
 [11] digest_0.6.25        htmltools_0.5.0      rsconnect_0.8.16     fansi_0.4.1          magrittr_1.5        
 [16] openxlsx_4.1.5       modelr_0.1.8         RcppParallel_5.0.2   matrixStats_0.56.0   xts_0.12.1          
 [21] sandwich_3.0-0       prettyunits_1.1.1    forecast_8.12        tseries_0.10-47      strucchange_1.5-1   
 [26] colorspace_1.4-1     blob_1.2.1           rvest_0.3.6          haven_2.3.1          xfun_0.16           
 [31] callr_3.4.3          crayon_1.3.4         jsonlite_1.7.0       lme4_1.1-23          zoo_1.8-8           
 [36] glue_1.4.2           gtable_0.3.0         V8_3.4.0             pkgbuild_1.1.0       car_3.0-9           
 [41] rstan_2.21.2         quantmod_0.4.17      abind_1.4-5          scales_1.1.1         stinepack_1.4       
 [46] mvtnorm_1.1-1        DBI_1.1.0            rstatix_0.6.0        miniUI_0.1.1.1       xtable_1.8-4        
 [51] gridtext_0.1.1       foreign_0.8-80       StanHeaders_2.21.0-6 stats4_4.0.2         DT_0.15             
 [56] htmlwidgets_1.5.1    httr_1.4.2           threejs_0.3.3        ellipsis_0.3.1       pkgconfig_2.0.3     
 [61] loo_2.3.1            nnet_7.3-14          dbplyr_1.4.4         tidyselect_1.1.0     rlang_0.4.7         
 [66] reshape2_1.4.4       later_1.1.0.1        munsell_0.5.0        cellranger_1.1.0     tools_4.0.2         
 [71] cli_2.0.2            generics_0.0.2       broom_0.7.0          ggridges_0.5.2       evaluate_0.14       
 [76] fastmap_1.0.1        yaml_2.2.1           processx_3.4.4       knitr_1.29           fs_1.5.0            
 [81] zip_2.1.1            nlme_3.1-148         mime_0.9             projpred_2.0.2       bfast_1.6.0.9000    
 [86] xml2_1.3.2           shinythemes_1.1.2    compiler_4.0.2       bayesplot_1.7.2      rstudioapi_0.11     
 [91] curl_4.3             gamm4_0.2-6          ggsignif_0.6.0       reprex_0.3.0         statmod_1.4.34      
 [96] stringi_1.4.6        ps_1.4.0             Brobdingnag_1.2-6    lattice_0.20-41      Matrix_1.2-18       
[101] nloptr_1.2.2.2       markdown_1.1         shinyjs_2.0.0        urca_1.3-0           vctrs_0.3.3         
[106] pillar_1.4.6         lifecycle_0.2.0      lmtest_0.9-37        bridgesampling_1.0-0 data.table_1.13.0   
[111] raster_3.3-13        httpuv_1.5.4         R6_2.4.1             promises_1.1.1       gridExtra_2.3       
[116] rio_0.5.16           codetools_0.2-16     boot_1.3-25          colourpicker_1.1.0   MASS_7.3-51.6       
[121] gtools_3.8.2         assertthat_0.2.1     withr_2.2.0          shinystan_2.5.0      fracdiff_1.5-1      
[126] mgcv_1.8-33          parallel_4.0.2       hms_0.5.3            ggtext_0.1.0         quadprog_1.5-8      
[131] grid_4.0.2           timeDate_3043.102    coda_0.19-4          minqa_1.2.4          rmarkdown_2.5       
[136] carData_3.0-4        TTR_0.24.2           ggpubr_0.4.0         shiny_1.5.0          lubridate_1.7.9     
[141] base64enc_0.1-3      dygraphs_1.1.1.6     tinytex_0.25

Hi MJvdB,

Welcome to the forums!

For #2, yes, it is simply a warning that pops up when the user or default prior will be unused.

For #1 I believe that is a bug in brms. I couldn’t find anything about it on github, but before opening an issue there we should probably have @paul.buerkner confirm whether or not is is a bug.

An even simpler reprex would be the linear case:

 x <- 1:100 / 100
z <- rnorm(100)
y <- 0.5 + (x > 0.5) * 2 + exp(z)

dat <- tibble(x = x, y = y + rnorm(length(y), 0, 0.2), z = z) 

bprior <- set_prior("normal(0, 2)", class="b", coef="sz_1")+
  set_prior("normal(0, 3)", class="b", coef="sx_1")+
  set_prior("normal(0, 2)", class="sds", coef="s(z)")+
  set_prior("normal(0, 3)", class="sds", coef="s(x)")+
  set_prior("normal(0, 2)", class="sigma")+
  set_prior("normal(0, 2)", class="Intercept")

bform <- bf(
  y ~ 1+s(z)+s(x)
)
sdata=make_stancode(bform, data = dat,prior = bprior)

If you look at the stancode it is clear that sds_1_1 and sds_2_1 are defined as positive real numbers, but not vectors:

parameters {
  real Intercept;  // temporary intercept for centered predictors
  vector[Ks] bs;  // spline coefficients
  // parameters for spline s(z)
  // standarized spline coefficients
  vector[knots_1[1]] zs_1_1;
  real<lower=0> sds_1_1;  // standard deviations of spline coefficients        ### here
  // parameters for spline s(x)
  // standarized spline coefficients
  vector[knots_2[1]] zs_2_1;
  real<lower=0> sds_2_1;  // standard deviations of spline coefficients     ### and here
  real<lower=0> sigma;  // residual SD
}

And yet they are indexed as if they are vectors, which doesn’t happen with the default prior or a user defined global prior

model {
  // likelihood including all constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + rep_vector(0.0, N) + Xs * bs + Zs_1_1 * s_1_1 + Zs_2_1 * s_2_1;
    target += normal_lpdf(Y | mu, sigma);
  }
  // priors including all constants
  target += normal_lpdf(Intercept | 0, 2);
  target += normal_lpdf(bs[1] | 0, 2);
  target += normal_lpdf(bs[2] | 0, 3);
  target += normal_lpdf(sds_1_1[1] | 0, 2)    ### here
    - 1 * normal_lccdf(0 | 0, 2);
  target += std_normal_lpdf(zs_1_1);
  target += normal_lpdf(sds_2_1[1] | 0, 3)    ### and here
    - 1 * normal_lccdf(0 | 0, 3);
  target += std_normal_lpdf(zs_2_1);
  target += normal_lpdf(sigma | 0, 2)
    - 1 * normal_lccdf(0 | 0, 2);
}

I even tried to bypass this by feeding a prior of the form:

prior_string("target += normal_lpdf(sds_1_1 | 2, 0.5)
    - 1 * normal_lccdf(0 | 2, 0.1)", check=F)

But that doesn’t help as the global prior still gets applied.

So we likely need input from a brms developer and you may need to open an issue on github.

2 Likes

From the op I guess they used a non-linear formula. But s(x) is not to be used there but in linear formulas. See the brms_multilevel vignette for examples.

Thanks Paul,

There may be a separate issue with the non-linear syntax, but I was able to reproduce it with a linear example above too. Unless I’m way off it does seem that defining a specific prior on the s(x) term rather than as a global sds prior results in the error OP posted. In all the ‘GAMs with brms’ examples and in the checks I could find in the brms github only global sds priors are applied…

Hugo

Oh I see. Thanks for the reprex. Will check today.

Thanks for replying!

I understand that I am not to use s(x) in the non-linear part, and therefore I placed it in the ‘flist’ argument. I understand that this is valid, and the results I get are what I expect. However, I fully admit to not 100% understanding the way non-linear equations are handled, so any criticisms are more than welcome.

I also checked the linear example and get the same problems when specifying the ‘local’ prior.

Cheers,
Martinus

The reported bug should now be fixed on github.

4 Likes

Thanks Paul, as MJvdB mentioned we all love brms and appreciate your responsiveness!