Using rstan::optimizing() with a brmsfit object

Is there an easy way to use rstan::optimizing() with a brmsfit object? My naive approach fails, which suggests that brms preprocesses and renames the variables in the data.

compiled_model <- stan_model(model_code = brms_fit$model)
rstan::optimizing(compiled_model, data = brms_fit$data, hessian = TRUE)

> Warning in FUN(X[[i]], ...) :
  data with name aa is not numeric and not used
> Warning in FUN(X[[i]], ...) :
  data with name codon_index is not numeric and not used
> Warning in FUN(X[[i]], ...) :
  data with name g_id is not numeric and not used
> Error : Exception: variable does not exist; processing stage=data initialization; variable name=N; base type=int (in 'anon_model', line 50, column 2 to column 17)
failed to create the optimizer; optimization not done

You can use make_stancode() and make_standata() to create the Stan program and data from the brms formula and R data frame. Here’s a simple example:

library(brms)
library(rstan)
code <- make_stancode(mpg ~ wt, data = mtcars)
data <- make_standata(mpg ~ wt, data = mtcars)

mod <- stan_model(model_code = code)
fit <- optimizing(mod, data = data)

That should get optimizing() to run (if your model is suitable for optimization, otherwise it will fail), but the parameter estimates probably won’t be named the way they are in standard brms output because brms does some post-processing.

Perfect. Do you by chance know if there’s a way to use this output for the init argument in brm()?

I think it’s probably possible, but it would require some annoying code to reformat the output into the format required by the init argument.
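For completeness, a rough sketch of what that reformatting could look like (untested; assumes all parameters are scalars, so indexed parameters like `b[1]`, `b[2]` would still need to be regrouped into vectors/arrays by hand):

```r
# Hypothetical sketch: rstan::optimizing() returns a named vector in opt$par,
# while rstan's init argument wants one named list of parameter values per chain.
opt <- rstan::optimizing(compiled_model, data = standata)
n_chains <- 4
init_list <- rep(list(as.list(opt$par)), n_chains)
# fit <- sampling(compiled_model, data = standata, chains = n_chains, init = init_list)
```

The regrouping of indexed names like `b[1]` into proper vectors and arrays is the annoying part.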

However, I think it might be much easier if you’re open to using brms with the cmdstanr backend instead of the rstan backend. Based on the brm() doc for the init argument, it seems like the inits are just processed by the backend. If that’s true, then (thanks to @stevebronder) recent versions of cmdstanr allow passing a fitted model object as inits (whereas rstan requires reformatting the output into lists of lists).

I’ve never tried specifying inits that way for brms (so no guarantees it works), but using cmdstanr it might be quite easy:

library(brms)
library(cmdstanr)
code <- make_stancode(mpg ~ wt, data = mtcars)
data <- make_standata(mpg ~ wt, data = mtcars)

mod <- cmdstan_model(write_stan_file(code))
fit_opt <- mod$optimize(data = data) 

fit_brms <- brm(mpg ~ wt, data = mtcars, backend = "cmdstanr", init = fit_opt)

If that works then it should also be possible to use any model fitting method supported by cmdstanr to generate the inits. Using mod$optimize() will result in a single initial value being used. If you instead use mod$laplace() or mod$pathfinder() you should be able to get different initial values for the different MCMC chains when you run brms. For example:

fit_laplace <- mod$laplace(data = data) 

fit_brms <- brm(mpg ~ wt, data = mtcars, backend = "cmdstanr", init = fit_laplace)

Let me know if that works; I’ve never tried it beyond the toy example above.

Jonah, thanks for your help on this. I am finding it very helpful in deepening my understanding of R and brms.

Unfortunately, I am running into errors with both approaches. I believe the init values created by the methods are not formatted correctly.

Here’s what I get (note: the model was compiled with `backend = "cmdstanr"`):

> code <- stancode(model)  # uses brms object
> data <- standata(model)
> mod <- cmdstan_model(write_stan_file(code)) # write the Stan code to a temporary file
Model executable is up to date!
> fit_opt <- mod$optimize(data = data, iter = 1E4) 
Initial log joint probability = -3.18939e+06 
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
Error evaluating model log probability: Non-finite gradient. 
      99      -17562.2      0.989999       855.474      0.4151           1      132    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     199        -16881      0.440666       83.7803           1           1      250    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     299      -16789.7       3.65428       411.479           1           1      363    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     399      -16706.1       0.18111       202.882      0.0573           1      471    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     499      -16545.1       1.61231       135.575           1           1      584    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     599        -16479     0.0118606       27.9198           1           1      694    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     699      -16474.9    0.00416162        6.2371           1           1      808    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     799      -16472.7     0.0569348        2.3115           1           1      926    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     858      -16471.8    0.00773682      0.402875           1           1      992    
Optimization terminated normally:  
  Convergence detected: relative gradient magnitude is below tolerance 
Finished in  3.3 seconds.
> model2  <- update(model, backend = "cmdstanr", init = fit_opt)
Start sampling
Running MCMC with 4 sequential chains, with 5 thread(s) per chain...

Chain 1 Unrecoverable error evaluating the log probability at the initial value.
Chain 1 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 1 finished unexpectedly!

Chain 2 Unrecoverable error evaluating the log probability at the initial value.
Chain 2 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 2 finished unexpectedly!

Chain 3 Unrecoverable error evaluating the log probability at the initial value.
Chain 3 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 3 finished unexpectedly!

Chain 4 Unrecoverable error evaluating the log probability at the initial value.
Chain 4 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 4 finished unexpectedly!

Warning: All chains finished unexpectedly! Use the $output(chain_id) method for more information.

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning: No chains finished successfully. Unable to retrieve the fit.
Error: Fitting failed. Unable to retrieve the metadata.

This is what I get when I try using the Laplace output:


fit_laplace <- mod$laplace(data = data, opt_args = list(iter = 1E4), init = 0)
model3 <- update(model, backend = "cmdstanr",  init = fit_laplace)
Initial log joint probability = -84738.9 
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
      99      -17902.9      0.117577        224.32           1           1      110    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     199      -17841.9     0.0557256         28.49           1           1      217    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     299      -17830.3     0.0161296       68.4053      0.3937           1      329    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     399        -17828     0.0207503       43.5726      0.8657      0.8657      435    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     499      -17827.1    0.00670058       12.3503           1           1      537    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     599      -17826.5    0.00266746       12.3755           1           1      644    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     699      -17825.5     0.0026736        6.6563           1           1      747    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     799      -17825.3    0.00258842       5.34482           1           1      859    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     899      -17825.2    0.00285119       19.5476      0.3879      0.3879      964    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
     999        -17825    0.00127582       5.40387           1           1     1071    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1099      -17824.9    0.00489369       5.29319           1           1     1178    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1199      -17824.8   0.000392955        3.2629           1           1     1288    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1299      -17824.8   0.000342069       1.87174           1           1     1397    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1399      -17824.7   0.000177246       5.78395      0.2039      0.2039     1506    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1499      -17824.7   0.000899042       1.85878      0.8683      0.8683     1611    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1599      -17824.6     0.0042151        6.3119       0.669       0.669     1716    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1699      -17824.5   0.000239514       4.90635      0.7316      0.7316     1826    
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes  
    1762      -17824.5    0.00016997       1.69032           1           1     1892    
Optimization terminated normally:  
  Convergence detected: relative gradient magnitude is below tolerance 
Finished in  5.1 seconds.
Calculating Hessian 
Calculating inverse of Cholesky factor 
Generating draws 
iteration: 0 
iteration: 100 
iteration: 200 
iteration: 300 
iteration: 400 
iteration: 500 
iteration: 600 
iteration: 700 
iteration: 800 
iteration: 900 
Finished in  5.8 seconds.
> model3 <- update(model, backend = "cmdstanr",  init = fit_laplace)
Start sampling
Running MCMC with 4 sequential chains, with 5 thread(s) per chain...

Chain 1 Unrecoverable error evaluating the log probability at the initial value.
Chain 1 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 1 finished unexpectedly!

Chain 2 Unrecoverable error evaluating the log probability at the initial value.
Chain 2 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 2 finished unexpectedly!

Chain 3 Unrecoverable error evaluating the log probability at the initial value.
Chain 3 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 3 finished unexpectedly!

Chain 4 Unrecoverable error evaluating the log probability at the initial value.
Chain 4 Exception: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=z_1; dims declared=(1,7); dims found=(7) (in '/tmp/Rtmp3uiv2e/model-304fba437c45ec.stan', line 78, column 2 to column 29)
Warning: Chain 4 finished unexpectedly!

Warning: All chains finished unexpectedly! Use the $output(chain_id) method for more information.

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning: No chains finished successfully. Unable to retrieve the fit.
Error: Fitting failed. Unable to retrieve the metadata.

Any help would be appreciated.

Likely the same issue as described here: Using Pathfinder or other method to set initial values for sampling

And fixed here: Fixes 975 by only removing leftmost array dimension if equal to 1 by SteveBronder · Pull Request #993 · stan-dev/cmdstanr · GitHub


Thanks for following up! If you install the development version of cmdstanr from GitHub it should include that fix you linked to. Does that resolve the problem? To install it you can use

remotes::install_github("stan-dev/cmdstanr")

Yup, updating cmdstanr fixed the issue. I do have one last question. Does the optimization take the prior into account?

If not, is there a way to do so?

Right now optimizing takes seconds while model fitting with rstan takes hours.

Ok great!

Yes, the objective function being optimized includes the prior.


Hmmmm… it only takes seconds to optimize my non-linear model to the data, but hours to fit using NUTS.

I’m curious to compare the posteriors between the numeric calculation of the Hessian and the Stan sampler. Are you aware of any pre-existing methods in the brms/cmdstanr toolkit to use the Hessian from the optimization to estimate the posterior intervals?

Oh wait, I now see there appear to be posterior intervals in the Laplace output, i.e.

variable          mean    median     sd    mad        q5       q95
lp__         -38592.74 -38589.20  23.11  20.09 -38635.31 -38564.30
lp_approx__    -110.21   -110.00   9.99   9.34   -128.11    -94.50
b_int[1]          0.35      0.35   0.04   0.04      0.29      0.41
b_int[2]         -1.18     -1.20   0.33   0.33     -1.71     -0.64
b_int[3]         -0.34     -0.35   0.08   0.08     -0.46     -0.22
…


Is there a way to get the 2.5% and 97.5% intervals?

fit$summary(variables = NULL, function(x) quantile(x, probs = c(0.025, 0.975)))

Or you can extract the draws using fit$draws() and then summarize them using posterior::summarise_draws() or base R functions.
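For example, something like this (a sketch, assuming `fit` is the Laplace fit object):

```r
library(posterior)

# Extract the approximate posterior draws and summarize them,
# including the 2.5% and 97.5% quantiles
draws <- fit$draws()
summarise_draws(draws, mean, ~quantile2(.x, probs = c(0.025, 0.975)))
```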

Hi Jonah,

I greatly appreciate your help with this. It’s definitely deepening my understanding of cmdstanr and R objects in general.

Unfortunately, here’s what I get when I try the suggested code

> fit$summary(variables = NULL, function(x) quantile(x, probs = c(0.025, 0.975)))
Error in `summary[, c("variable", "mean")]`:
! Can't subset columns that don't exist.
✖ Column `mean` doesn't exist.
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/vctrs_error_subscript_oob>
Error in `summary[, c("variable", "mean")]`:
! Can't subset columns that don't exist.
✖ Column `mean` doesn't exist.
---
Backtrace:
    ▆
 1. └─fit$summary(...)
 2.   ├─summary[, c("variable", "mean")]
 3.   └─tibble:::`[.tbl_df`(summary, , c("variable", "mean"))

If I try the code in ?fit-method-summary I get,

> tmp <- fit$summary(variables = NULL, posterior::default_summary_measures()[1:4], quantiles = ~ posterior::quantile2(., probs = c(0.025, 0.975)), posterior::default_convergence_measures())
> str(tmp)
drws_smm [222 Ă— 2] (S3: draws_summary/tbl_df/tbl/data.frame)
 $ variable: chr [1:222] "lp__" "b_int[1]" "b_int[2]" "b_int[3]" ...
 $ estimate: num [1:222] -3.69e+04 2.37e-01 9.91e-01 7.87e-02 -5.32e-02 ...
 - attr(*, "num_args")= list()
> tmp
# A tibble: 222 Ă— 2
   variable    estimate
   <chr>          <dbl>
 1 lp__     -36890.    
 2 b_int[1]      0.237 
 3 b_int[2]      0.991 
 4 b_int[3]      0.0787
 5 b_int[4]     -0.0532
 6 b_int[5]      0.0168
 7 b_int[6]     -0.626 
 8 b_int[7]     -0.0319
 9 b_int[8]     -0.216 
10 b_int[9]      0.293 
# ℹ 212 more rows
# ℹ Use `print(n = ...)` to see more rows

On a related question: you suggest using the draws, but as I understand it the fit object created by mod$optimize() contains only a single draw. Is there a way to do, say, rejection sampling using the optimize() object?


Oh I was thinking about the Laplace one, which has multiple draws. Instead of trying to do some sort of rejection sampling with the optimization object, I would just use the Laplace one. We added the Laplace method specifically to let people get approximate posterior draws after running optimization.
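If you want both, here’s a sketch of that workflow. I believe recent cmdstanr versions let you pass the optimization result as the mode for `$laplace()` via a `mode` argument (treat that, and the need for `jacobian = TRUE`, as assumptions to check against your cmdstanr version):

```r
# Sketch: optimize first, then run the Laplace approximation around that mode.
# jacobian = TRUE is needed so the mode matches what laplace() expects.
fit_mode <- mod$optimize(data = data, jacobian = TRUE)
fit_laplace <- mod$laplace(data = data, mode = fit_mode)
draws <- fit_laplace$draws()  # approximate posterior draws
```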