Model not updated with change in formula

I keep running into this problem where I modify a model, but when I re-run the code it’s not recompiled. Consistent with this, the model object doesn’t reflect the updates.

My understanding is that the model should recompile if I change the model’s formula, but I can only get the model to recompile if I remove the model file defined in the file = <FILE> argument from the file system.

For example, I initially defined a model as follows:

file_name = "yeast_2c_est_phi"

yeast_2c_est_phi <- brm(
    bf(
    codon_index ~ inv_logit(dM + deta * Phi),
    dM ~ 0 + aa,
    deta ~ 0 + aa,
    Phi ~ gene_id,
    nl = TRUE
  ),
  prior(normal(0, 1), nlpar = "dM") +
    prior(normal(0, 1), nlpar = "deta") +
    prior(lognormal(0, 3), nlpar = "Phi", lb = 0),
  family = bernoulli(link = "identity"),
  data = ldata_2c[1:1000, ],
  init = 0,
  threads = 1, 
  chains = 4,
  cores = 4,
  iter = 0,
  save_model = file.path("stan.code", file_name),
  file = file.path("models", file_name),
  backend = "cmdstanr"
)

The model compiled, but then I realized I wanted to use Phi ~ 0 + gene_id rather than Phi ~ gene_id, so I updated the code to:

yeast_2c_est_phi <- brm(
    bf(
    codon_index ~ inv_logit(dM + deta * Phi),
    dM ~ 0 + aa,
    deta ~ 0 + aa,
    Phi ~ 0 + gene_id,
    nl = TRUE
  ),
  prior(normal(0, 1), nlpar = "dM") +
    prior(normal(0, 1), nlpar = "deta") +
    prior(lognormal(0, 3), nlpar = "Phi", lb = 0),
  family = bernoulli(link = "identity"),
  data = ldata_2c[1:1000, ],
  init = 0,
  threads = 1, 
  chains = 4,
  cores = 4,
  iter = 0,
  save_model = file.path("stan.code", file_name),
  file = file.path("models", file_name),
  backend = "cmdstanr"
)

But when I submitted this code to the R kernel, it didn’t give any indication it recompiled the model. Indeed, if I print the model object I get

> yeast_2c_est_phi
 Family: bernoulli 
  Links: mu = identity 
Formula: codon_index ~ inv_logit(dM + deta * Phi) 
         dM ~ 0 + aa
         deta ~ 0 + aa
         Phi ~ gene_id
   Data: ldata_2c[1:1000, ] (Number of observations: 1000) 

The model does not contain posterior draws.

Which is weird. So I removed the model object using

> ls(yeast_2c_est_phi); rm(yeast_2c_est_phi); ls(yeast_2c_est_phi)
 [1] "algorithm" "autocor"   "backend"   "basis"     "cov_ranef" "criteria" 
 [7] "data"      "data.name" "data2"     "family"    "file"      "fit"      
[13] "formula"   "model"     "opencl"    "prior"     "ranef"     "save_pars"
[19] "stan_args" "stan_funs" "stanvars"  "threads"   "version"  
Warning in ls(yeast_2c_est_phi) :
  ‘yeast_2c_est_phi’ converted to character string
Error in as.environment(pos) : 
  no item called "yeast_2c_est_phi" on the search list

But if I reevaluated

yeast_2c_est_phi <- brm(
    bf(
    codon_index ~ inv_logit(dM + deta * Phi),
    dM ~ 0 + aa,
    deta ~ 0 + aa,
    Phi ~ 0 + gene_id,
    nl = TRUE
  ),
  prior(normal(0, 1), nlpar = "dM") +
    prior(normal(0, 1), nlpar = "deta") +
    prior(lognormal(0, 3), nlpar = "Phi", lb = 0),
  family = bernoulli(link = "identity"),
  data = ldata_2c[1:1000, ],
  init = 0,
  threads = 1, 
  chains = 4,
  cores = 4,
  iter = 0,
  save_model = file.path("stan.code", file_name),
  file = file.path("models", file_name),
  backend = "cmdstanr"
)

it still gave no indication that it had recompiled the model. Indeed, if I print the model object I still get

> yeast_2c_est_phi
 Family: bernoulli 
  Links: mu = identity 
Formula: codon_index ~ inv_logit(dM + deta * Phi) 
         dM ~ 0 + aa
         deta ~ 0 + aa
         Phi ~ gene_id
   Data: ldata_2c[1:1000, ] (Number of observations: 1000) 

The model does not contain posterior draws.

I can only get the model to recompile if I remove the model file defined in the file = file.path("models", file_name) argument.

My understanding is that the model should recompile if I replace Phi ~ gene_id with Phi ~ 0 + gene_id, but perhaps I’m missing something. Is the 0 superfluous if you’re working with a non-linear model?

sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  datasets  grDevices utils     methods   base     

other attached packages:
 [1] lme4_1.1-35.3    Matrix_1.7-0     rlang_1.1.3      stringr_1.5.1   
 [5] nnet_7.3-19      brms_2.21.0      Rcpp_1.0.12      cmdstanr_0.8.1  
 [9] popbio_2.8       bayess_1.6       combinat_0.0-8   gplots_3.1.3.1  
[13] mnormt_2.1.1     readr_2.1.5      tidyr_1.3.1      purrr_1.0.2     
[17] dplyr_1.1.4      gridExtra_2.3    ggplot2_3.5.1    testthat_3.2.1.1
[21] devtools_2.4.5   usethis_2.2.3    default_1.0.0

As long as the file specified in the file argument is on disk, brm() will load that file no matter what else you do. If you want it to be more adaptive, you can use file_refit = "on_change".
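As a rough sketch of that suggestion: the wrapper below passes file and file_refit = "on_change" through to brm(). The name fit_cached and its path argument are hypothetical, introduced only for illustration; file and file_refit are the brm() arguments discussed above.

```r
# Hypothetical helper (not part of brms): fit a model with an on-disk cache
# that is refitted when brms detects a change in the model or the data.
fit_cached <- function(formula, data, path, ...) {
  brms::brm(
    formula,
    data = data,
    file = path,               # cached fit is stored at <path>.rds
    file_refit = "on_change",  # the default is "never": always reuse the file
    ...
  )
}

# Usage (commented out because it compiles and samples a Stan model):
# fit <- fit_cached(count ~ zAge + (1 | patient), brms::epilepsy,
#                   path = file.path("models", "epilepsy_demo"),
#                   family = poisson())
```

With file_refit = "always", the cached file is ignored and overwritten on every call, which can be handy while a model is still being edited.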

Thanks for the response (and the package!)

I’ve tried using "on_change", and it doesn’t seem to change anything, but let me leave that for another day.

Sticking to the current point of confusion: I get the same failure to recompile even when the file argument is commented out. I’ve even tried removing the object itself, but to no avail. Here’s an example:

> rm(yeast_2c_est_phi)
> ls(pattern="yeast_2c_est_phi")
[1] "yeast_2c_est_phi_4000"
> yeast_2c_est_phi <- brm(
+     bf(
+     codon_index ~ inv_logit(dM + deta * Phi),
+     dM ~ 0 + aa,
+     deta ~ 0 + aa,
+     Phi ~ 0 + gene_id,
+     nl = TRUE
+   ),
+   prior(normal(0, 1), nlpar = "dM") +
+     prior(normal(0, 1), nlpar = "deta") +
+     prior(lognormal(0, 3), nlpar = "Phi", lb = 0),
+   family = bernoulli(link = "identity"),
+   data = ldata_2c[1:10000, ],
+   init = 0,
+   threads = threading(2), #threads within a chain
+   chains = 0,
+   cores = 4,
+   iter = 0,
+   # save_model = file.path("stan.code", file_name),
+   # file = file.path("models", paste0(file_name, "_empty")),
+   backend = "cmdstanr"
+ )
Model executable is up to date!
Start sampling
Running MCMC with 1 chain, with 2 thread(s) per chain...

Chain 1 WARNING: No variance estimation is 
Chain 1          performed for num_warmup < 20 
Chain 1 Iteration: 1 / 2 [ 50%]  (Warmup) 
Chain 1 Iteration: 2 / 2 [100%]  (Sampling) 
Chain 1 finished in 0.0 seconds.
Warning: 1 of 1 (100.0%) transitions ended with a divergence.
See https://mc-stan.org/misc/warnings for details.

Warning: E-BFMI not computed because it is undefined for posterior chains of length less than 3.
> ls(pattern="yeast_2c_est_phi")
[1] "yeast_2c_est_phi"      "yeast_2c_est_phi_4000"

I realize this is most likely a screw up on my end, so I greatly appreciate any insight you might be able to provide.

I am sorry, I am not sure I fully understand your problem in this case. Can you provide a minimal reprex that I can just execute to see the problem you are facing?

You are asking for 0 iterations. There is no error, the model is just complaining it can’t calculate some checks without iterations.

I don’t think that’s the problem. I get the same behavior when iter > 0.
The issue is that I’ve seemingly removed the model from memory via rm(yeast_2c_est_phi), and yet when I issue the yeast_2c_est_phi <- brm(...) command, R tells me "Model executable is up to date!" This is what I don’t understand.

So I believe I understand part of the issue. For each session R creates a temporary folder, e.g. /tmp/Rtmp1234. I can force the code to recompile after a change if I delete some/all of the objects in this folder.

So now my question is, where/what are the R objects that refer to these files?

Try attributes(yeast_2c_est_phi$fit)$CmdStanModel$exe_file(). It’s where cmdstan(r?) puts your compiled models.
You can change the path via options(cmdstanr_write_stan_file_dir = path).
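The session temporary folder mentioned above can also be inspected from R without a fitted model. This is a minimal base-R sketch; it assumes the Stan files are written to tempdir() unless the cmdstanr_write_stan_file_dir option from the previous line has been set, and the variable names are only illustrative.

```r
# Directory where the Stan files are assumed to be written: the option
# mentioned above if it is set, otherwise the per-session temp folder.
stan_dir <- getOption("cmdstanr_write_stan_file_dir", tempdir())
print(stan_dir)

# List whatever is currently stored there (may be empty in a fresh session).
stan_files <- list.files(stan_dir, full.names = TRUE)
print(stan_files)
```

Because rm() only removes the R object, anything listed here survives it, which would be consistent with the "Model executable is up to date!" message appearing after the object was removed.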

I apologize for not having an MWE, but from what I can tell there are two things going on.

First, that my modifications of the model formula (Phi ~ gene_id vs Phi ~ 0 + gene_id) didn’t actually change the model code, hence it didn’t recompile.

Second, file_refit = "on_change" doesn’t pay attention to changes in iter or warmup.

This is documented in ?brm:

Refit will not be triggered for changes in additional parameters of the fit (e.g., initial values, number of iterations, control arguments, …).
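The first point above (that swapping Phi ~ gene_id for Phi ~ 0 + gene_id leaves the Stan code unchanged) can be checked without compiling anything by comparing the output of make_stancode() for the two formulas. The sketch below uses a small made-up data frame, toy, as a stand-in for ldata_2c, so its column values are assumptions; it prints whether the two programs are identical rather than assuming the answer, and uses base R's model.matrix() to show how the design-matrix columns differ.

```r
library(brms)

# Made-up stand-in for ldata_2c (assumption: binary response, two factors).
toy <- data.frame(
  codon_index = rep(c(0, 1), times = 30),
  aa = factor(rep(c("A", "C", "D"), times = 20)),
  gene_id = factor(rep(c("g1", "g2", "g3"), each = 20))
)

priors <- prior(normal(0, 1), nlpar = "dM") +
  prior(normal(0, 1), nlpar = "deta") +
  prior(lognormal(0, 3), nlpar = "Phi", lb = 0)

# Stan code generated with an intercept in the Phi formula ...
code_intercept <- make_stancode(
  bf(codon_index ~ inv_logit(dM + deta * Phi),
     dM ~ 0 + aa, deta ~ 0 + aa, Phi ~ gene_id, nl = TRUE),
  data = toy, family = bernoulli(link = "identity"), prior = priors
)

# ... and without one.
code_no_intercept <- make_stancode(
  bf(codon_index ~ inv_logit(dM + deta * Phi),
     dM ~ 0 + aa, deta ~ 0 + aa, Phi ~ 0 + gene_id, nl = TRUE),
  data = toy, family = bernoulli(link = "identity"), prior = priors
)

# If this prints TRUE, the formula change only affects the data passed to Stan.
same_code <- identical(as.character(code_intercept),
                       as.character(code_no_intercept))
print(same_code)

# The design matrices differ either way: treatment contrasts vs. one column per gene.
cols_intercept <- colnames(model.matrix(~ gene_id, toy))
cols_no_intercept <- colnames(model.matrix(~ 0 + gene_id, toy))
print(cols_intercept)
print(cols_no_intercept)
```

If the two programs do turn out to be identical, the parameterization still changes through the data, so the fit has to be rerun even though nothing is recompiled.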