I’m having a problem running a variational model. I get the error below when I run it on a Linux server, but the exact same code works perfectly fine on a Mac.
> barcode_model$variational(experiment_stan_data, threads = 12)
------------------------------------------------------------
EXPERIMENTAL ALGORITHM:
This procedure has not been thoroughly tested and may be unstable
or buggy. The interface is subject to change.
------------------------------------------------------------
Gradient evaluation took 0.522638 seconds
1000 transitions using 10 leapfrog steps per transition would take 5226.38 seconds.
Adjust your expectations accordingly!
Begin eta adaptation.
Iteration: 1 / 250 [ 0%] (Adaptation)
Iteration: 50 / 250 [ 20%] (Adaptation)
Warning: Fitting finished unexpectedly! Use the $output() method for more information.
Finished in 309.5 seconds.
Note that I’m also running with 12 threads. On the machine where I get this error, I didn’t see any threads being spawned in top. Turning off multithreading doesn’t change anything.
This is what I get from output():
method = variational
variational
algorithm = meanfield (Default)
meanfield
iter = 10000 (Default)
grad_samples = 1 (Default)
elbo_samples = 100 (Default)
eta = 1 (Default)
adapt
engaged = true (Default)
iter = 50 (Default)
tol_rel_obj = 0.01 (Default)
eval_elbo = 100 (Default)
output_samples = 1000 (Default)
id = 1 (Default)
data
file = <path>/standata-539dc49b86a.json
init = 2 (Default)
random
seed = 1912800785
output
file = <path>/barcode-sc-dynamics-202406172040-1-661654.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = <path>/barcode-sc-dynamics-profile-202406172040-1-3b728b.csv
save_cmdstan_config = false (Default)
num_threads = 12 (Default)
------------------------------------------------------------
EXPERIMENTAL ALGORITHM:
This procedure has not been thoroughly tested and may be unstable
or buggy. The interface is subject to change.
------------------------------------------------------------
Gradient evaluation took 0.522638 seconds
1000 transitions using 10 leapfrog steps per transition would take 5226.38 seconds.
Adjust your expectations accordingly!
Begin eta adaptation.
Iteration: 1 / 250 [ 0%] (Adaptation)
Iteration: 50 / 250 [ 20%] (Adaptation)
Did you try doing what the error message suggested and running the $output() method (I have no idea what it does, but it’s where I’d start)?
The root cause of the problem is that the ADVI algorithm as coded in Stan is quite unstable. Floating-point arithmetic in C++ is platform dependent, so it’s not unusual to see slightly different behavior on Linux and on the Mac, especially if it’s an M1/2/3 Mac.
You might want to
try a different seed,
try to provide better inits (or at least a smaller uniform range than (-2, 2) default),
try a range of fixed eta values rather than adapting (I think this is possible, but am not 100% sure), or
try Pathfinder VI (which I’m not 100% sure is available in cmdstanr yet).
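A rough sketch of how those suggestions might look in cmdstanr, reusing `barcode_model` and `experiment_stan_data` from the original post. The seed, the init radius of 0.5, and the eta grid are arbitrary illustrative values, not recommendations. As far as I know, `$variational()` accepts `eta` and `adapt_engaged`, and recent cmdstanr versions (0.7.0 and later, with CmdStan 2.33 or later) provide a `$pathfinder()` method; the `num_threads` argument name there is from memory, so check the method's help page before relying on it.

```r
library(cmdstanr)

# barcode_model and experiment_stan_data are the objects from the original post.

# 1. Different seed plus tighter inits: a single number x means
#    uniform(-x, x) on the unconstrained scale (the default is x = 2).
fit_init <- barcode_model$variational(
  data = experiment_stan_data,
  seed = 12345,   # arbitrary illustrative seed
  init = 0.5,     # assumed narrower range, tune for your model
  threads = 12
)

# 2. Skip eta adaptation and try a few fixed step sizes instead.
for (eta in c(0.01, 0.1, 0.2)) {   # illustrative grid
  fit_eta <- barcode_model$variational(
    data = experiment_stan_data,
    seed = 12345,
    init = 0.5,
    eta = eta,
    adapt_engaged = FALSE,
    threads = 12
  )
  # A non-zero return code means the run did not finish cleanly.
  print(fit_eta$return_codes())
}

# 3. Pathfinder, if your cmdstanr/CmdStan versions support it.
#    num_threads is assumed to be the threading argument for this method.
fit_pf <- barcode_model$pathfinder(
  data = experiment_stan_data,
  seed = 12345,
  init = 0.5,
  num_threads = 12
)
```

If one of the fixed-eta runs finishes on the Linux machine, that points toward the adaptation phase rather than the model itself.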
Multiple threads would only be spawned if you have parallelized code within your Stan program. I don’t know if CmdStan lets you run multiple optimizations in parallel.
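For reference, a minimal sketch of how threading is usually enabled in cmdstanr when the Stan program uses `map_rect` or `reduce_sum`. The file name below is hypothetical; the key assumption is that the model is compiled with `stan_threads = TRUE` on each machine, otherwise the parallel code runs serially.

```r
library(cmdstanr)

# Hypothetical path to the Stan program; substitute your own file.
stan_file <- "barcode-sc-dynamics.stan"

# Threading support has to be compiled in on every machine separately.
barcode_model <- cmdstan_model(
  stan_file,
  cpp_options = list(stan_threads = TRUE)
)

# experiment_stan_data is the data list from the original post.
fit <- barcode_model$variational(experiment_stan_data, threads = 12)
```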
Forget what I said about the threads. They are being spawned correctly; I’m using map_rect and it works fine. (I was mistakenly looking for processes rather than threads in the top output!) Could my use of map_rect be causing this problem (it works fine on the Mac)?