CmdStan argument for different initial files for corresponding chains

In cross-chain warmup I need to allow different chains to have different init points. @rok_cesnovar @mitzimorris @bbbales2 @avehtari what’s your suggestion? I’m thinking that, in addition to init=filename, we could allow init=directory name, pointing to a directory from which CmdStan can read multiple init files.

And the directory would have some standard way of naming the files?

What about a comma separated list of files? Would that work?

Something like that is my preference

What if instead of 4 chains I have 12 chains, and so 12 init files?

Good point!

Hm, how about a wildcard?

init=subfolder/*.data.R

I like this. Now I need to figure out how to do that in C++.

Boost filesystem would be my choice. C++17 filesystem is based on it I think.

There are a few ideas here: https://stackoverflow.com/questions/1257721/can-i-use-a-mask-to-iterate-files-in-a-directory-with-boost

is this about calling CmdStan via CmdStanR?
CmdStanPy allows the following options for the inits argument:

  • inits – Specifies how the sampler initializes parameter values. Initialization is either uniform random on a range centered on 0, exactly 0, or a dictionary or file of initial values for some or all parameters in the model. The default initialization behavior will initialize all parameter values on the range [-2, 2] on the unconstrained support. If the expected parameter values are too far from this range, this option may improve adaptation. The following value types are allowed:
    • Single number n > 0 - initialization range is [-n, n].
    • 0 - all parameters are initialized to 0.
    • dictionary - pairs parameter name : initial value.
    • string - pathname to a JSON or Rdump data file.
    • list of strings - per-chain pathname to data file.

https://cmdstanpy.readthedocs.io/en/latest/api.html#cmdstanpy.CmdStanModel.sample

ignore above - you’re talking about something in Torsten?

just added something to CmdStan that has to parse a comma-separated list - boost::algorithm::split is your friend - https://github.com/stan-dev/cmdstan/blob/465a2d2491e303d19fd06750a0a720299ef551dd/src/cmdstan/stansummary_helper.hpp#L193-L196

Thanks. I’m talking about cmdstan, as I’m working on a branch that tests cross-chain warmup.

CmdStan already allows a string value for arg “init”, so passing a comma-separated string of filenames and using the split on that string should work.
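
As a rough sketch of that split step: the version below uses std::getline from the standard library rather than boost::algorithm::split, only to keep the example free of dependencies. The function name `split_init_arg` is hypothetical; this is not the code in CmdStan.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not CmdStan code): split the value of `init=` on
// commas. A value without commas yields a single-element vector, so the
// existing single-file usage would keep working.
std::vector<std::string> split_init_arg(const std::string& arg) {
  std::vector<std::string> files;
  std::istringstream stream(arg);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (!item.empty()) files.push_back(item);  // drop empty entries ("a,,b")
  }
  return files;
}
```

With this, `split_init_arg("1.init.R,2.init.R,3.init.R,4.init.R")` gives four file names and `split_init_arg("init.R")` gives one. File names that themselves contain commas are not supported by this approach.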

you’re controlling the multiple chains from a single invocation of the command?
sorry for not keeping up with this one.

Yes

mpiexec -n 4 ./8school sample data file=8school.data.R init=init.R

runs 4 communicating chains using the same init.R, but I’d like to pass 1.init.R, 2.init.R, 3.init.R, and 4.init.R to the corresponding chains.
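
One way the per-chain selection could look once a list of init files is available, as a sketch only. It assumes each chain has a 0-based index (for instance its MPI rank, which is an assumption about the cross-chain branch and not something confirmed here). The name `init_file_for_chain` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not CmdStan code): pick the init file for a 0-based
// chain index. A single file is shared by all chains, which matches the
// current behaviour; otherwise there must be exactly one file per chain.
std::string init_file_for_chain(const std::vector<std::string>& files,
                                std::size_t chain, std::size_t num_chains) {
  if (chain >= num_chains) {
    throw std::out_of_range("chain index out of range");
  }
  if (files.size() == 1) return files[0];
  if (files.size() != num_chains) {
    throw std::invalid_argument("need one init file, or one per chain");
  }
  return files[chain];
}
```

For the 4-chain run above, the chain with index 2 would get `3.init.R` from the list `1.init.R, 2.init.R, 3.init.R, 4.init.R`, while a list holding only `init.R` would be used by every chain. Rejecting a mismatched count up front seems safer than silently reusing files, but that is a design choice.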