Initial values in CmdStanPy for sampling

Hello all,

if I have an array of parameters, let’s say:

parameters {
        real theta[N,M];
}

and I want to provide initial values for all theta as a dictionary from CmdStanPy, should every single theta be coded in the dictionary as {'theta.n.m': value, ...} or {'theta[n,m]': value, ...}, or something else? I tried both and neither worked…
Also, if I have multiple chains, should the dictionary have one entry per chain, with 'chain_id' as an additional argument?

Thanks!


Hi, I think you should provide theta as an n-dimensional array.

So with NumPy, create an N-by-M matrix

import numpy as np
theta = np.empty(shape=(N, M))  # fill with your initial values before saving

and then create json file

import json
# use .tolist() for numpy arrays, so json dump works
stan_data = {"theta": theta.tolist()}
with open("path/to/datafile.json", "w") as f:
    json.dump(stan_data, f)

Each chain needs its own file, so create one file per chain. I think you can then put each path in a dictionary where the chain_id is the key.

Ok, thanks! Just to make sure – same strategy applies for Stan vector and matrix parameter types as well, right?

Yes. (For vectors I think plain lists are fine; no need for two dims there.)
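For example, a minimal sketch (the parameter names beta and Sigma are hypothetical, standing in for a vector and a matrix parameter in your Stan program):

```python
import json
import numpy as np

# hypothetical names: beta would be a vector[3] parameter,
# Sigma a matrix[2,2] parameter in the Stan program
beta = np.zeros(3)
Sigma = np.eye(2)

# vectors serialize as flat lists, matrices as nested lists
inits = {"beta": beta.tolist(), "Sigma": Sigma.tolist()}
with open("inits.json", "w") as f:
    json.dump(inits, f)
```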


I define the final dictionary as follows:
inits = {"1": "path/to/datafile_chain1.json", "2": "path/to/datafile_chain2.json"}
and then fit=model.sample(....., inits=inits, chains=2,....)
but I get

RuntimeError: Error during sampling.
chain 1 returned error code 70
chain 2 returned error code 70

It occurs even if I set all initial values in the json files to 0, which should be equivalent to inits=0.

Ok, I need to run some tests


Ok, I think there must be some kind of error in data handling.

So you can pass a single init file but not multiple ones. What post-processing tools do you use? As a workaround, you can sample one chain at a time, changing the init file for each run.

import numpy as np
theta = np.zeros(shape=(N, M))

import json
# use .tolist() for numpy arrays, so json dump works
inits = {}
for i in range(1, 3):
    stan_inits = {"theta": theta.tolist()}  # change this bit
    path = "path/to/initfile_{}.json".format(i)
    with open(path, "w") as f:
        json.dump(stan_inits, f)
    inits[i] = path

and then call fit as

fits = []
for path in inits.values():
    fit = model.sample(inits=path)
    fits.append(fit)

You can combine the fits in ArviZ (see concat function for InferenceData)

https://arviz-devs.github.io/arviz/api/generated/arviz.concat.html?highlight=concat#arviz.concat
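As a hedged sketch of the concat step, using fake single-chain draws in place of real fits (in practice each InferenceData would come from az.from_cmdstanpy(fit) on a one-chain run; az.from_dict just stands in for that here):

```python
import numpy as np
import arviz as az

# fake single-chain fits: each 2D array is (chain, draw),
# standing in for az.from_cmdstanpy(fit) on a one-chain run
single_chain_idatas = [
    az.from_dict(posterior={"theta": np.random.randn(1, 100)})
    for _ in range(2)
]

# stack the single-chain results along the chain dimension
combined = az.concat(*single_chain_idatas, dim="chain")
```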

cc. @mitzimorris I think we should update all our input methods


Not sure if you remember, but since my output file is pretty big (larger than 2 GB), I am using a custom script you wrote and shared for handling huge CSV output (and thanks so much for that!). All the regular Stan post-processing tools fail at my output size.

Yes, looks like running one chain at a time is what I will do in the meanwhile. Is there a way to parallelize it? I mean, is there a problem if the same Stan model executable is called from different/separate jobs (chains, in this case)? I can always give each chain a differently named model, but I try to avoid ballooning the number of files I have.

Ok, then yeah, you already handle the files manually.

There should be no problem with calling the executable from different threads or processes.
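A hedged sketch of that pattern with a thread pool (the init file names are hypothetical, and a placeholder string stands in for the real model.sample call; since sampling runs in a CmdStan subprocess, Python threads don't contend over the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

# hypothetical per-chain init files
INIT_FILES = ["initfile_1.json", "initfile_2.json"]

def run_chain(init_path):
    # in a real script this would call
    #     model.sample(chains=1, inits=init_path)
    # which launches the CmdStan executable as a subprocess;
    # a placeholder stands in for the returned fit here
    return "fit for " + init_path

with ThreadPoolExecutor(max_workers=len(INIT_FILES)) as pool:
    fits = list(pool.map(run_chain, INIT_FILES))
```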

Thank you!
