Sampling chains from the middle in cmdstanpy

Hello all,

I am trying to implement checkpointing in cmdstanpy and I want to make sure I am passing the right arguments to the next cycle. Based on these examples


and

one should pass the last iteration's values as init, the stepsize__ value as step_size, and set adapt_engaged to False.
But what would be the CmdStanPy equivalent of PyStan's inv_metric?

Thanks!

Ok, after a lot of digging, I think that the cmdstanpy metric argument is the inv_metric equivalent.
Is it possible to calculate it from the cmdstan output.csv file?

Hi, from the sample docstring:

:param metric: Specification of the mass matrix, either as a
            vector consisting of the diagonal elements of the covariance
            matrix ('diag' or 'diag_e') or the full covariance matrix
            ('dense' or 'dense_e').
            If the value of the metric argument is a string other than
            'diag', 'diag_e', 'dense', or 'dense_e', it must be
            a valid filepath to a JSON or Rdump file which contains an entry
            'inv_metric' whose value is either the diagonal vector or
            the full covariance matrix.
            If the value of the metric argument is a list of paths, its
            length must match the number of chains and all paths must be
            unique.

Also, the fit object has metric and stepsize methods.

Then you just need to write each chain's metric to its own file and pass the list of file paths.
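
A minimal sketch of that unpacking step, assuming the per-chain metrics are available as a sequence with one entry per chain (a vector for a diagonal metric, a nested list for a dense one). The helper name write_inv_metric_files and the file naming scheme are made up for illustration; the only part taken from the docstring above is that each file holds an 'inv_metric' entry and that the paths are unique.

```python
import json
from pathlib import Path


def write_inv_metric_files(metrics, out_dir):
    """Write one JSON file per chain, each holding a single 'inv_metric' entry.

    `metrics` is assumed to have one element per chain. Returns the list of
    file paths, in chain order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for chain_idx, inv_metric in enumerate(metrics, start=1):
        # Accept NumPy arrays as well as plain Python lists.
        if hasattr(inv_metric, "tolist"):
            values = inv_metric.tolist()
        else:
            values = list(inv_metric)
        # Hypothetical naming scheme; any unique path per chain would do.
        path = out_dir / f"inv_metric_chain_{chain_idx}.json"
        path.write_text(json.dumps({"inv_metric": values}))
        paths.append(str(path))
    return paths
```

The returned list could then be passed as the metric argument of sample, provided its length matches the number of chains.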

Thank you! But is it possible to calculate it from the output file?

Here’s the link to the docs: https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanModel.sample

The CmdStan docs have an (extremely simple) example: https://mc-stan.org/docs/2_24/cmdstan-guide/mcmc-config.html#specifying-the-metric-and-stepsize
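
On the question of recovering these values from the output file: as far as I know, the Stan CSV file records the adapted step size and the inverse metric as comment lines right after warmup. The sketch below assumes that layout ("# Step size = ..." followed by "# Diagonal elements of inverse mass matrix:" and one comment line of comma-separated values) and handles the diagonal case only. The function name read_adaptation is made up, so check the comment lines in your own files before relying on it.

```python
import re


def read_adaptation(csv_text):
    """Return (step_size, inv_metric_diag) parsed from Stan CSV comment lines.

    Either element is None if the corresponding comment line is not found.
    Assumes a diagonal metric.
    """
    step_size = None
    inv_metric = None
    lines = csv_text.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("#"):
            continue  # header row or a draw, not adaptation info
        match = re.match(r"#\s*Step size\s*=\s*(\S+)", line)
        if match:
            step_size = float(match.group(1))
        elif "Diagonal elements of inverse mass matrix" in line and i + 1 < len(lines):
            # The values are assumed to sit on the next comment line.
            values = lines[i + 1].lstrip("#").strip()
            inv_metric = [float(v) for v in values.split(",")]
    return step_size, inv_metric
```

Reading one chain's file would then look like read_adaptation(open(path).read()), where path is that chain's output CSV.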

Thank you! Just to make sure: all I need to pass to the next cycles is init, metric and step_size (if all other parameters were left at their defaults)?

Also set adapt_engaged=False and add a seed (and warmup_iters=0). Then test that it actually works.
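
Putting the pieces together, here is a rough sketch of collecting the arguments for the next cycle. The helper next_cycle_kwargs is hypothetical, and the keyword names (inits, iter_warmup) follow recent CmdStanPy releases as far as I know; older versions spell some of them differently (e.g. warmup_iters), so check them against your installed version.

```python
def next_cycle_kwargs(init_paths, step_sizes, metric_paths, seed):
    """Collect keyword arguments for resuming sampling without adaptation.

    Each of the first three arguments is assumed to have one entry per chain.
    """
    n_chains = len(init_paths)
    if not (len(step_sizes) == len(metric_paths) == n_chains):
        raise ValueError("need one init, step size and metric file per chain")
    if len(set(metric_paths)) != n_chains:
        raise ValueError("metric file paths must be unique")
    return {
        "chains": n_chains,
        "inits": list(init_paths),      # last draw of each chain
        "step_size": list(step_sizes),  # adapted step size of each chain
        "metric": list(metric_paths),   # files with an 'inv_metric' entry
        "adapt_engaged": False,         # reuse the adaptation, do not redo it
        "iter_warmup": 0,               # no warmup in continuation cycles
        "seed": seed,
    }
```

The result would be used as model.sample(data=..., **kwargs). Comparing the continued draws against one uninterrupted run is a reasonable way to test that it actually works.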

Yes, thanks! I didn’t know about the seed – does it have to be the same seed for all cycles (including warmup)?

I just saw in the example that the seed is increased by 1 in every iteration. What is the reason for this?

It changes the random numbers used.

Not sure if this applies to this use case, but when running multiple chains, the recommended procedure is to use the same seed and let the chain id advance the RNG (cf. the CmdStan User’s Guide, chapter 9, “MCMC Sampling using Hamiltonian Monte Carlo”, section “Running multiple chains with a specified RNG seed”).