Does the init argument work with cmdstanr?

Hi,

I’m trying to use cmdstanr to fit a model and use the init argument, as indicated here: https://mc-stan.org/docs/2_24/cmdstan-guide-2_24.pdf.

The model I’m working on is still in development, so I’d rather not share it, but I can build a minimal example if needed.

Basically, I can initialize scalar parameters but not vector parameters. I’ve tried using Rdump, Json, and passing a list of list. Same behavior. The list I write to Rdump may look as follows:

init <- list(k = 1, p0 = c(0, 1))

and produces a file

k <- 1
p0 <- 
c(0, 1)

k starts at 1, but the other parameters are randomly sampled, it seems…

@rok_cesnovar @jonah

Hey Charles, inits for vector parameters seem to work for me. For example:

library(cmdstanr)

file <- write_stan_tempfile(
  "parameters {
   real alpha;
   vector[2] beta;
  }
  model {
   alpha ~ normal(10, 1);
   beta ~ normal(-10, 1);
  }
  "
)
mod <- cmdstan_model(file)

inits_chain1 <- list(alpha = 15, beta = c(-5, -10))

fit <- mod$sample(chains = 1, init = list(inits_chain1), save_warmup = TRUE)
draws <- fit$draws(inc_warmup=TRUE)
posterior::as_draws_matrix(draws)[1, ]

The first row has the values of my inits:

# A draws_matrix: 1 draws, and 4 variables
    variable
draw lp__ alpha beta[1] beta[2]
   1  -25    15      -5     -10

If you’re seeing different behavior a minimal example demonstrating it would be super helpful.

1 Like

Here’s a minimum example that fails:

file <- write_stan_tempfile(
  "
  parameters {
   real alpha;
   real beta[2];
   real gamma[2];  
  }
  model {
   alpha ~ normal(10, 1);
   beta ~ normal(-10, 1);
   gamma ~ normal(8, 1);
  }
  "
)
mod_test <- cmdstan_model(file)

inits_chain1 <- list(alpha = 15, beta = c(-5, -10),
                     gamme = c(3, 3))

fit <- mod_test$sample(chains = 1, init = list(inits_chain1), 
                       save_warmup = TRUE, seed = 123,
                       data = list(n = 2))

draws <- fit$draws(inc_warmup=TRUE)
posterior::as_draws_matrix(draws)[1, ]

Works with one parameter array, fails with two (i.e. adding gamma).

# A draws_matrix: 1 draws, and 6 variables
    variable
draw lp__ alpha beta[1] beta[2] gamma[1] gamma[2]
   1  -79   7.4     -16    -9.3       15       16

Looks like there’s a typo: gamme should be gamma. Does that fix it or does it still not work?

If it’s still broken I’ll definitely investigate.
If the problem was just the typo then ideally CmdStan would warn that inits are provided for parameters that don’t exist but CmdStan seems to be silent about that (@mitzimorris @rok_cesnovar do you know if this would be easy or hard to do at the CmdStan level)?

the mechanism that CmdStan uses is that the inits are passed in as a stan::var_context - it’s perfectly OK for this object to contain all sorts of variables that aren’t needed by the Stan program.

to tighten that up, you’d need to add logic to CmdStan to interrogate the instantiated model to get the names of all parameter variables (which is doable), and then check that the var_context object contains only those variables (also doable) - but a PITA, and then the CmdStanX interfaces would have to be able to relay that information to the user gracefully.

1 Like

@jonah You’re right there is a typo. Even then, the initial estimate for alpha and beta is off.

Now, here’s another example that fails. Maybe I did another careless mistake, but I can’t spot it.

file <- write_stan_tempfile(
  "
  parameters {
    real<lower = 0> k;
    real q0[2];
    real p0[2];
  }

  model {
    k ~ normal(0, 1);
    q0 ~ normal(0, 1);
    p0 ~ normal(0, 1);
  }
"
)
mod_test <- cmdstan_model(file)

# inits_chain1 <- list(k = 0.05, q0 = c(0, 1), p0 = c(0, 1))
inits_chain1 <- list(k = 0.05, q0 = c(1, 0), p0 = c(0, 1))

fit <- mod_test$sample(chains = 1, init = list(inits_chain1), 
                       save_warmup = TRUE, seed = 123)

draws <- fit$draws(inc_warmup=TRUE)
posterior::as_draws_matrix(draws)[1, ]

which produces

# A draws_matrix: 1 draws, and 6 variables
    variable
draw lp__    k q0[1] q0[2] p0[1] p0[2]
   1 -3.5 0.27  -1.7   0.3  -0.2  -1.1

Thanks Charles! I’ll look into this.

So this is interesting. If I run your example and then look at the JSON inits file generated by CmdStanR it looks right. The call to CmdStan that’s used also looks right.

Here are the arguments that are passed to CmdStan:

> fit$runset$command_args()
[[1]]
 [1] "id=1"                                                                                                       
 [2] "random"                                                                                                     
 [3] "seed=123"                                                                                                   
 [4] "init=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpXMfApH/init-1-a0e9315ac587.json"                  
 [5] "output"                                                                                                     
 [6] "file=/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpXMfApH/filea0e91c8efd8f-202008051615-1-5e3b61.csv"
 [7] "method=sample"                                                                                              
 [8] "save_warmup=1"                                                                                              
 [9] "algorithm=hmc"                                                                                              
[10] "engine=nuts"                                                                                                
[11] "adapt"                                                                                                      
[12] "engaged=1" 

The JSON file written by CmdStanR for the init argument has these contents:

{
  "k": 0.05,
  "q0": [1, 0],
  "p0": [0, 1]
}

which looks correct.

So I have two questions for CmdStan experts @mitzimorris and @rok_cesnovar:

  • Is this the right way the JSON file should be formatted for specifying inits? (I think it is but double checking)

  • Is the first iteration returned by CmdStan (when save_warmup is TRUE) supposed to contain the inits? Or is it the first iteration after the inits? If the latter then maybe nothing is wrong here?

the first iteration is after the inits, so maybe nothing is wrong here.

Thanks Mitzi, that’s good to know!

So I guess in my example in my initial reply (all the way above) where it looked like the first iteration was the inits I was just “lucky” that there wasn’t a change between the inits and the first iteration (or if there was it was in decimal places not show above)?

Has there even been any discussion of CmdStan returning inits too? That would actually be really useful because I guess there’s no way to tell if CmdStan is actually using the specified inits? There’s also no way to see what initial values CmdStan generates for parameters for which the user hasn’t specified inits.

emphasis on maybe -
just found this open bug on CmdStan - report initialization as first draw · Issue #547 · stan-dev/cmdstan · GitHub

I don’t think so. It seems very unlikely that the chain would basically not move. If you change the seed, the result persists. So I’m fairly confident that what you’re printing is indeed the initial estimate.

You can pass the path to a hand-coded JSON or Rdump file, and still get the same error.

What’s more, what gets printed in the failing example I posted is widely different from the init. I doubt such a jump is possible. In the full model, this kind of change means you’d be jumping between modes during the first iteration, but never again during the warmup and the sampling phases.

1 Like

This one does seem werid. The problem is not in cmdstanr, that is sure and it also seems that the init is read and used. If I put in some vastly different value I get different results.

inits_chain1 <- list(k = 100, q0 = c(1, 0), p0 = c(0, 1))

fit <- mod_test$sample(chains = 1, init = list("1" = inits_chain1), 
                       save_warmup = TRUE, seed = 123)

draws <- fit$draws(inc_warmup=TRUE)
posterior::as_draws_matrix(draws)[1, ]

prints

# A draws_matrix: 1 draws, and 6 variables
    variable
draw lp__  k q0[1]  q0[2] p0[1] p0[2]
   1 -439 30     1 -0.014 0.008  0.98

Also no change between 2.23 and 2.24 cmdstan.

hmm…

For some models, I consistently get the specified initial points for the first draw, even when I change the seed. To me that’s a very strong hint the first draw is the initial point.

My wild guess is that in some cases, the initial point is treated as the starting point of the Markov chain (as it should) and at other times as the range of the uniform from which the init gets drawn. So k = 100 causes k \sim \mathrm{uniform}(-100, 100). And indeed, if I change the seed the first draw is different.

The first saved value of the parameters when save_warmup is set to true is the result of the first transition from the initial, not the initial state itself. The initial state is not saved.

Relevant code: https://github.com/stan-dev/stan/blob/develop/src/stan/services/util/generate_transitions.hpp.

5 Likes

The initial values do work after all.

I used @avehtari’s trick to run the model with fixed parameters and the initial values were recovered in all test cases.

In my model, the wild departures from the initial estimates were due to the very high curvature near k = 0, and the fact that k is strongly correlated to p_0 and q_0. At least, that’s my conjecture.

I still do not understand why sometimes the first iteration (after the first transition) is exactly equal to the initial estimates. I observe this behavior in cases where there are no divergent transitions.

4 Likes

that’s a neat trick - given that you can do this, do you think there’s any need to hack a report of initial values into the Stan CSV file’s initial comment section? if not, I’ll close the issue on it: https://github.com/stan-dev/cmdstan/issues/918 - and we can put a note in the CmdStan manual - it’s a good use case for this particular feature.

Definitely a neat trick, but I don’t think we should have to rely on it in order to see the initial values. CmdStan should have a built-in way of informing the user which initial values were used without requiring the user to run with fixed_param separately.

Can CmdStan just write a separate CSV or JSON file with the inits? That seems preferable to hacking this into the comment section, which is already a mess and difficult to work with. I find it very frustrating to have to parse the comments to get meaningful information. Metadata sure, but things like inits and mass matrix estimates aren’t metadata.

1 Like

if you want to know what the full set of initializations chosen for all parameters, you will need to get that information from the stan::services layer - the initialize function takes the user specified values or init radius and creates a stan::io::var_context object with initial values for all model parameters. I don’t think we have a good way of dumping the contents of a var_context object as text - I might be wrong. see https://github.com/stan-dev/stan/blob/ea71828c75ca084fb2e978e57b4d143a39f4a224/src/stan/services/util/initialize.hpp#L69. given this, very much appreciate Aki’s trick.

CmdStan only knows what the user-specified inits are - this information is already reported in the Stan CSV file header comments section.

entirely agree, but the problem needs to be addressed at the stan::io / stan::services layer.

1 Like

Thanks for clarifying @mitzimorris.

In that case, I’d say don’t bother hacking the inits into the CmdStan comments and we can use Aki’s clever hack until this is addressed at the io/services layer. Does that mean the right place to open an issue about this is stan-dev/stan?