Rstan::read_stan_csv throwing error with cmdstan models (versions 2.35)

Hello, I recently updated my cmdstanr and rstan packages after the newly released version of Stan (both now say Stan version 2.35.0). My workflow has been to run a model using cmdstan and then using the rstan::read_stan_csv() function on the sampling output (the result of stan_model$sample) so that I could then use the rstan::extract() function to get an array of the posterior samples for my variables.

rstan::read_stan_csv(model_samples$output_files())

After updating, I now get the following error, which has appeared periodically on other updates of Stan and cmdstanr:

Error in if (max(save_warmup) == 0L) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In parse_stancsv_comments(comments) : NAs introduced by coercion
2: In parse_stancsv_comments(comments) : NAs introduced by coercion
3: In parse_stancsv_comments(comments) : NAs introduced by coercion
4: In parse_stancsv_comments(comments) : NAs introduced by coercion

I had hoped updating rstan from the experimental branch would have resolved this but I still get the error after updating to 2.35.0 in rstan.

Is there a way to resolve this issue? Can I hard-code or tweak the text of a function to resolve the “max_warmup” variable?

My workflow has been to run a model using cmdstan and then using the rstan::read_stan_csv() function on the sampling output (the result of stan_model$sample) so that I could then use the rstan::extract() function to get an array of the posterior samples for my variables.

That seems like an unnecessarily complex path if you just want an array of posterior samples. Is there a reason to prefer this approach over just using cmdstanr to read the samples? For example:

fit$draws(format = "draws_array")

Or one of the many other draws formats

I recognize it’s complicated but stanfit objects give me arrays that are grouped by named variable (vectorized variable?). Not clear on the best nomenclature, but something where I could say posterior$x to get a whole table of x[1], x[2], x[3], … whereas fit$draws(format = “draws_array”) or fit$draws(format = “df”) gives me something where I would have to separately call posterior$x[1], posterior$x[2], posterior$x[3].

Most of my post-processing code is designed around navigating these arrays (e.g., grabbing the sample for a particular specimen or for specimens from a particular analytical grouping).

If there’s a straightforward way to recreate that behavior with a draws_array (or draws_df), I would happily do that! But from what I’ve been looking at, it’d take either a lot of overhauling or writing my own wrapper to recreate that (which is what I’m currently working on)

Yep that’s also trivial to do:

x <- fit$draws(variables="x")

Or:

draws <- fit$draws()

x <- posterior::subset_draws(draws, "x")
1 Like

Ahhh, thank you so much!