Import csv output from cmdstan in R: How to indicate the chain?

Hi everyone,

I ran 2 chains with cmdstan, which generated two output files, output_1.csv and output_2.csv.

I imported these files in R, using:

library(rstan)
files <- c("output_1.csv", "output_2.csv")
out <- read_stan_csv(files)

When I use the traceplot function, it only shows one chain. Apparently, it puts all iterations together in one chain:

rstan::traceplot(out, pars = c("sigma_alphaj", "sigma_alphai"))

How can I make sure R identifies each output file as one chain?

I am not sure how you can do that with rstan’s functions, but this works for cmdstan output and traceplots:

# install.packages("remotes")
remotes::install_github("stan-dev/cmdstanr")

library(cmdstanr)
library(bayesplot)

fit <- as_cmdstan_fit(files)

color_scheme_set("mix-blue-pink")
mcmc_trace(fit$draws(variables = c("alpha", "beta[1]")))

Thanks.
After running
fit <- as_cmdstan_fit(files)

it tells me
Error: Supplied CSV file is corrupt!

read_stan_csv(files) worked.

Any ideas?

That is weird.

Does read_cmdstan_csv(files) also report the same error?

Can you try with just a single file?

read_cmdstan_csv(files[1])

read_cmdstan_csv(files) works without any problems when loading the files. But when using
mcmc_trace(fit$draws(variables = c("alpha", "beta[1]"))), it tells me $ operator not defined for this S4 class.

With both files[1] and files[2], the following tells me the file is corrupt. This is the complete message (translated from German):

fit <- as_cmdstan_fit(files[2])
The command "'[#a-zA-Z]'" is either misspelled or
could not be found.
Error: Supplied CSV file is corrupt!

Oh, if read_cmdstan_csv(files) works, then you can do:

draws <- read_cmdstan_csv(files = files, variables = c("sigma_alphaj", "sigma_alphai"))

color_scheme_set("mix-blue-pink")
mcmc_trace(draws$post_warmup_draws)

Not sure why as_cmdstan_fit would not work though.

Sorry, what works is read_stan_csv but not read_cmdstan_csv. Unfortunately, so far I have no solution…

Thanks for trying to make this work!

Oh, I get it, thanks!

Can you run

cmdstanr::check_cmdstan_toolchain(fix = TRUE)

and then try the first approach above:

fit <- as_cmdstan_fit(files)

color_scheme_set("mix-blue-pink")
mcmc_trace(fit$draws(variables = c("alpha", "beta[1]")))

It now creates the plots, but for some reason it still does not show a second chain, correct?

Any chance the two chains are completely equal? Did you use the exact same seed?
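
One quick way to check this outside R is to compare the two files after dropping the lines that start with "#", since those comment lines hold run metadata and timing that can differ even when the draws are the same. This is only a minimal sketch, assuming a POSIX shell with grep and cksum available; draws_checksum and same_draws are helper names made up for illustration, and matching checksums are strong evidence of identical draws rather than a strict proof:

```shell
#!/bin/sh
# Checksum of a CmdStan CSV file with the '#' comment lines removed,
# so that only the header row and the draws are compared.
draws_checksum() {
  grep -v '^#' "$1" | cksum
}

# Print "identical" if both files contain the same non-comment lines,
# otherwise print "different".
same_draws() {
  if [ "$(draws_checksum "$1")" = "$(draws_checksum "$2")" ]; then
    echo "identical"
  else
    echo "different"
  fi
}

# Usage (file names taken from the original post):
#   same_draws output_1.csv output_2.csv
```

If this reports identical draws, the traceplot is probably drawing both chains exactly on top of each other rather than merging them.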

I think we’re getting closer; this seems to be the case. Sorry that we’ve been working on the wrong thing.

How can I change this part of the batch script to make sure it’s not using the same seed?

for i in {1..2}
do
./logistic_model sample algorithm=hmc engine=nuts max_depth=10 num_samples=2000 data file=log_data.R output file=output_${i}.csv &
done

wait

Thank you so much!

You need to add “random seed=$i” to the command, for example.
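
Applied to the loop from the question, that could look roughly like the sketch below. The model name, data file and sampler settings are copied from the original script. The chain_args helper is a name made up here, and placing "random seed=$i" just before "data file=..." is my assumption about where it fits. The launch is guarded so the script does nothing if the compiled model is not in the current directory:

```shell
#!/bin/sh
# Compiled model from the original script.
MODEL=./logistic_model

# Hypothetical helper: print the argument list for chain $1.
# The chain index is used as the seed, so every chain gets its own seed.
chain_args() {
  echo "sample algorithm=hmc engine=nuts max_depth=10 num_samples=2000 random seed=$1 data file=log_data.R output file=output_$1.csv"
}

if [ -x "$MODEL" ]; then
  for i in 1 2
  do
    # The unquoted expansion is intentional: each word becomes one argument.
    "$MODEL" $(chain_args "$i") &
  done
  # Wait for both background chains to finish.
  wait
fi
```

Running chain_args 2 on its own prints the arguments for the second chain, which is an easy way to confirm that the seed and the output file name change with the chain index before starting a long run.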

Very good, that was very helpful. I marked the solution that answers my initial question. Thanks for your support!
