I’ve been running a model using CmdStan with 2 chains and 2000 iterations each. When I run the summary command (I’m running it ex post, once both chains are done):
cd $HOME/cmdstan-2.26.0/
./bin/stansummary ./jobfiles/model1/output_*.csv
I get the following error message:
Error during processing. add(stan_csv): number of columns in sample does not match chains
terminate called after throwing an instance of 'std::invalid_argument'
what(): add(stan_csv): number of columns in sample does not match chains
I’ve checked the two .csv output files and they have exactly the same structure (number of columns, rows etc.).
Can I fix this ex post by setting a different id in the .csv output files? And how should I avoid it when starting the estimation? I’m asking because other estimations with exactly the same structure didn’t throw that error message.
You can change the id in the CSV files, and stansummary will then be able to read them. Before you do that, though, make sure the seeds are different (the line containing "seed = "). If the seeds are also the same, the two chains should have produced the same results.
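A quick way to compare the ids and seeds without opening the files is to grep the comment header. This is only a sketch: the function name show_id_and_seed is made up for illustration, and it assumes the header lines look like "# id = 1" and "#   seed = 123", which may differ between CmdStan versions.

```shell
# Print the chain id and seed recorded in the comment header of each Stan CSV file.
# Assumption: header lines have the form "# id = ..." and "#   seed = ...".
show_id_and_seed() {
  for f in "$@"; do
    echo "== $f"
    grep -E '^#[[:space:]]*(id|seed)[[:space:]]*=' "$f"
  done
}

# Example, using the path from the question:
# show_id_and_seed ./jobfiles/model1/output_*.csv
```

If both files report the same id but different seeds, editing the id is enough for stansummary to accept them.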
If the seeds are different, you are most likely fine (that is, the pseudorandom numbers generated for the two chains are not correlated), though I would suggest rerunning the experiments if this isn’t a really long experiment. Run the chains with the same seed and different ids. That is the recommended way of using multiple chains. The actual seed used in sampling is the supplied seed with a stride based on the id values.
Example
./bernoulli sample data file=bernoulli.data.json output file=output_1.csv id=1 random seed=123 &
./bernoulli sample data file=bernoulli.data.json output file=output_2.csv id=2 random seed=123 &
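For more than a couple of chains, the same pattern can be wrapped in a loop. This is a minimal sketch: run_chains is a hypothetical helper name, and the data file and output names are simply the ones from the example above.

```shell
# Launch several chains that share one seed and differ only in their id.
# Arguments: path to the model executable, seed, number of chains.
run_chains() {
  exe=$1
  seed=$2
  n_chains=$3
  for i in $(seq 1 "$n_chains"); do
    "$exe" sample data file=bernoulli.data.json \
      output file="output_${i}.csv" id="$i" random seed="$seed" &
  done
  wait  # block until every background chain has finished
}

# Example:
# run_chains ./bernoulli 123 2
```

Note that CmdStan arguments are written as key=value with no spaces around the equals sign.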
Thanks for reporting this. The error message should be improved in this case, as should the documentation example in chapter 4, "MCMC Sampling", of the CmdStan User’s Guide.
Thank you. I will add the ids to the sampling call in the future. I ran this model for ~10 days, so it would be quite helpful to get access to the summary now. I followed your recommendation and changed the ids to 1 and 2 using Excel, and now stansummary throws the following error message:
Warning: non-fatal error reading adaptation data
Error: expected 2731 columns, but found 2656 instead for row 3
Warning: non-fatal error reading samples
Error during processing. No sampling draws found in Stan CSV file: ./jobfiles/model1/output_1.csv.
Processing csv files: ./jobfiles/model1/output_1.csvWarning: non-fatal error reading adaptation data
Error: expected 2731 columns, but found 2656 instead for row 3
Warning: non-fatal error reading samples
, ./jobfiles/model1/output_2.csvWarning: non-fatal error reading adaptation data
Error: expected 2731 columns, but found 2655 instead for row 3
Warning: non-fatal error reading samples
terminate called after throwing an instance of 'std::invalid_argument'
what(): add(stan_csv): number of columns in sample does not match chains
/var/log/slurm/spool_slurmd//job15759076/slurm_script: line 18: 4421 Aborted (core dumped) ./bin/diagnose ./jobfiles/model1/output_*.csv
I have spotted the error. The two jobs have been using the same model, but different datasets. Sorry for taking your time, the id thing is still useful, thank you! :)
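For anyone who lands here with the same "number of columns in sample does not match chains" message: a quick check is to count the columns in the header row of each output file. The sketch below assumes the header is the first line that does not start with "#"; count_columns is a made-up helper name.

```shell
# Print the number of comma-separated columns in the header row of each Stan CSV file.
# Assumption: the header row is the first line that is not a "#" comment.
count_columns() {
  for f in "$@"; do
    n=$(awk -F',' '!/^#/ { print NF; exit }' "$f")
    echo "$f: $n"
  done
}

# Example, using the path from the question:
# count_columns ./jobfiles/model1/output_*.csv
```

If the counts differ, the chains were not produced by the same model and data combination, and stansummary cannot combine them.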