Lightweight interfaces - keeping it light

rok_cesnovar · July 31, 2020, 7:01am

@maedoc is referring to the vroom R package, which we use in cmdstanr to read the CSV files with samples, as opposed to utils::read.csv, which rstan uses in read_stan_csv.

We use it because it's faster in general and because it allows reading only selected columns of the CSV, which avoids wasting memory (and reading only some columns is also faster).
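A minimal sketch of the column-selection idea, assuming the vroom package is installed. The toy file and its column names (lp__, accept_stat__, theta, sigma) are made up for illustration, and this is not necessarily the exact call cmdstanr makes:

```r
# Write a tiny CSV that stands in for a Stan output file (hypothetical columns).
path <- tempfile(fileext = ".csv")
writeLines(c(
  "lp__,accept_stat__,theta,sigma",
  "-7.1,0.98,0.25,1.10",
  "-6.9,0.91,0.31,0.95",
  "-7.4,0.87,0.19,1.02"
), path)

# Read only two of the four columns; the others are never materialized in R.
draws <- vroom::vroom(
  path,
  delim = ",",
  col_select = c("lp__", "theta"),
  col_types = vroom::cols(.default = vroom::col_double()),
  altrep = FALSE,   # read eagerly rather than lazily
  progress = FALSE
)

print(names(draws))
```

The result should contain just the two requested columns, which is where the memory saving comes from when a model has thousands of parameters.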

At the time of the PR we ran tests (develop = utils::read.csv, PR = vroom):

branch \ num of param	19	643	1283	1923	2563	3203
develop	0.3489878	5.9398425	10.4866965	16.3929579	21.5311923	27.2760901
PR	0.2396772	1.7050161	3.2770655	4.4713771	6.0561974	6.7868321
PR - read 50% of parameters	0.1603785	1.1780379	2.0339272	2.9903495	3.5180790	4.1634433
PR - read 2 columns (validation)	0.1399992	0.3635323	0.6058638	0.8312213	1.1903284	1.3976476
PR - read 1 parameter	0.1341414	0.3551099	0.6020284	0.8367260	1.2255943	1.3311992

2000 samples per parameter in all cases. The unit is seconds.
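To make the comparison easier to read, the speedup of the PR over develop can be computed directly from the first two rows of the table. This is just arithmetic on the numbers above; the variable names are only for illustration:

```r
# Timings in seconds, copied from the table above.
n_params <- c(19, 643, 1283, 1923, 2563, 3203)
develop  <- c(0.3489878, 5.9398425, 10.4866965, 16.3929579, 21.5311923, 27.2760901)
pr       <- c(0.2396772, 1.7050161, 3.2770655, 4.4713771, 6.0561974, 6.7868321)

# Ratio > 1 means the vroom-based PR is faster than utils::read.csv.
speedup <- develop / pr
print(data.frame(n_params = n_params, speedup = round(speedup, 2)))
```

The ratio is about 1.5x for the smallest model and grows to roughly 4x for the largest one.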

Alternatives in R that are as fast as vroom, or in some cases faster, are readr and data.table's fread. However, those two struggle with the format of the Stan CSV, specifically with the comments inside the CSV table (where we print the step size and inverse metric after adaptation ends). The metadata before the column names and the timing printed after the samples are not problematic. By default vroom also has problems with these comments, but it has some options that let us get around this.

column names
# adaptation ended
# stepsize
# inv metric
1
2
3
4
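The layout above can be reproduced with a toy file to show why the mid-table comments matter. The sketch below uses only base R and simply drops every line starting with `#` before parsing. It is one possible workaround under that assumption, not the set of vroom options cmdstanr relies on, and the file contents are invented:

```r
# Toy file mimicking the Stan CSV layout: metadata, header, comments
# in the middle of the table, draws, and a trailing timing comment.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "# model = toy_model",
  "lp__,accept_stat__,theta",
  "# adaptation ended",
  "# stepsize",
  "# inv metric",
  "-7.1,0.98,0.25",
  "-6.9,0.91,0.31",
  "# elapsed time"
), path)

# Keep only the header and the draws by filtering out comment lines.
raw <- readLines(path)
data_lines <- raw[!grepl("^#", raw)]

# Parse the remaining lines as an ordinary CSV.
draws <- utils::read.csv(text = paste(data_lines, collapse = "\n"),
                         check.names = FALSE)
print(draws)
```

Filtering first costs an extra pass over the file, so it trades some speed for a reader that does not need to understand comments at all.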

vroom also enables reading the CSV lazily, but we are not using that at the moment, as you can't delete or move the CSV files once they have been read in lazily. Doing that requires an R session restart.
