Hello everybody.

I noticed that for big models cmdstanr seemed a lot slower at summarizing than rstan, so I ran a test on a dummy model with `N = 100`, 4 chains, and 4000 warmup and 4000 sampling iterations, and saved the results in .csv files.

```
data {
  int<lower=0> N;
}
parameters {
  vector[N] big_mat[N];
}
model {
  for (n in 1:N)
    big_mat[n] ~ std_normal();
}
```
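For completeness, this is roughly how I generated the CSV files (the model file name and output directory are placeholders):

```r
library(cmdstanr)

# Compile the dummy model (file name is a placeholder)
mod <- cmdstan_model("big_mat.stan")

# 4 chains, 4000 warmup + 4000 sampling iterations, N = 100
fit <- mod$sample(
  data = list(N = 100),
  chains = 4,
  iter_warmup = 4000,
  iter_sampling = 4000
)

# Copy the CmdStan output CSVs into the working directory
fit$save_output_files(dir = ".")
```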

It seems that the `posterior::summarise_draws` function is very slow.

I ran the following comparisons:

```
## Read-ins
> system.time({stanfit_by_csvs <- read_stan_csv(grep("\\.csv", dir(), value = TRUE))})
user system elapsed
70.012 0.583 70.583
> system.time({cmdstan_fit_by_csvs <- as_cmdstan_fit(grep("\\.csv", dir(), value = TRUE))})
user system elapsed
14.876 3.111 15.497
> system.time({read_cmdstan_csv_input <- read_cmdstan_csv(grep("\\.csv", dir(), value = TRUE))})
user system elapsed
15.340 2.615 14.551
```

Rstan is slower to read in the files and create a fit object, but for the summary it's much faster:

```
## Summaries
> system.time({rstan_summary <- rstan::summary(stanfit_by_csvs)})
user system elapsed
56.724 0.134 58.782
> system.time({cmdstanr_summary <- cmdstan_fit_by_csvs$summary()})
user system elapsed
232.343 0.158 232.512
> system.time({posterior_summarise_draws <- posterior::summarise_draws(read_cmdstan_csv_input$post_warmup_draws)})
user system elapsed
231.480 0.064 231.503
```

I made a "manual" version and it's a bit faster:

```
> system.time({manual_summary <- posterior::as_draws_df(read_cmdstan_csv_input$post_warmup_draws) %>%
+   pivot_longer(-c(.iteration, .chain, .draw), names_to = "variable") %>%
+   group_by(variable) %>%
+   summarize(
+     across(
+       value,
+       list(
+         mean = mean,
+         median = median,
+         sd = sd,
+         mad = mad,
+         q5 = function(x) quantile(x, 0.05),
+         q95 = function(x) quantile(x, 0.95),
+         rhat = posterior::rhat,
+         ess_bulk = posterior::ess_bulk,
+         ess_tail = posterior::ess_tail
+       ),
+       .names = "{.fn}"
+     )
+   )})
user system elapsed
159.625 1.732 161.333
```
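One thing I noticed while writing the manual version: inside `summarize()`, each measure only sees `value` as a plain numeric vector, so `posterior::rhat`/`ess_bulk`/`ess_tail` presumably can't recover the chain layout. A chain-aware variant (just a sketch, not timed) would work on the draws array directly, one iterations × chains matrix per variable:

```r
library(posterior)

# read_cmdstan_csv() returns the post-warmup draws as a draws_array
# with dimensions iterations x chains x variables.
draws <- read_cmdstan_csv_input$post_warmup_draws

# Apply ess_bulk() to each variable's iterations x chains matrix,
# so the chain structure is preserved.
ess_chain_aware <- apply(draws, 3, posterior::ess_bulk)
```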

So I'm wondering:

- What's going on?
- How can rstan be so much faster?
- Also: when rstan calculates the summaries, it (to me) magically stores the result somewhere and all subsequent calculations (including diagnostics) are very fast. How does that work, and why can't cmdstanr do that?
- Is there a way to speed up the manual cmdstanr version (for example, when I don't want to work with rstan for some reason)?
- Basically, what is the fastest way to get the summary statistics for a set of parameters of interest AND ALSO the sampler diagnostics?
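On the last point, the only workaround I can think of is restricting the set of measures, since I suspect most of the time goes into `rhat`/`ess_bulk`/`ess_tail`. A sketch of what I mean (I haven't timed this carefully; both `$summary()` and `summarise_draws()` should forward extra arguments as measure functions):

```r
library(cmdstanr)
library(posterior)

# Only compute cheap measures; extra arguments are passed on
# to posterior::summarise_draws() as the measures to compute.
cheap <- cmdstan_fit_by_csvs$summary(NULL, "mean", "median", "sd")

# Same idea directly on the draws object, with custom quantiles:
cheap2 <- summarise_draws(
  read_cmdstan_csv_input$post_warmup_draws,
  mean, sd,
  ~quantile(.x, probs = c(0.05, 0.95))
)

# Sampler diagnostics (divergences, treedepth, E-BFMI) are a
# separate call, if your cmdstanr version has it:
diag <- cmdstan_fit_by_csvs$diagnostic_summary()
```

(I believe newer posterior versions also have a `.cores` argument to `summarise_draws()`, which might help here too.)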

**One more strange thing:**

All versions seem to differ in the calculated ESS (I didn't include the `cmdstanr_summary` version here since it's the same as the `posterior_summarise_draws` one). What is going on there??

```
> manual_summary %>% arrange(match(variable, posterior_summarise_draws$variable)) %>% select(variable, ess_bulk, ess_tail) %>% print(n = 5)
# A tibble: 10,001 × 3
variable ess_bulk ess_tail
<chr> <dbl> <dbl>
1 lp__ 4523. 6340.
2 big_mat[1,1] 38982. 10783.
3 big_mat[2,1] 37777. 10699.
4 big_mat[3,1] 36204. 10589.
5 big_mat[4,1] 36468. 10457.
# … with 9,996 more rows
> posterior_summarise_draws %>% select(variable, ess_bulk, ess_tail) %>% print(n = 5)
# A tibble: 10,001 × 3
variable ess_bulk ess_tail
<chr> <dbl> <dbl>
1 lp__ 4534. 6347.
2 big_mat[1,1] 39649. 10829.
3 big_mat[2,1] 38682. 10710.
4 big_mat[3,1] 36529. 10628.
5 big_mat[4,1] 36782. 10448.
# … with 9,996 more rows
> rstan_summary$summary %>% as_tibble(rownames = "variable") %>% arrange(match(variable, posterior_summarise_draws$variable)) %>% select(variable, n_eff) %>% print(n = 5)
# A tibble: 10,001 × 2
variable n_eff
<chr> <dbl>
1 lp__ 4516.
2 big_mat[1,1] 39265.
3 big_mat[2,1] 38295.
4 big_mat[3,1] 36844.
5 big_mat[4,1] 36501.
# … with 9,996 more rows
```