# Summarize draws from specific chains

Hi,

When summarizing model output (e.g., calling `print` on an rstan object), is there a way to summarize draws from specific chains only, i.e. to exclude certain chains from the summary and from the calculation of n_eff and R-hat?

I don’t think it’s natively supported in `rstan`, but you can do it via the `subset_draws` function in the `posterior` package: https://github.com/stan-dev/posterior

``````
library(posterior)

# Convert the rstan object to a draws object
fit_draws <- as_draws(fit)

# Extract the first two chains
draws_12 <- subset_draws(fit_draws, chain = 1:2)

# Summarise the first two chains
summarise_draws(draws_12)
``````

Or in one line:

``````
summarise_draws(subset_draws(as_draws(fit), chain = 1:2))
``````
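
Since the question was specifically about n_eff and R-hat, here is a minimal sketch that restricts the summary to convergence diagnostics. It uses `example_draws()`, the example data shipped with `posterior`, as a stand-in for `as_draws(fit)`; that substitution is an assumption made only so the snippet runs without a fitted model. Note that `posterior` reports bulk and tail ESS rather than a single n_eff.

```r
library(posterior)

# Stand-in for as_draws(fit): example draws bundled with posterior
# (assumption for illustration; swap in your own draws object)
fit_draws <- example_draws()

# Keep only chains 1 and 2
draws_12 <- subset_draws(fit_draws, chain = 1:2)

# Report only R-hat and the effective sample sizes for the kept chains
conv_12 <- summarise_draws(draws_12, "rhat", "ess_bulk", "ess_tail")
print(conv_12)
```

Passing the measure names as strings keeps the output to one row per variable and one column per diagnostic.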

@jonah Is this the most efficient way of summarising a subset of chains from a `stanfit` object with `posterior`? I'm still getting used to the package.

Thanks!

Is there a way to quickly extract specific parameters? E.g., in rstan you can simply specify the `pars` argument.

Yep, using the `variable` argument of `subset_draws()`:

``````
library(posterior)

# Convert the rstan object to a draws object
fit_draws <- as_draws(fit)

# Extract the first two chains, and only the "mu" & "tau" parameters
draws_12 <- subset_draws(fit_draws, chain = 1:2, variable = c("mu", "tau"))

# Summarise "mu" and "tau" from the first two chains
summarise_draws(draws_12)
``````

Sorry for the slow reply! Yeah I think using `subset_draws()` is the way to go. Alternatively you could first just get the subset of chains from rstan using `as.array(fit)[,1:2,]` and then pass that to posterior, which saves you having to call `subset_draws()`. But `subset_draws()` shouldn’t be super inefficient or anything.
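
A rough sketch of that array route, in case it helps. To keep it runnable without a fitted model, the array here is built from `example_draws()` as a stand-in for `as.array(fit)`; that substitution is an assumption for illustration. Adding `drop = FALSE` is a small precaution so the result stays three-dimensional even when a single chain is selected.

```r
library(posterior)

# Stand-in for as.array(fit): a plain iterations x chains x parameters array
# (assumption for illustration; with rstan you would use as.array(fit))
sims <- unclass(example_draws())

# Keep chains 1 and 2; drop = FALSE preserves all three dimensions
sims_12 <- sims[, 1:2, , drop = FALSE]

# Hand the 3-D array to posterior and summarise it
draws_12 <- as_draws_array(sims_12)
summarise_draws(draws_12)
```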

I am trying to focus on a subset of chains, as the OP was doing, and I am getting this error message:

``````
> stan_mcmc_fpp_123 <- subset_draws( as_draws(stan_data_fpp), chain=1:3)
Error: All variables in all chains must have the same length.
``````

(Chain 4 did not get anywhere close to mixing with the others: it is just a straight line on the traceplots, and judging from the slope, it would take at least 20,000 draws to get there. All of its draws were divergent after the warmup, so maybe `rstan::sampling()` did not really save that much. The successful chains took ~4 hours each to run, so at the moment I kinda want to make do with chains 1:3.)

I’d second the opinion that it is worth adding `chains=numeric_vector` to methods like `stanfit.summary`, but I understand it is a lot of refactoring.
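
Until something like a `chains` argument exists, a small wrapper can approximate it. The sketch below defines a hypothetical helper, `summarise_chains()` (not part of rstan or posterior), that works on the iterations x chains x parameters array that `as.array(fit)` returns. The simulated array with a shifted fourth chain is made up purely for illustration, and whether this route avoids the error above is untested.

```r
library(posterior)

# Hypothetical helper (not part of rstan or posterior): summarise only the
# requested chains of an iterations x chains x parameters array
summarise_chains <- function(sims, chains, ...) {
  stopifnot(length(dim(sims)) == 3)
  stopifnot(all(chains %in% seq_len(dim(sims)[2])))
  kept <- sims[, chains, , drop = FALSE]
  summarise_draws(as_draws_array(kept), ...)
}

# Simulated stand-in for as.array(fit): 100 iterations, 4 chains, 2 parameters
set.seed(1)
sims <- array(
  rnorm(100 * 4 * 2),
  dim = c(100, 4, 2),
  dimnames = list(NULL, NULL, c("mu", "tau"))
)

# Shift chain 4 away from the others to mimic a chain that never mixed
sims[, 4, ] <- sims[, 4, ] + 10

res_all <- summarise_chains(sims, chains = 1:4)
res_123 <- summarise_chains(sims, chains = 1:3)
print(res_123)
```

Comparing `res_all$rhat` with `res_123$rhat` shows how much the stuck chain inflates R-hat when it is left in.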