Hi,
When summarizing model output (e.g., using “print” on an rstan object), is there a way to only summarize draws from specific chains / exclude specific chains from the summary & calculation of n_eff and r-hat?
I don’t think it’s natively supported in rstan, but you can do it via the subset_draws function in the posterior package: https://github.com/stan-dev/posterior
library(posterior)
# Convert rstan object to array of samples
fit_draws = as_draws(fit)
# Extract first two chains
draws_12 = subset_draws(fit_draws,chain=1:2)
# Summarise first two chains
summarise_draws(draws_12)
Or in one line:
summarise_draws(subset_draws(as_draws(fit),chain=1:2))
@jonah Is this the most efficient way of summarising a subset of chains from a stanfit object with posterior? Still getting used to the package
Thanks!
Is there a way to quickly extract specific parameters? E.g., in rstan you can simply specify the “pars” argument.
Yep, using the variable argument of the subset_draws function:
library(posterior)
# Convert rstan object to array of samples
fit_draws = as_draws(fit)
# Extract first two chains, and only the "mu" & "tau" parameters
draws_12 = subset_draws(fit_draws,chain=1:2,variable=c("mu","tau"))
# Summarise first two chains
summarise_draws(draws_12)
Sorry for the slow reply! Yeah I think using subset_draws() is the way to go. Alternatively you could first just get the subset of chains from rstan using as.array(fit)[,1:2,] and then pass that to posterior, which saves you having to call subset_draws(). But subset_draws() shouldn’t be super inefficient or anything.
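For reference, a minimal sketch of the as.array() route. It uses a simulated array as a stand-in for as.array(fit), so the dimensions (100 iterations, 4 chains) and the parameter names "mu" and "tau" are assumptions made for illustration only; with a real stanfit object you would start from as.array(fit) instead.

```r
library(posterior)

# Stand-in for as.array(fit): iterations x chains x parameters.
# Sizes and parameter names are made up for illustration.
set.seed(1)
arr <- array(
  rnorm(100 * 4 * 2),
  dim = c(100, 4, 2),
  dimnames = list(NULL, NULL, c("mu", "tau"))
)

# Keep chains 1 and 2; drop = FALSE keeps the array 3-D
# even if only one chain or one parameter is selected
arr_12 <- arr[, 1:2, , drop = FALSE]

# Convert to a draws_array and summarise (includes rhat and ESS columns)
draws_12 <- as_draws_array(arr_12)
summarise_draws(draws_12)
```

The drop = FALSE is there because as_draws_array() expects a 3-D array; without it, selecting a single chain would collapse the chain dimension.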
I am trying to focus on a subset of chains, as the OP was doing, and I am getting this error message:
> stan_mcmc_fpp_123 <- subset_draws( as_draws(stan_data_fpp), chain=1:3)
Error: All variables in all chains must have the same length.
(Chain 4 did not get anywhere close to mixing with the others; it’s just a straight line on the traceplots, and judging from the slope, it would take at least 20,000 draws to get there. All of its draws were divergent after the warmup, so maybe rstan::sampling() did not really save that much. The successful chains took ~4 hours each to run, so at the moment I kinda want to make do with chains 1:3.)
I’d second the opinion that it is worth adding a chains=numeric_vector argument to methods like stanfit.summary, but I understand it is a lot of refactoring.