How can I change the way estimates are printed by fit$draws()

fsdias · February 21, 2022, 9:41am

Hi,

I’m working with the following model:

data{
int N;
vector[N] obs_times;
vector[N] t;
}
parameters{
real<lower=0> lambda;
}
model{
obs_times~exponential(lambda);
lambda~normal(0,5);
}
generated quantities{
  //Draw survival function
  vector[N] surv;
  for(i in 1:N){
    surv[i]=exp(-lambda*t[i]);
  }
}

which I run as follows:

library(cmdstanr)
dat<-list(N=100,obs_times=data$x,t=seq(0,1.5,length=100))
mod<-cmdstan_model("surv_exp.stan")
fit<-mod$sample(data=dat,
                chains=4,
                parallel_chains=4)

I would like to extract “surv” values to make a plot. However, I found that “surv” values are not printed in the right order:

> fit$draws("surv",format="list")
# A draws_list: 1000 iterations, 4 chains, and 100 variables

[chain = 1]
$`surv[1]`
 [1] 1 1 1 1 1 1 1 1 1 1

$`surv[2]`
 [1] 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96

$`surv[3]`
 [1] 0.92 0.92 0.93 0.93 0.91 0.92 0.92 0.92 0.92 0.92

$`surv[4]`
 [1] 0.88 0.89 0.89 0.89 0.87 0.88 0.88 0.88 0.88 0.88

I would expect to the see the values ordered like this: 1, 0.96, 0.92, 0.88…

If I convert the cmdstanr object into an rstan object and then extract “surv” values, then the values are printed in the right order:

library(rstan)
stanfit <- rstan::read_stan_csv(fit$output_files())
post<-extract(stanfit)

post$surv[1:2,]
          
iterations [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]
      [1,]    1 0.960775 0.923088 0.886879 0.852091 0.818668 0.786555 0.755702 0.726060
      [2,]    1 0.951013 0.904426 0.860121 0.817987 0.777916 0.739809 0.703568 0.669102
          
iterations    [,10]    [,11]    [,12]    [,13]    [,14]    [,15]    [,16]    [,17]    [,18]
      [1,] 0.697580 0.670217 0.643927 0.618669 0.594402 0.571086 0.548685 0.527163 0.506484
      [2,] 0.636325 0.605154 0.575509 0.547317 0.520505 0.495008 0.470759 0.447698 0.425767

How can I change the way the values are printed?
data_surv.csv (1.8 KB)

rok_cesnovar · February 21, 2022, 11:34am

Something similar to the extract rstan capability is available through the draws_rvars functionality of the posterior R package.

Example:

library(cmdstanr)

fit <- cmdstanr_example()

a <- fit$draws(format = "list")

b <- posterior::as_draws_rvars(a)

in the example there are 3 beta parameters:

> posterior::subset_draws(a, variable = "beta")
# A draws_array: 1000 iterations, 4 chains, and 3 variables
, , variable = beta[1]

         chain
iteration     1     2     3     4
        1 -0.17 -0.52 -0.75 -1.01
        2 -0.73 -0.74 -1.00 -0.78
        3 -0.45 -0.67 -1.13 -0.72
        4 -0.71 -0.37 -0.44 -0.53
        5 -0.65 -0.41 -1.00 -1.16

, , variable = beta[2]

         chain
iteration       1      2     3      4
        1 -0.2524 -0.093 -0.30 -0.513
        2 -0.5772 -0.369 -0.40 -0.359
        3 -0.3713 -0.298 -0.38 -0.376
        4 -0.4515 -0.269 -0.37 -0.083
        5  0.0029  0.094 -0.29 -0.273

, , variable = beta[3]

         chain
iteration    1    2    3    4
        1 0.40 0.50 0.62 1.15
        2 1.20 0.61 0.32 0.85
        3 0.96 0.31 0.72 0.62
        4 0.92 0.79 1.02 0.96
        5 0.66 0.43 0.63 0.35

# ... with 995 more iterations

> b$beta
rvar<1000,4>[3] mean ± sd:
[1] -0.67 ± 0.25  -0.28 ± 0.23   0.68 ± 0.27

Also see rvar: The Random Variable Datatype • posterior

fsdias · February 21, 2022, 8:01pm

I tried it:

post<-fit$draws(“surv”,format=“list”)
a ← posterior::as_draws_rvars(post)
posterior::as_draws_rvars(a)

A draws_rvars: 1000 iterations, 4 chains, and 1 variables
$surv: rvar<1000,4>[100] mean ± sd:
[1] 1.000 ± 0.0000 0.955 ± 0.0044 0.913 ± 0.0084 0.872 ± 0.0120 0.833 ± 0.0152
[6] 0.796 ± 0.0182 0.760 ± 0.0209 0.726 ± 0.0232 0.694 ± 0.0254 0.663 ± 0.0272
[11] 0.633 ± 0.0289 0.605 ± 0.0304 0.578 ± 0.0317 0.552 ± 0.0327 0.528 ± 0.0337
[16] 0.504 ± 0.0345 0.482 ± 0.0351 0.460 ± 0.0357 0.440 ± 0.0361 0.420 ± 0.0364
[21] 0.402 ± 0.0366 0.384 ± 0.0367 0.367 ± 0.0367 0.351 ± 0.0367 0.335 ± 0.0366

but I couldn’t find a way of not getting the summarized estimates (ie 0.955 ± 0.0044). For my workflow I would need to be able to access individual “surv” objects like this:

$surv[2]
[1] 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96

$surv[3]
[1] 0.92 0.92 0.93 0.93 0.91 0.92 0.92 0.92 0.92 0.92

$surv[4]
[1] 0.88 0.89 0.89 0.89 0.87 0.88 0.88 0.88 0.88 0.88

rok_cesnovar · February 21, 2022, 8:15pm

I believe

posterior::draws_of(a$surv[2])

returns you that.

fsdias · February 21, 2022, 9:14pm

post<-fit$draws(format="list")
a<-posterior::as_draws_rvars(post)
posterior::draws_of(a$surv[2])

Returns:

2    0.947864
3    0.956244
4    0.950469
5    0.957255
6    0.951365

The output is still written out of order. Did I write the model in a way that is not expected?

ahartikainen · February 22, 2022, 4:54am

Quick question. Do you want ‘all draws for one surv’ item or ‘one draw’ with all surv items?

fsdias · February 22, 2022, 8:32am

Each surv[i] contains values from a survival curve. I would like to obtain the values in the following order:

0.960775 0.923088 0.886879 0.852091 0.818668 0.786555 0.755702 0.726060

Right now cmdstanr is printing the values like this:

$surv[2]
[1] 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96

$surv[3]
[1] 0.92 0.92 0.93 0.93 0.91 0.92 0.92 0.92 0.92 0.92

$surv[4]
[1] 0.88 0.89 0.89 0.89 0.87 0.88 0.88 0.88 0.88 0.88

So I think the right answer is “one draw with all surv items”.

avehtari · February 22, 2022, 8:41am

This case study has examples of obtaining expectation of the curve or draws from the posterior of the curve Gaussian process demonstration with Stan. It seems you are looking for something similar. See also Using tidybayes with the posterior package • tidybayes for approach combined with tidybayes plotting.

rok_cesnovar · February 22, 2022, 11:58am

I wrote this Add rstan::extract()-like feature to fit$draws() · Issue #632 · stan-dev/cmdstanr · GitHub, but pasting it here as well so its all in once place:

library(cmdstanr)

fit <- cmdstanr_example()

library(rstan)
stanfit <- rstan::read_stan_csv(fit$output_files())
post<-extract(stanfit)

post$beta[1:2,]

will print:

> post$beta[1:2,]
          
iterations      [,1]      [,2]     [,3]
      [1,] -0.314607 -0.345438 0.913696
      [2,] -0.630833 -0.188254 0.426538

You can get the exact same thing by running:

a <- posterior::as_draws_rvars(fit$draws())

posterior::draws_of(a$beta)[1:2,]

and get

> posterior::draws_of(a$beta)[1:2,]
       [,1]      [,2]    [,3]
1 -0.508033 -0.731428 1.07239
2 -0.544134 -0.641063 1.05044

The only difference is that rstan::extract permutes the samples (I don’t have enough background in rstan to know why is that so, I am sure there is a valid reason). So running

library(rstan)
stanfit <- rstan::read_stan_csv(fit$output_files())
post<-extract(stanfit)

post$beta[1:2,]

will print different results every time for the same CSV files.

avehtari · February 22, 2022, 3:10pm

Historical reasons, making a wrong solution to a valid problem, and it has been much regretted choice for a long time. It’s just better to use posterior package for everything and if it doesn’t do something you like, ask if that could be added there. There are already a lot of useful functions for working with dras that are easier than trying to figure out indexing of the draws object directly Function reference • posterior

fsdias · February 23, 2022, 9:49am

Thanks @rok_cesnovar @avehtari . This piece of code answers my question:

a <- posterior::as_draws_rvars(fit$draws())
posterior::draws_of(a$surv)[1:2,]

Topic		Replies	Views
Naming help on function in CmdStanR and CmdStanPy to get variable from sample Interfaces posterior-package	25	2135	June 9, 2020
Replacement for permuted=TRUE (RStan 3 / PyStan 3) Developers	38	2466	July 18, 2017
[Interface roadmap] fit objects and `extract` Developers	44	2474	September 17, 2019
How to get draws from cmdstanpy.stanfit.CmdStanMCMC? CmdStan cmdstanpy	5	1054	April 18, 2021
Posterior draws objects => recover original array data structure? Interfaces	13	1016	August 20, 2021

How can I change the way estimates are printed by fit$draws()

Related topics