CmdStan 2.24 release candidate now available

The Stan RC is essentially defined by the Stan git hash that the CmdStan tag pins, so it’s implicitly there.
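In other words, you can read the pinned Stan hash straight out of a CmdStan checkout. A quick sketch (the tag name is taken from this RC):

# Hedged sketch: inspect which Stan commit a CmdStan tag pins.
git clone --recursive https://github.com/stan-dev/cmdstan.git
cd cmdstan
git checkout v2.24.0-rc1
git submodule update --init
git submodule status stan   # prints the pinned Stan commit hash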

Yes, PyStan 3/httpstan would use a tagged release. In our build process we download a release tarball (just like one downloads a stanc binary).

To give you a sense of how we use the tagged releases, here are two lines from our Makefile:

curl --silent --location https://github.com/stan-dev/stan/archive/v$(STAN_VERSION).tar.gz -o $@

curl --silent --location https://github.com/stan-dev/math/archive/v$(MATH_VERSION).tar.gz -o $@

OK, let me know if you would already benefit from having a Stan & Math RC now and would use it this week. If so, I can ask @serban-nicusor to build one when he has the time. Otherwise, I have made a note to build an RC for Stan & Math in the next release cycle.


@rok_cesnovar @jonah how do you feel about -Wno-ignored-attributes in cmdstanr until the Eigen issue goes away?

I guess so far it’s only me suffering from that issue, but man, there is so much warning output that it bogs down my computer.
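For anyone else hitting this in the meantime, here is a manual workaround sketch (the CmdStan path is an assumption; point it at your own install):

# Hedged sketch: append the flag to CmdStan's make/local and rebuild,
# so the Eigen attribute warnings are silenced for every model build.
cd ~/.cmdstanr/cmdstan   # assumption: adjust to your CmdStan install
echo "CXXFLAGS += -Wno-ignored-attributes" >> make/local
make clean-all && make build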


You are far from the only one suffering from that output. It’s annoying to me as well.

I am going to open a PR in Math to make that flag the default. It’s a problem for anyone with g++ 8 or a recent clang.

I just need to double-check that it doesn’t cause a problem with old compilers that might not have that flag (they probably do; it’s just worth verifying). I will spin up an AWS instance to install the old compilers.
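One cheap way to probe whether a compiler knows the flag, sketched with a generic g++ (substitute whichever old compiler you are testing):

# Hedged sketch: g++ silently accepts unknown -Wno-* flags, so probe the
# positive form instead; an unknown -W option is rejected outright.
echo 'int main() { return 0; }' \
  | g++ -Wignored-attributes -x c++ - -o /dev/null \
  && echo "flag supported" || echo "flag not supported"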


Thanks! Will the docs also update the current/old HMM model example to show how to use the new functions?

Yack… it looks like we have a performance regression bug in 2.24. I just ran the model from my OncoBayes2 package with 4 chains, 5000 warmup and 5000 sampling iterations. The 2.23 version is about 20% faster (its run takes roughly 82% of the 2.24 wall time). See:

> old_time
   user  system elapsed 
173.423   0.812 174.849 
> new_time
   user  system elapsed 
212.229   1.042 214.151 
> 173 / 212
[1] 0.8160377
> print(fit_new, pars="lp__")
Inference for Stan model: chain-1-2342.
4 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=20000.

       mean se_mean   sd  2.5%    25%    50%    75%  97.5% n_eff Rhat
lp__ -71.92    0.08 7.19 -87.1 -76.51 -71.61 -66.86 -58.82  7740    1

Samples were drawn using NUTS(diag_e) at Thu Jul 23 16:19:14 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
> print(fit_old, pars="lp__")
Inference for Stan model: chain-1-2342.
4 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=20000.

      mean se_mean   sd  2.5%    25%    50%    75%  97.5% n_eff Rhat
lp__ -71.8    0.08 7.17 -86.8 -76.42 -71.51 -66.79 -58.75  8271    1

Samples were drawn using NUTS(diag_e) at Thu Jul 23 16:22:30 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
> 

The R snippet for this:

library(OncoBayes2)
library(rstan)

## tools.R provides the cmdstan() wrapper used below
source("/Users/weberse2/work/stan_pkpd/utils/tools.R")

## fit the combo3 example to obtain blrmfit, a fitted blrm_exnex model
example("example-combo3")

names(blrmfit)

## write the same model out twice so that each CmdStan version
## compiles and runs its own copy
cat(get_stancode(blrmfit$stanfit), "\n", file = "blrm_exnex-24.stan")
cat(get_stancode(blrmfit$stanfit), "\n", file = "blrm_exnex-23.stan")

args(cmdstan)

## default CmdStan used by cmdstan(): the 2.24 RC
options(cmdstan_home = "/Users/weberse2/work/cmdstan-v2.24.0-rc1")

samples <- 5000
warmup <- 5000

new_time <- system.time(
  fit_new <- cmdstan("blrm_exnex-24.stan", data = blrmfit$standata,
                     num_warmup = warmup, num_samples = samples,
                     seed = 2342, cores = 1, chains = 4)
)
new_time
## 54s

## same model and seed, but compiled and run with CmdStan 2.23.0
old_time <- system.time(
  fit_old <- cmdstan("blrm_exnex-23.stan", data = blrmfit$standata,
                     num_warmup = warmup, num_samples = samples,
                     seed = 2342, cores = 1, chains = 4,
                     cmdstan = "/Users/weberse2/work/cmdstan-2.23.0")
)
## 35s

old_time
new_time

print(fit_new, pars = "lp__")
print(fit_old, pars = "lp__")

The tools.R script is attached: tools.R (9.3 KB)

I’m guessing ODEs are the most likely culprit?

No. This is a varying-intercept, varying-slope logistic regression formulated as a mixture model. No ODE stuff at all.

Phew. Alright I can join the debugging effort on this one after Stan meeting n’ such.

Yes, this one should be resolved, and I am curious where the problem is now hiding…

For completeness, here is my make/local:

CXX=clang++
CC=clang
STAN_THREADS=true
CXXFLAGS+=-march=native -mtune=native -ftemplate-depth-256
CXXFLAGS+=-DBOOST_MATH_PROMOTE_DOUBLE_POLICY=false
CXXFLAGS+=-Wno-unused-variable -Wno-unused-function -Wno-unused-local-typedefs

and this is all on macOS Catalina.

I will try to run this with my bisect script; hopefully it finds something specific. Nothing jumps out so far.
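For the curious, the bisect boils down to something like the following sketch. It is not the actual script: run_benchmark.sh is a hypothetical helper that exits non-zero when the model runs slow, and the good/bad endpoints are placeholders.

# Hedged sketch: bisect the regression over Stan Math commits.
cd math
git bisect start
git bisect bad develop             # known-slow end
git bisect good v3.2.0             # known-fast end (placeholder tag)
git bisect run ../run_benchmark.sh # rebuild + time the model per commit
git bisect reset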


Were you able to reproduce my finding with the example given?

Not yet; I am starting to look at this now.

@wds15 what is blrmfit in this example? I am a bit lost trying to replicate. I have OncoBayes2 installed. Maybe we can start another thread for this?

I’m not sure, but that would definitely be preferable. @charlesm93 is that part of the plan? (Also @charlesm93 the HMM functionality looks awesome!)


@rok_cesnovar here the model and data: blrm.stan (28.4 KB) blrm.data.R (3.9 KB)


@rok_cesnovar if I swap a 2.23 compiler into a develop cmdstan, I get much faster runs.

With four runs of develop, I got 4.8s, 4.7s, 6.1s, 4.8s

With 2.23 I got: 4.0s, 4.2s, 4.2s, 4.0s

With develop cmdstan with a 2.23 stanc3 I got 4.3s, 4.3s, 4.4s, 4.3s.

With develop cmdstan with a 2.23 stanc3 and a 2.23 math I got: 4.2s, 4.2s, 4.1s, 4.1s

So I think there’s been a bit of regression in the compiler and Math.

I’m gonna do a little manual bisecting on Math to see if I can figure that bit out.
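For anyone reproducing the swap, it is roughly the following (a sketch; the Math tag for 2.23 is an assumption):

# Hedged sketch: run a develop CmdStan against the 2.23-era Math library
# to isolate which repo the regression lives in.
cd cmdstan/stan/lib/stan_math      # inside a develop CmdStan checkout
git fetch --tags
git checkout v3.2.0                # assumption: the Math tag shipped with 2.23
cd ../../..
make clean-all
make examples/bernoulli/bernoulli  # rebuild a model against the old Math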


@rok_cesnovar I think the math pull slowdown is from the elementwise checks (here). What do we do here? I think the code is good, but clearly it’s slowing things down as it’s implemented now. Do we just revert it and make it an open pull request again?

I wanted this for the variadic ODEs (better error messages), but we could remove it (there are checks in the variadic ODEs pull, so all the same errors get caught, just not as cleanly).

Here are the relevant benchmarks:

July 13th (before): a3f438bd4916bc7afee3e3c147e8320b09cef386

4.0s, 4.0s, 4.0s, 4.0s

July 14th (after): e5f00e2e053da20400f729f1bc7198ca3b3e8955

4.4s, 4.3s, 4.4s, 4.2s

Edit: Discourse wouldn’t let me post three times in a row, but I did some stanc3 tests:

I suspect the stanc3 thing is this pull: https://github.com/stan-dev/stanc3/pull/521

I went through the stanc3 binaries built here and tried them out with a develop cmdstan and 2.23 math: https://jenkins.mc-stan.org/job/stanc3-test-binaries/

These are the relevant numbers:

May 8th, #83

4.0s, 4.0s, 4.0s, 4.1s

May 11th, #85

4.3s, 4.1s, 4.1s, 4.4s

May 16th, #86

4.5s, 4.5s, 6.2s, 4.5s
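For reference, dropping one of those Jenkins binaries into a CmdStan checkout looks roughly like this (a sketch; the artifact name and download step depend on the job page linked above):

# Hedged sketch: install a prebuilt stanc3 and force a model to recompile.
cp ~/Downloads/linux-stanc cmdstan/bin/stanc   # artifact name is illustrative
chmod +x cmdstan/bin/stanc
cd cmdstan
rm -f examples/bernoulli/bernoulli examples/bernoulli/bernoulli.hpp
make examples/bernoulli/bernoulli   # regenerates the C++ with the new stanc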


Yes.
