Yeah I can do that, thanks.
Here are the final results for the speedups that we were able to achieve with the 2.19 GPU code. I tested it with the Titan XP (a high-end GPU for about $1500 that NVIDIA also frequently gives out to researchers, as Bob mentioned here), the Tesla V100 (currently the highest-end GPU; it costs around $8k, or about $2/h on AWS) and an R9 Fury (a 3-year-old $300 GPU).
For the primitive Cholesky decompose:
For the Cholesky decompose on a matrix of var-s (basically primitive + value_of_rec):
For the gradient (the chain() function):
Thanks to @stevebronder, @seantalts, @syclik, @Erik_Strumbelj and everyone else involved for all the help on the way to 2.19 :) Looking forward to 2.20 with even more functions exposed to Stan users.
Starting now, please don’t merge anything for a little while :)
@ariddell @bgoodri Stan 2.19 is now tagged. (Still releasing CmdStan and doing issue cleanup).
tagging @mitzimorris for the Stan docs release.
In the future I plan to write a script to generate most of the release notes by taking the names & numbers of the pull requests, along with the branch name (bugfix vs feature), so let’s try for good hygiene here.
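A minimal sketch of what such a script could look like, assuming the pull requests are already available as (number, title, branch) triples and that branches are named with a `bugfix/` or `feature/` prefix. The function name, the input shape, and the example PRs are all hypothetical; this is not the actual release tooling.

```python
from collections import defaultdict


def release_notes(prs):
    """Group (number, title, branch) triples into release-note sections.

    Assumes (hypothetically) that branch names start with "bugfix/" or
    "feature/"; anything else lands in an "Other" section.
    """
    sections = defaultdict(list)
    for number, title, branch in prs:
        prefix = branch.split("/", 1)[0].lower()
        if prefix == "bugfix":
            key = "Bug fixes"
        elif prefix == "feature":
            key = "New features"
        else:
            key = "Other"
        sections[key].append(f"- {title} (#{number})")

    lines = []
    for key in ("New features", "Bug fixes", "Other"):
        if sections[key]:
            lines.append(key)
            lines.extend(sections[key])
            lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip()


# Made-up pull requests, for illustration only.
example = [
    (1, "Add thing", "feature/issue-1-thing"),
    (2, "Fix crash", "bugfix/issue-2-crash"),
]
notes = release_notes(example)
print(notes)
```

This only works if branch names and PR titles are kept consistent, which is the hygiene being asked for.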
Also, please feel free to continue merging etc as usual.
generated 2.19 docs and updated links to docs here:
https://mc-stan.org/users/documentation/
online docs and pdfs:
cheers,
Mitzi
Would you mind uploading these to the release and adding the HTML links there? Thanks!
uploaded tgz and zip versions of each manual to the release directory.
added links to online versions as well.
Is it possible to get this VI diagnostic PR in https://github.com/stan-dev/stan/pull/2618 ?
The PR has been approved by @syclik, but not yet merged.
It appears it didn’t make it into the 2.19 release after all?
It’s in Releases · stan-dev/stan · GitHub, but it just enables work to be done in interfaces.
Thanks. So it creates additional output in CmdStan, which one can use to implement the two diagnostics from “Yes, but Did It Work?: Evaluating Variational Inference”, right (potentially in combination with the PSIS implementation provided by loo)?
I didn’t see any changes in CmdStan related to this, so perhaps it won’t be in that interface in 2.19, but it sounds like Aki et al are working to put it into RStan for 2.19?
PS @mitzimorris and @Bob_Carpenter, I have been looking up a lot of things in the Stan docs as part of writing code gen and for the first time I have been able to google what I’m looking for and get links to our docs! This makes me so happy and really helps speed up my process, so thank you both for your work on that!
Yes.
CmdStan didn’t need any changes. The change in Stan adds two columns (log_p__ and log_g__) to the csv created by CmdStan when the variational algorithm is used. This may affect scripts people are using to read the csv, if they assume a fixed order of columns.
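For scripts that read the csv, looking columns up by name rather than by position avoids this problem. Here is a minimal sketch in Python using only the standard library. It assumes the usual layout of `#` comment lines followed by a header row, and the sample text (including the `theta` parameter) is made up for illustration.

```python
import csv
import io


def read_stan_csv(text):
    """Parse Stan CSV text into a dict mapping column name -> list of floats."""
    # Skip blank lines and comment lines, which start with '#'.
    rows = [ln for ln in text.splitlines()
            if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(rows)))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


# Made-up sample resembling variational output; values are illustrative.
sample = (
    "# comment line\n"
    "lp__,log_p__,log_g__,theta\n"
    "0,0,0,0.5\n"
    "0,-1.2,-0.7,0.4\n"
)
cols = read_stan_csv(sample)
theta = cols["theta"]      # looked up by name, not by position
log_p = cols["log_p__"]
```

Because every column is addressed by its header name, adding or reordering columns in a later release does not change what `cols["theta"]` returns.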
RStan uses that same csv when using the variational algorithm. RStan did assume a fixed order of columns in that csv, and RStan 2.19 has a fix so that any column whose name has a trailing __ (except lp__) is moved to the diagnostics slot. That diagnostics slot is then available for users, and there is an experimental branch, which is not going to be part of RStan 2.19, that also shows the diagnostics. There needs to be a bit more planning for data structures and where the computation happens, but it’s now easier to run experiments which help to understand the requirements for the design.
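The naming rule described here can be sketched in a few lines of Python. This only illustrates the rule as stated (trailing `__`, except `lp__`); it is not RStan's actual code, and the function name and the example column names are hypothetical.

```python
def split_columns(names):
    """Split column names into (kept, diagnostics) by the trailing-__ rule."""
    # Any name ending in "__" goes to diagnostics, except lp__, which stays.
    diagnostics = [n for n in names if n.endswith("__") and n != "lp__"]
    kept = [n for n in names if n not in diagnostics]
    return kept, diagnostics


# Illustrative header; "theta" is a made-up parameter name.
kept, diagnostics = split_columns(["lp__", "log_p__", "log_g__", "theta"])
print(kept)         # ['lp__', 'theta']
print(diagnostics)  # ['log_p__', 'log_g__']
```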
Thanks Aki. About the question of where the computation needs to happen, may I ask what exactly this means? Is that also applicable to CmdStan, when you say that RStan is using the same csv? Or are you referring to the calculations required to evaluate the diagnostics, like PSIS?
Do you guys have an anticipated release date for RStan 2.19?
Thank you for the documents!
It seems that the description about integrate_1d has been deleted.
Is there any plan to describe it again? I want to try it.
moved, not deleted - a while back it seemed like a good idea to put the Stan docs into a separate repo: https://github.com/stan-dev/docs
integrate_1d is described in the Stan Reference Manual:
Thanks!
@avehtari - Is there any (approximate) timeline for RStan to upgrade? (Apologies if I’m being pushy - I’m just trying to plan when I can re-start a project that’s waiting on this.)
agreed - there’s a mountain of technical debt around the csv and the output from the services. I just spent the weekend code-spelunking through the cmdstan utilities for different reasons - in the end I gave up on what I was trying to do and will find workarounds. I think this will run into similar problems.
(earlier typo, now corrected: s/workabounds/workarounds/ - to paraphrase Siggie, there are no typos.)
I don’t have an exact answer. Stan 2.19 adds log_p__ and log_g__ when using ADVI. These appear in the csv produced by CmdStan 2.19. These will appear in the RStan 2.19 stanfit object in a slot named diagnostics. That’s all in 2.19. log_p__ and log_g__ can be used to compute the Pareto k diagnostic. You can compute that diagnostic yourself with both CmdStan 2.19 and RStan 2.19. Maybe in a future release the Pareto k diagnostic will be added to Stan services in C++, so that the summary function for CmdStan can compute it. RStan can use the Pareto k diagnostic from the loo package, as RStan relies on the loo package anyway, but maybe that computation could also happen somewhere else. I don’t know.
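As a rough sketch of the first step of computing it yourself: the Pareto k diagnostic is based on the importance ratios between the target and the approximation, which on the log scale is log_p__ minus log_g__ for each draw. The Python below only computes those log ratios and the self-normalized weights. The Pareto fit itself is left to a PSIS implementation such as the one in loo. The function names and input values are made up, and it is assumed here that the row holding the mean of the approximation has already been dropped from the draws.

```python
import math


def log_importance_ratios(log_p, log_g):
    """Per-draw log importance ratios: log_p__ - log_g__."""
    return [p - g for p, g in zip(log_p, log_g)]


def normalized_weights(log_ratios):
    """Self-normalized importance weights, computed stably on the log scale."""
    m = max(log_ratios)  # subtract the max before exponentiating
    w = [math.exp(r - m) for r in log_ratios]
    total = sum(w)
    return [x / total for x in w]


# Made-up values standing in for the log_p__ and log_g__ columns.
log_p = [-1.0, -2.0, -1.5]
log_g = [-1.5, -1.5, -1.5]
ratios = log_importance_ratios(log_p, log_g)
weights = normalized_weights(ratios)
```

The `ratios` list is what a PSIS routine would take as input. The normalized weights are included only to show how the ratios are used.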
I don’t know. @bgoodri might know, but there is likely to be uncertainty due to the complexity of RStan releases.
This week