Recently I realized that our adaptation target, a heuristic proxy meant to mimic a Metropolis acceptance probability, might be too conservative. It ultimately gives a step size that’s smaller than necessary, which in turn results in lower effective sample sizes and more expensive numerical integrations.
I’ve implemented what I believe is a more appropriate adaptation target in stan-dev/stan:feature/updated_stepsize_adapt_target and preliminary results confirm the theoretical intuition, but before making a pull request I want to see how robust the performance is.
Hence I’m reaching out to the community in case you have a second to give it a try on a sophisticated model that you might happen to have lying around. I’m hoping to get the summaries for the adaptation parameters and the effective sample size and effective sample size per time for current develop and this new branch.
For example in CmdStan the comparison might look like
git clone git@github.com:stan-dev/cmdstan.git (if you have SSH set up on GitHub)
git clone https://github.com/stan-dev/cmdstan (otherwise)
cd cmdstan
git submodule update --init --recursive
make build
make CC=clang++ -j4 O=3 <MODEL_NAME>
./<MODEL_NAME> <cmdstan options>
# Compute sampler parameter summaries
../../../bin/stansummary output.csv | awk 'NR > 5 && NR < 15 {print $0}'
# Compute sum of effective sample sizes
../../../bin/stansummary output.csv | awk 'NR > 14 && NR < 115 {sum += $8} END {print sum}'
# Compute sum of effective sample sizes per time
../../../bin/stansummary output.csv | awk 'NR > 14 && NR < 115 {sum += $9} END {print sum}'
cd stan
git checkout feature/updated_stepsize_adapt_target
cd ..
make clean-all
make build
make CC=clang++ -j4 O=3 <MODEL_NAME>
./<MODEL_NAME> <cmdstan options>
# Compute sampler parameter summaries
../../../bin/stansummary output.csv | awk 'NR > 5 && NR < 15 {print $0}'
# Compute sum of effective sample sizes
../../../bin/stansummary output.csv | awk 'NR > 14 && NR < 115 {sum += $8} END {print sum}'
# Compute sum of effective sample sizes per time
../../../bin/stansummary output.csv | awk 'NR > 14 && NR < 115 {sum += $9} END {print sum}'
but comparisons run in any interface are welcome provided the above information can be communicated.
Thanks to any volunteers who are able to produce comparisons and to those who make an attempt!
I tried to reproduce the steps but… the first command gives me:
git clone git@github.com:stan-dev/cmdstan.git
Cloning into 'cmdstan'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
I’d like to know if there’s something principled that was done, like a diagnostic that was changed, a method, or a mathematical explanation. With mathematics, code examples, etc. Eliminate all ambiguity.
Sorry, the example presumed that you have SSH keys set up with GitHub. As @aornugent notes, you can clone over HTTPS with the command git clone https://github.com/stan-dev/cmdstan.
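If you are not sure whether your SSH keys are registered, one option is to try the SSH remote and fall back to HTTPS when it fails. This is only a sketch: the function name clone_cmdstan is hypothetical, and it assumes that a failed SSH clone (such as the "Permission denied (publickey)" error above) exits with a non-zero status.

```shell
# Hypothetical helper (not part of CmdStan): try the SSH remote first and
# fall back to the HTTPS remote if the SSH clone fails, for example with
# "Permission denied (publickey)".
clone_cmdstan() {
  git clone git@github.com:stan-dev/cmdstan.git \
    || git clone https://github.com/stan-dev/cmdstan
}
```

Calling clone_cmdstan in the directory where you want the checkout should leave you with a cmdstan directory either way; the remaining steps are unchanged.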
In theory the performance should be uniformly better, but I want more empirical evidence before making any practical claims, hence the request. The intuition is a bit subtle, but if the empirical evidence corroborates it then I’ll write it up in the PR.
You have to first change directories into the new cmdstan directory. I updated the instructions, although keep in mind that the instructions are only meant as a guideline and one aimed at those who have used CmdStan before.
The dashes autoformatted into one during my copy and pasting. Sigh.
To run your model you’ll have to have your data handy in a data.R file and know how to configure the cmdstan executable – if that ends up giving you trouble then don’t worry about trying to force your way through. Thanks for the attempt regardless!
Yerp - got things loading. I used to be able to do this so I should be able to remember it. I’ll have a look for an interesting model to try amongst my past dabblings.
Hi again so I got it to run - although I’ve had trouble with the second and third summary commands above which just give me nan. Nevertheless, here is what the first summary command gives:
The model was a linear multivariate normal model with two Y variables, run on simulated data. In case it is useful I attach both the model linear_mvnormal.stan (1.7 KB) and the data file data.R (72.9 KB). I can try this again on real data if you wish. Note that run 1 took about 43 seconds, while run 2 took about twice that. I’d share the output files but they are huge because I forgot to remove the posterior predictions 🤦‍♂️
Thanks! Try changing the 115 in the second and third commands to 15 + N_{\text{params}}, where N_{\text{params}} is the total number of parameters (not including transformed parameters or generated quantities).
The problem is the components of the Cholesky factor that are fixed and return ill-defined effective sample sizes in CmdStan. Can you just grab the first 20 lines or so from the stansummary output that follows the sampler diagnostic summaries that you already shared? Thanks.
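Another way to sidestep the nan sums is to skip any row whose entry in the chosen column is not a plain number. The following is a minimal sketch in awk, assuming the stansummary rows keep the layout used in the commands above (name in column 1, N_Eff in column 8, N_Eff/s in column 9). The function name sum_column is hypothetical and not part of CmdStan.

```shell
# Hypothetical helper (not part of CmdStan): sum one whitespace-separated
# column read from stdin, ignoring rows whose entry in that column is not a
# plain number (for example the "nan" reported for fixed Cholesky factor
# components, or header and footer text).
sum_column() {
  awk -v col="$1" '
    $col ~ /^[-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?$/ { sum += $col }
    END { print sum + 0 }
  '
}

# Example usage, assuming the same paths and line offsets as above:
# ../../../bin/stansummary output.csv | awk 'NR > 14' | sum_column 8   # N_Eff
# ../../../bin/stansummary output.csv | awk 'NR > 14' | sum_column 9   # N_Eff/s
```

Note that with only the NR > 14 filter the sum also includes transformed parameters and generated quantities; keep an upper bound on NR if you want the parameters alone.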
Oh yeah that makes sense. Sure, see the attached summary_run1.txt (4.1 KB) and summary_run2.txt (4.3 KB) files. Note the mu[ , ] mark the start of transformed parameters so I cut it off a few lines into that.