Request for Volunteers to Test Adaptation Tweak

Let me summarize this.

  1. As a reviewer, I asked for a test (Request for Volunteers to Test Adaptation Tweak - #64 by bbbales2)
  2. As a developer, you suggested I could do this test myself (Request for Volunteers to Test Adaptation Tweak - #66 by betanalpha)
  3. I asked you to do this test yourself; as a reviewer, I felt I’d already done too much testing here (Request for Volunteers to Test Adaptation Tweak - #48 by bbbales2)
  4. As a tech lead, you sent me an e-mail suggesting that the tests I was asking of developer-you were inappropriate
  5. I contested that, provided evidence that developer-you is fallible (Request for Volunteers to Test Adaptation Tweak - #91 by bbbales2), and said that if tech-lead-you wanted to contest this we could talk to the SGB.
  6. Tech-lead-you reiterated that, again, this wasn’t about developer-you
  7. Regarding the SGB, SGB-you stepped in to say this also didn’t involve the SGB, and that the conflict between tech-lead-you and myself (not a tech lead) should in fact be resolved by the local tech lead (you, in this case).
  8. Tech-lead-you verbally approved the pull request from developer-you and requested that it be merged by the technical working director @seantalts (Feature/2789 improved stepsize adapt target by betanalpha · Pull Request #2836 · stan-dev/stan · GitHub). There were no external code reviews at that point. It was merged 30 minutes later.

In paragraph form: as the reviewer, I asked you to verify the behavior of adapt_delta. This was based on input from @Bob_Carpenter in this thread and from everyone who came to the Stan meeting (which is publicly announced, Request for Volunteers to Test Adaptation Tweak - #58 by bbbales2). In that meeting, everyone basically thought the changes sounded good (@jonah and @bgoodri in particular), and @jonah agreed we should at least verify the adapt_delta behavior. @ariddell and @ahartikainen never responded. I think that addresses this complaint from tech-lead-you:

So I asked developer-you to verify the adapt_delta behavior. I understand that divergences are an indicator that things have gone awry, but since these changes to the timestep adaptation differ most on problems with divergent trajectories, I judged it prudent to test the behavior of the new adaptation on problems where divergent transitions arise, regardless of the statistical validity of the output. Regarding the central limit theorem, I don’t believe warmup as implemented in Stan is reversible, so I don’t know how far those CLT arguments apply (given that timestep adaptation happens during warmup).

That’s what lenient means.

If the relation between adapt_delta and timestep is monotonic, why didn’t we simply lower the default adapt_delta target with the previous implementation? If we’re only testing on models without divergences, what’s the problem? We can achieve the same difference, right?
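To make the monotonicity claim concrete, here is a minimal sketch of the dual-averaging stepsize adaptation from Hoffman & Gelman (2014) that Stan uses, run against a toy acceptance curve rather than a real sampler. The toy `toy_alpha` function (my assumption, not anything from Stan) just makes acceptance probability decay monotonically with stepsize; the point is only to illustrate that raising the target adapt_delta drives the adapted stepsize down.

```python
import math

def dual_average(delta, alpha_of_eps, eps0=1.0, iters=2000,
                 gamma=0.05, t0=10.0, kappa=0.75):
    """Nesterov-style dual averaging of the stepsize, as in NUTS.

    delta is the target acceptance statistic (Stan's adapt_delta);
    alpha_of_eps maps a stepsize to an acceptance probability.
    Default tuning constants gamma, t0, kappa match the NUTS paper.
    """
    mu = math.log(10.0 * eps0)   # shrinkage point: log of 10x initial stepsize
    log_eps = math.log(eps0)
    log_eps_bar = 0.0            # running (weighted) average of log stepsize
    h_bar = 0.0                  # running average of (delta - alpha)
    for t in range(1, iters + 1):
        alpha = alpha_of_eps(math.exp(log_eps))
        eta = 1.0 / (t + t0)
        h_bar = (1.0 - eta) * h_bar + eta * (delta - alpha)
        log_eps = mu - math.sqrt(t) / gamma * h_bar
        w = t ** (-kappa)
        log_eps_bar = w * log_eps + (1.0 - w) * log_eps_bar
    return math.exp(log_eps_bar)

# Toy acceptance model (an assumption for illustration): acceptance
# probability decays monotonically as the stepsize grows.
toy_alpha = lambda eps: min(1.0, math.exp(-eps))

eps_80 = dual_average(0.80, toy_alpha)  # adapt_delta = 0.8
eps_95 = dual_average(0.95, toy_alpha)  # adapt_delta = 0.95
# A higher target drives the adapted stepsize down: eps_95 < eps_80.
```

Under this toy model the adaptation converges to the stepsize whose acceptance probability matches the target, so lowering the default adapt_delta would indeed raise the stepsize; whether the *new* adaptation changes anything beyond that on divergent problems is exactly what I was asking to have tested.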

Oh okay, that’s cool (Request for Volunteers to Test Adaptation Tweak - #49 by betanalpha).

MODERATOR EDIT: The following section contained personal attacks which have been removed (by @martinmodrak)

So then all this difficult work you’ve done [here] is to avoid writing tests for a pull request that could be replaced by lowering the adapt_delta defaults?

I just did the tests myself in 10 minutes. They seem fine to me. I reject this pull request.
