Speed issues since upgrading to RStan v2.21.2

Since upgrading to RStudio (v1.3.1056), RTools (v4.0) and RStan (v2.21.2), the time it takes my model to run seems to have increased roughly 2x-3x. I appreciate it could be so many things, but I was wondering if anything jumps out to people here?

I followed the installation instructions here (https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started) and unfortunately can’t recall what versions I had of everything before but I expect everything was last updated about a year ago.

Sorry this is such a vague question, but as my model previously took about 20 hours to run, and I'm typically running it 1-2 times per week, any speed increases are of great use. I thought it was worth a shot asking here!

Can you share anything about your model?

Does it use ODE solvers, algebra solvers, anything special? Do you know what the bottleneck is in your model? It's hard to say much without any details.

We did not observe any regression between 2.19 and 2.21 of this magnitude, at least that we know of.


Sure, I’m actually modelling underdispersed count data, so I use a generalised Poisson pdf. This was built for me by someone much smarter than me.

One of the quirks, though, is that I have to set init = 0, as otherwise the initialisation of the sampling fails. From what I can tell, it returns negative infinity. Perhaps that’s off topic, but I mention it just in case it would affect anything on the speed front.

Here is the pdf:

functions {
  // Generalized Poisson with its mean and variance as parameters.
  real generalized_poisson_ms_log(int n, real mu, real sigma2) {

    // Reparametrization.
    real theta  = pow(mu, 1.5) / sqrt(sigma2);
    real lambda = 1 - sqrt(mu) / sqrt(sigma2);
    if ((theta + lambda * n) < 0) {
      return negative_infinity();
    } else {
      return log(theta)
             + (n - 1) * log(theta + n * lambda)
             - lgamma(n + 1)
             - n * lambda
             - theta;
    }
  }
}
This might be related to the sampler changes from 2.19 to 2.21, or you may have uncovered some unknown regression.

You could debug this by trying out the cmdstanr interface, which uses Stan version 2.24.

Does anything else happen in your model or does the model block just include that function call? There might be something else that was changed since 2.19.

Also, you mentioned that it used to take 20 hours to run; is this because you have a very large dataset or a very complex model?

Is the additional time spent in warmup or sampling?

There was an lgamma regression that I remember, but I can’t remember which version had it. Though that was not a 2x-3x slowdown.

That’s a good point; it was this PR that changed lgamma, so that might be the culprit.

The issue there was that the thread-safe version of lgamma (lgamma_r) wasn’t available in mingw32, but I wonder if this is still the case now that it’s on RTools 4. What do you think @wds15?

Unfortunately both! I typically have about 35k data points with about 30 parameters. I also get quite variable chain speeds, particularly as the dataset gets bigger, but the model always converges to sensible values without divergent transitions.

I’ve tried to iron out the variable chain speeds but still struggle a bit. One approach that’s helped is setting increasingly thoughtful and sensible priors (sometimes hard to reason about on the log scale), but this has only achieved so much. I feel the restriction of having to start the warmup at init = 0 is part of the problem, as some of the parameters’ posteriors would never have a value in that range. I would like to place sensible upper and lower bounds on these parameters away from 0, but can’t, because the warmup then fails (without init = 0).

It seems to be the sampling.

I do now get this warning:

In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) : 'C:/Rtools/mingw_/bin/g++' not found

But from what I understand from another post I can ignore that.


Is the number of leapfrog steps the same? Try: get_num_leapfrog_per_iteration(fit)

Do you have a saved model from the old run that you can look at by any chance? (I’m not even sure how possible it is to serialize these things, so I won’t be surprised if not.)

(Edit: what we’re doing here is trying to figure out whether this is a sampling issue, i.e. the sampler is doing something different, or just a computational issue, i.e. each gradient evaluation is slower.)

Are you able to post the full model? That will help us narrow things down.

I seem to get a bunch of leapfrog numbers; the mean values are:

Old model: 176
New model: 448
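Since each iteration’s cost is roughly proportional to the number of leapfrog steps, those means already account for the observed slowdown. A trivial check (the leapfrog means are from this thread):

```python
# Mean leapfrog steps per iteration, as reported above.
old_leapfrogs = 176
new_leapfrogs = 448

# Cost per iteration scales with leapfrog count, so this ratio is the
# expected slowdown from the sampler alone.
slowdown = new_leapfrogs / old_leapfrogs
print(f"expected slowdown: {slowdown:.2f}x")
```

This lands at about 2.5x, consistent with the reported 2x-3x wall-clock increase.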


Well that’ll do it. I’ll have to get back to you on this, but can you share your model/data or is that private?

I would need to get permission from a bunch of other people first. Can this be avoided?

I really appreciate the responses I’ve been getting on this, wasn’t expecting so much feedback so quickly!

And what’s the effective sample size of some parameters between the two? Is it the same (in which case the new model is working much harder to do the same thing)?

I’d look at the parameters with low ESS to do this comparison.
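One way to make that comparison concrete is effective samples per gradient evaluation (ESS divided by iterations times mean leapfrog steps). A sketch of the arithmetic; the ESS and iteration counts below are hypothetical placeholders, and only the leapfrog means come from earlier in the thread:

```python
def ess_per_gradient(ess, n_iterations, mean_leapfrogs):
    """Effective samples per gradient evaluation: higher is better."""
    return ess / (n_iterations * mean_leapfrogs)

# Hypothetical ESS for a slow-mixing parameter; leapfrog means from the thread.
old_eff = ess_per_gradient(ess=800, n_iterations=1000, mean_leapfrogs=176)
new_eff = ess_per_gradient(ess=800, n_iterations=1000, mean_leapfrogs=448)

# If ESS is unchanged between runs, the new run is simply paying ~2.5x more
# gradient evaluations for the same effective sample size.
print(old_eff / new_eff)
```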

Yeah. I’ll send you more instructions later but what we’ll need to do is a bit of a binary search to figure out where in time this problem occurred so we can identify the commits that might have caused it.

Are you using rstan? There are ways to set init to the values you want, not just 0 or random. I don’t have the code on hand, but you can use a previous run to get the dimensions of all the parameters and then set them to sensible values, such as RNG draws from your priors.
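For illustration, the shape of such an init object (one named set of values per chain, drawn from prior-like distributions) can be sketched in Python. The parameter names and priors here are invented placeholders; in rstan you would build the analogous named lists in R and pass them via the init argument of stan() or sampling():

```python
import random

def make_inits(n_chains, seed=1):
    """Build one init dict per chain by drawing from prior-like
    distributions. Parameter names and priors are hypothetical."""
    rng = random.Random(seed)
    inits = []
    for _ in range(n_chains):
        inits.append({
            "mu_intercept": rng.gauss(0.0, 1.0),               # e.g. normal(0, 1) prior
            "beta": [rng.gauss(0.0, 0.5) for _ in range(30)],  # one per covariate
            "sigma2": abs(rng.gauss(1.0, 0.25)),               # keep the variance positive
        })
    return inits

chain_inits = make_inits(n_chains=4)
```

The key point is that each chain gets its own complete set of named initial values with the right dimensions, so you are no longer forced to choose between init = 0 and fully random starts.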