Nonlinear increase in step speed with more rows of data -- possible bug?

Charles_Driver · November 2, 2018, 9:07pm

I’m a bit confused by this. Using rstan 2.18.1. Everything in the model (a hierarchical state space model) is the same; only the loops over the N rows of data are longer. With 5400 rows of data (180 subjects), gradient calculations took ~0.17 s, while with 5800 rows (200 subjects) they took ~0.9 s. With fewer than 5400 rows, the time changes roughly linearly, as I would expect. These differences in gradient calculation time are not spurious: they are at least roughly mirrored by actual sampling performance with a fixed, low treedepth. I started looking into it because of unexpectedly slow performance. The 64-bit Windows 7 PC has heaps of RAM and is only allocating 500 MB or so per chain anyway. This nonlinearity a) doesn’t manifest using Ubuntu on my laptop, b) seems to increase with -O3 compiler flags, c) doesn’t occur with rstan 2.17.4, and d) isn’t specific to the data (it appears whether rows are dropped from the front or the back). I can’t post the data, but if needed I can generate some and try to reproduce the issue…
Edit: it is now occurring with rstan 2.17.4 as well…
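
To put a number on how far the reported timings are from linear scaling, here is a small arithmetic sketch in Python. It uses only the two figures quoted above (5400 rows at ~0.17 s, 5800 rows at ~0.9 s); the function names are made up for illustration and have nothing to do with rstan.

```python
def linear_prediction(base_rows, base_seconds, rows):
    """Time expected at `rows` if cost scaled linearly from the baseline."""
    return base_seconds * rows / base_rows


def slowdown_vs_linear(base_rows, base_seconds, rows, seconds):
    """How many times slower the observed time is than the linear prediction."""
    return seconds / linear_prediction(base_rows, base_seconds, rows)


# Timings quoted in the post: 5400 rows -> ~0.17 s, 5800 rows -> ~0.9 s.
expected = linear_prediction(5400, 0.17, 5800)
factor = slowdown_vs_linear(5400, 0.17, 5800, 0.9)

print(f"expected under linear scaling: {expected:.3f} s")  # 0.183 s
print(f"observed / expected: {factor:.1f}x")               # 4.9x
```

So about 7% more rows cost roughly five times what linear scaling would predict, which is why this looks like a threshold effect rather than ordinary growth.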

bgoodri · November 3, 2018, 12:07am

Is that with StanHeaders 2.17.x? There are potentially a lot of changes between 2.17 and 2.18.

Charles_Driver · November 3, 2018, 12:17am

Yes, though the fact that it reported the ‘correct’, faster gradient calculations once with the full data after downgrading does make me wonder. I have the same C++14 Makevars setup in both cases; perhaps I should change that?

bgoodri · November 3, 2018, 12:20am

I don’t think the C++14 flag by itself should make much difference but it is worth a try. Also, are you using clang++ or g++?

Charles_Driver · November 3, 2018, 12:29am

on the laptop:

Reading specs from /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_4/lib/gcc/x86_64-unknown-linux-gnu/5.5.0/specs
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_4/libexec/gcc/x86_64-unknown-linux-gnu/5.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: …/configure --with-isl=/home/linuxbrew/.linuxbrew/opt/isl@0.18 --with-bugurl=Issues · Homebrew/linuxbrew-core · GitHub --prefix=/home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_4 --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-5 --with-gmp=/home/linuxbrew/.linuxbrew/opt/gmp --with-mpfr=/home/linuxbrew/.linuxbrew/opt/mpfr --with-mpc=/home/linuxbrew/.linuxbrew/opt/libmpc --enable-stage1-checking --enable-checking=release --enable-lto --with-build-config=bootstrap-debug --disable-werror --with-pkgversion='Homebrew gcc 5.5.0_4' --with-boot-ldflags='-static-libstdc++ -static-libgcc ' --disable-nls --disable-multilib
Thread model: posix
gcc version 5.5.0 (Homebrew gcc 5.5.0_4)

Charles_Driver · November 3, 2018, 2:41pm

Well, this was a wonderfully confusing time-waster that seems to have just evaporated. After fully correcting the package for the older rstan 2.17.4, the problem doesn’t exist. Then, after upgrading to rstan 2.18.1, the problem still doesn’t exist. My only guesses are that a) it will come back, or b) rstan 2.18.1 might have been built from source before, perhaps under some semi-correct make setup that led to this weirdness.

avehtari · November 4, 2018, 1:40pm

I think this increase in time could be explained by the different speeds of the memory cache levels: once the working set no longer fits in a given cache level, access time jumps. See this excellent series of blog posts discussing why memory access time increases nonlinearly with the amount of memory used:
http://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html
This is highly recommended reading for anyone making speed tests with Stan or any other software (non-parallel or parallel).

Aki
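
The kind of measurement those blog posts describe can be sketched roughly as follows. This is a generic pointer-chasing illustration in Python, not a reproduction of the Stan model in this thread: the function names, sizes, and step count are assumptions chosen for the example, and Python's interpreter overhead blurs the effect compared with compiled code. It follows a random single-cycle chain through lists of increasing size and reports the mean time per hop, which on many machines grows once the list no longer fits in cache.

```python
import random
import time


def random_cycle(n, seed=0):
    """Sattolo's algorithm: a random permutation of range(n) forming one cycle."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def seconds_per_step(nxt, steps):
    """Follow the chain for `steps` hops and return the mean time per hop."""
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]  # each hop depends on the previous one
    return (time.perf_counter() - start) / steps


if __name__ == "__main__":
    # Sizes are arbitrary; pick ones that straddle your machine's cache sizes.
    for n in (1_000, 30_000, 1_000_000):
        nxt = random_cycle(n)
        per_hop = seconds_per_step(nxt, 200_000)
        print(f"n = {n:>9,}: {per_hop * 1e9:6.1f} ns per hop")
```

The random cycle matters: sequential access is easy for the hardware to prefetch, whereas dependent random hops expose the latency of whichever memory level the data lives in. Exact numbers vary by machine, so no output is shown.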

Charles_Driver · November 5, 2018, 2:01pm

That logic makes a lot of sense, but I don’t see how it would apply to this case, as I’d have thought we were already spilling into RAM with the reduced data cases. But maybe I’m lacking imagination :) :)
