Estimating resources needed on a cluster

I’m working on modeling large time-series data using Stan models with ODE solvers. I’m not posting the specific model here (I can start another thread under Modeling later) because I’m not sure it’s doing what I want yet, and my question for this category is more general:

I’m trying to get a sense of the computing resources these models might need on a cluster and where the bottlenecks might be.

I was wondering if there’s an example somewhere benchmarking the performance of the same model run with an ODE solver versus an analytical solution.

Also, how helpful would within-chain parallelization tools like reduce_sum or map_rect be for models with ODEs?

Any kind of advice (general or specific) would be very welcome. Thanks!


There’s a paper on modeling bottlenecks:

There’s a section later in it that discusses some of the difficulties of solving problems with ODEs, which might be relevant in your case.

I wouldn’t worry too much about computation until you have a toy problem working. Once it works, you’ll have a much better idea of how far you can scale it.


I’m going to assume you have a system of ODEs of non-trivial size and/or a non-trivial solution interval (t_0, t_1). Then:

  • Likelihood evaluation of the ODE solution will likely account for a large share of the run time.
  • reduce_sum won’t help you across cluster nodes; it’s designed for many-core shared-memory machines. If you have access to such a machine, you can try it.
  • map_rect will work on a cluster (it supports MPI across nodes).
  • Without seeing the problem, it’s really hard to gauge what the bottlenecks might be.
  • A properly set-up ODE solver should behave the same as an analytical solution in terms of sampling, but be far more costly in time; how much more depends on the specific problem.
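To make the map_rect point concrete, here is a minimal sketch of how an ODE likelihood might be split into shards (e.g. one per subject) so the solves run in parallel across MPI processes. All names here (decay, shard_ll, the packing of x_r, etc.) are illustrative assumptions, not anything from this thread:

```stan
functions {
  // Hypothetical one-state exponential-decay system.
  vector decay(real t, vector y, real k) {
    return -k * y;
  }
  // One shard: solve the ODE on this shard's time grid and
  // return its summed log-likelihood as a length-1 vector.
  vector shard_ll(vector phi, vector theta,
                  data array[] real x_r, data array[] int x_i) {
    int T = x_i[1];              // number of observations in this shard
    real k = phi[1];             // shared decay rate
    real sigma = phi[2];         // shared noise scale
    vector[1] y0 = [theta[1]]';  // shard-specific initial state
    // x_r packs the T observation times first, then the T observations.
    array[T] vector[1] y_hat = ode_rk45(decay, y0, 0.0, x_r[1:T], k);
    real ll = 0;
    for (t in 1:T)
      ll += normal_lpdf(x_r[T + t] | y_hat[t, 1], sigma);
    return [ll]';
  }
}
// (data, transformed data, and parameters blocks omitted)
model {
  // Each shard's solve + likelihood can run on a separate MPI process.
  target += sum(map_rect(shard_ll, phi, theta, x_r, x_i));
}
```

The same sharding also works with reduce_sum on a single multi-core machine, but only map_rect distributes work across cluster nodes.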

Thank you so much, this is very helpful. Are you aware of any example benchmarking the ODE solver’s performance on an equation with an analytical solution?

No, I’m not. It’d be great to have one, though.
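In the meantime, a minimal benchmark you could set up yourself might look like the following: the same exponential-decay model fit two ways, switched by a data flag, so you can compare wall time between the ODE-solver fit and the closed-form fit on identical data. This is a sketch under assumed names and priors, not a polished benchmark:

```stan
functions {
  // y' = -k * y, which has the closed-form solution y(t) = y0 * exp(-k * t).
  vector decay(real t, vector y, real k) {
    return -k * y;
  }
}
data {
  int<lower=1> T;
  array[T] real ts;               // observation times, all > 0
  vector[T] y_obs;                // noisy observations
  int<lower=0, upper=1> use_ode;  // 1: numerical solver, 0: analytical
}
parameters {
  real<lower=0> k;
  real<lower=0> y0;
  real<lower=0> sigma;
}
model {
  vector[T] mu;
  if (use_ode) {
    array[T] vector[1] y_hat = ode_rk45(decay, rep_vector(y0, 1), 0.0, ts, k);
    for (t in 1:T) mu[t] = y_hat[t, 1];
  } else {
    for (t in 1:T) mu[t] = y0 * exp(-k * ts[t]);  // analytical solution
  }
  k ~ lognormal(0, 1);
  y0 ~ lognormal(0, 1);
  sigma ~ exponential(1);
  y_obs ~ normal(mu, sigma);
}
```

Fitting the same simulated data with use_ode = 1 and use_ode = 0 (timing each run, or using CmdStan’s profiling) would give a direct measure of the solver overhead for a case where both versions should sample equivalently.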