Estimating resources needed on a cluster

I’m working on modeling large time-series data using Stan models with ODE solvers. I’m not posting the specific model here (I can start another thread under Modeling later) because I’m not sure it’s doing what I want yet, and my question for this category is more general:

I’m trying to get a sense of the computing resources these models might need on a cluster and where the bottlenecks might be.

I was wondering if there’s an example somewhere benchmarking the performance of the same model run with an ODE solver versus an analytical solution.

Also, how helpful would within-chain parallelization tools like reduce_sum or map_rect be for models with ODEs?

Any kind of advice (general or specific) would be very welcome. Thanks!


There’s a paper on modeling bottlenecks:

There’s a section later in it that discusses some of the difficulties of solving problems with ODEs, which might be relevant in your case.

I wouldn’t worry too much about computation until you have a toy problem working. Once it works, you’ll have a much better idea of how far you can scale it.


I’m going to assume you have a system of ODEs of non-trivial size and/or a non-trivial solution interval (t_0, t_1). Then:

  • Likelihood evaluation of the ODE solution will likely account for a large share of the run time.
  • reduce_sum won’t help you across cluster nodes; it’s designed for many-core shared-memory machines. If you have access to such a machine, you can try it.
  • map_rect will work on a cluster (it supports MPI across nodes).
  • Without seeing the problem, it’s really hard to gauge what the bottlenecks might be.
  • A properly set-up ODE solver should behave the same as an analytical solution in terms of sampling, but be far more costly in time; how much more depends on the specific problem.
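To make the map_rect point concrete, here is a minimal sketch of how an ODE likelihood might be split into shards (e.g. one per subject) so the solves run in parallel across MPI processes. All names here (decay, shard_ll, the packing of x_r, etc.) are illustrative assumptions, not anything from this thread:

```stan
functions {
  // Hypothetical one-state exponential-decay system.
  vector decay(real t, vector y, real k) {
    return -k * y;
  }
  // One shard: solve the ODE on this shard's time grid and
  // return its summed log-likelihood as a length-1 vector.
  vector shard_ll(vector phi, vector theta,
                  data array[] real x_r, data array[] int x_i) {
    int T = x_i[1];              // number of observations in this shard
    real k = phi[1];             // shared decay rate
    real sigma = phi[2];         // shared noise scale
    vector[1] y0 = [theta[1]]';  // shard-specific initial state
    // x_r packs the T observation times first, then the T observations.
    array[T] vector[1] y_hat = ode_rk45(decay, y0, 0.0, x_r[1:T], k);
    real ll = 0;
    for (t in 1:T)
      ll += normal_lpdf(x_r[T + t] | y_hat[t, 1], sigma);
    return [ll]';
  }
}
// (data, transformed data, and parameters blocks omitted)
model {
  // Each shard's solve + likelihood can run on a separate MPI process.
  target += sum(map_rect(shard_ll, phi, theta, x_r, x_i));
}
```

The same sharding also works with reduce_sum on a single multi-core machine, but only map_rect distributes work across cluster nodes.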

Thank you so much, this is very helpful. Are you aware of any example benchmarking the ODE solver’s performance on an equation with an analytical solution?

No, I’m not. It’d be great to have one, though.
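In the meantime, a minimal benchmark you could set up yourself might look like the following: the same exponential-decay model fit two ways, switched by a data flag, so you can compare wall time between the ODE-solver fit and the closed-form fit on identical data. This is a sketch under assumed names and priors, not a polished benchmark:

```stan
functions {
  // y' = -k * y, which has the closed-form solution y(t) = y0 * exp(-k * t).
  vector decay(real t, vector y, real k) {
    return -k * y;
  }
}
data {
  int<lower=1> T;
  array[T] real ts;               // observation times, all > 0
  vector[T] y_obs;                // noisy observations
  int<lower=0, upper=1> use_ode;  // 1: numerical solver, 0: analytical
}
parameters {
  real<lower=0> k;
  real<lower=0> y0;
  real<lower=0> sigma;
}
model {
  vector[T] mu;
  if (use_ode) {
    array[T] vector[1] y_hat = ode_rk45(decay, rep_vector(y0, 1), 0.0, ts, k);
    for (t in 1:T) mu[t] = y_hat[t, 1];
  } else {
    for (t in 1:T) mu[t] = y0 * exp(-k * ts[t]);  // analytical solution
  }
  k ~ lognormal(0, 1);
  y0 ~ lognormal(0, 1);
  sigma ~ exponential(1);
  y_obs ~ normal(mu, sigma);
}
```

Fitting the same simulated data with use_ode = 1 and use_ode = 0 (timing each run, or using CmdStan’s profiling) would give a direct measure of the solver overhead for a case where both versions should sample equivalently.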