Vector or real array; enforced mixing in Stan code

pasjor · July 12, 2021, 12:45pm

Hello,

I have a very general question not related to any particular Stan code. It deals with the distinction between a vector and a real[] and their incompatibility.

As far as I know, it is usually advised to use vectors so that operations like “-” can be vectorized. Thus, it seems good practice to use vectors instead of real[] when trying to avoid loops.
However, there are important functions, like integrate_ode_rk45, which require the user to write a function whose signature contains real[] arguments, which rules out passing a vector there (afaik).

If I want to use the ODE solver and at the same time need operations like “-” to be well defined (component-wise), then it seems I am forced to constantly call functions like “to_vector()” or “to_array_1d()” to switch between the two representations.
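For concreteness, here is a minimal sketch of the pattern I mean. The function name rhs, the two-state system, and the layout of theta (two equilibrium values followed by a rate) are made up for illustration. It uses the real[] array syntax discussed in this thread, which newer Stan releases spell array[] real.

```stan
functions {
  // Hypothetical right-hand side in the signature expected by integrate_ode_rk45.
  // Assumes a 2-state system with theta = {y_eq_1, y_eq_2, k}.
  real[] rhs(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    vector[2] yv = to_vector(y);              // real[] -> vector
    vector[2] y_eq = to_vector(theta[1:2]);   // slice of the parameter array -> vector
    vector[2] dydt = -theta[3] * (yv - y_eq); // component-wise "-" on vectors
    return to_array_1d(dydt);                 // vector -> real[]
  }
}
```

Every evaluation of the right-hand side goes through two conversions into vectors and one conversion back, which is what prompted the questions about cost.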

With respect to this setup, I have two questions:

1. Does calling “to_vector” allocate a new vector, or does it reuse the old “real[]”? Closely related: is it efficient to call these conversion functions, or is there a significant cost to it?
2. What is the generally recommended way to deal with the above setup?

Thank you very much for any hints or advice!

martinmodrak · July 18, 2021, 3:33pm

I think @syclik or @stevebronder might be better positioned to answer this, as my understanding is a bit shallow. However, in most models the computation time is dominated by autodiff (computing the gradient), and ODE models are IMHO very likely to be severely limited by it. AFAIK the conversions between vectors and arrays shouldn’t modify the autodiff stack (the underlying vari objects should not be touched by the conversion), so even if there is copying (which I suspect is the case), it is unlikely to have any noticeable impact on overall performance.

Note also that recent Stan (I think 2.26+) supports profiling so you can check if the conversion has noticeable impact in your model.
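A minimal sketch of what such a check could look like, assuming a Stan version that supports profile blocks. The block names and the toy model are made up for illustration, and it uses the newer array[N] real spelling of real ...[N].

```stan
data {
  int<lower=1> N;
}
parameters {
  array[N] real theta;   // parameters stored as an array on purpose
}
model {
  vector[N] v;           // declared outside the profile blocks so both can see it
  profile("conversion") {
    v = to_vector(theta);               // the step whose cost we want to isolate
  }
  profile("rest_of_model") {
    target += normal_lpdf(v | 0, 1);    // stand-in for the real model
  }
}
```

Comparing the time reported for the two named blocks should show whether the conversion matters relative to everything else.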

For the old interface your approach would be recommended. Recent Stan also has the new ode_rk45 which primarily uses vectors and avoids the need to pack/unpack additional data to arrays.
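A rough sketch of how that could look with the newer interface. The names rhs, y_eq and k, the two-state system, and the priors are assumptions for illustration only, and the program has no likelihood. It uses the newer array[N] real syntax.

```stan
functions {
  // Hypothetical right-hand side: state and parameters are vectors or reals,
  // so no to_vector / to_array_1d calls are needed.
  vector rhs(real t, vector y, vector y_eq, real k) {
    return -k * (y - y_eq);
  }
}
data {
  int<lower=1> N;
  array[N] real<lower=0> ts;   // output times, increasing and greater than t0 = 0
  vector[2] y0;                // initial state
}
parameters {
  vector[2] y_eq;
  real<lower=0> k;
}
transformed parameters {
  // Extra arguments after ts are passed straight through to rhs.
  array[N] vector[2] y_hat = ode_rk45(rhs, y0, 0.0, ts, y_eq, k);
}
model {
  y_eq ~ normal(0, 1);
  k ~ lognormal(0, 1);
}
```

The solution comes back as an array of vectors, so each y_hat[n] can be used directly in vectorized expressions.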

Best of luck with your model!

stevebronder · July 21, 2021, 3:30pm

Apologies for the late reply.

to_vector is a no-op that reuses the memory of the array, but the to_array functions (to_array_1d etc.) do force us to copy memory.

pasjor · July 23, 2021, 2:35pm

Thank you very much for these explanations. I guess I have to dive somewhat into the autodiff implementation to better understand the performance issues.
With respect to my code, I should probably consider switching to a more recent version of Stan.

pasjor · July 23, 2021, 2:37pm

Thanks for your reply and for the clear statement on “to_array” and “to_vector”. As I use these a lot, it is very helpful to know the underlying memory usage.
