`reverse_pass_callback()` with types which are nontrivially destructible

Hi all,
I have run into a use case for reverse_pass_callback with a lambda function that captures, by value, types which have a nontrivial destructor. Have you thought about this use case in the past and decided for some reason that there should not be a function for it?

Background:
I use a custom tensor type which has dynamically allocated data, so its specialization of vari_value inherits from both vari_base and chainable_alloc, much like vari_value<Eigen::SparseMatrix>. For example, when I multiply a var_value<MyTensor> by a constant MyTensor, I need to capture the MyTensor for the backward pass. With the current reverse_pass_callback this leaks memory, because the destructor of MyTensor is never called. A solution would be a function reverse_pass_callback_alloc whose implementation class also inherits from chainable_alloc.
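The leak described above can be reproduced in a toy setting without Stan. The sketch below is only an illustration under stated assumptions: `ToyArena`, `Tracked`, and `arena_callback` are hypothetical stand-ins, not Stan's actual classes. It shows that a lambda constructed in arena memory never has its destructor run when the arena is simply reset, so a by-value capture with a nontrivial destructor stays "alive".

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a type with a nontrivial destructor.
// A real tensor would release heap memory in its destructor.
struct Tracked {
  static int live;  // number of instances currently alive
  double scale;
  explicit Tracked(double s) : scale(s) { ++live; }
  Tracked(const Tracked& o) : scale(o.scale) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Toy bump allocator: recover() only resets the offset, no destructors run.
class ToyArena {
  alignas(std::max_align_t) unsigned char buf_[1024];
  std::size_t used_ = 0;

 public:
  void* alloc(std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    n = (n + a - 1) / a * a;  // round up so later blocks stay aligned
    assert(used_ + n <= sizeof(buf_));
    void* p = buf_ + used_;
    used_ += n;
    return p;
  }
  void recover() { used_ = 0; }
};

// Hypothetical analogue of an arena-backed callback: placement-new the functor.
template <typename F>
typename std::decay<F>::type* arena_callback(ToyArena& arena, F&& f) {
  using Fn = typename std::decay<F>::type;
  return new (arena.alloc(sizeof(Fn))) Fn(std::forward<F>(f));
}

// Returns how many Tracked objects are still alive after the arena is reset.
int leaked_after_recover() {
  const int before = Tracked::live;
  ToyArena arena;
  {
    Tracked t(2.0);
    auto* cb = arena_callback(arena, [t](double adj) { return adj * t.scale; });
    double g = (*cb)(1.0);  // "reverse pass" using the captured copy
    (void)g;
  }  // the local t is destroyed here; the captured copy is not
  arena.recover();
  return Tracked::live - before;
}
```

Calling `leaked_after_recover()` reports one surviving instance: the copy captured inside the arena-allocated lambda, whose destructor nothing ever calls.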

For Stan math

For reverse mode we put all of our dynamic memory in our arena allocator, so we can assume that memory is managed elsewhere and the types are effectively trivially destructible. If you need to use a type whose destructor must be called during the cleanup of memory after reverse mode is done, check out make_chainable_ptr. That function will place your object on a stack so it can be deleted during the cleanup.
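The "place the object on a stack and delete it during cleanup" idea can be sketched without Stan as follows. This is not the implementation of make_chainable_ptr; `CleanupStack`, `Owned`, and `keep` are hypothetical names chosen for illustration. The point is that the callback captures only a raw pointer, so the lambda itself stays trivially destructible while the owning stack runs the destructor later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type with a nontrivial destructor.
struct Tracked {
  static int live;  // number of instances currently alive
  double scale;
  explicit Tracked(double s) : scale(s) { ++live; }
  Tracked(const Tracked& o) : scale(o.scale) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Type-erased holder so objects of different types share one stack.
struct CleanupBase {
  virtual ~CleanupBase() = default;
};

template <typename T>
struct Owned : CleanupBase {
  T value;
  explicit Owned(T v) : value(std::move(v)) {}
};

// Hypothetical cleanup stack: owns objects until recover() is called.
class CleanupStack {
  std::vector<std::unique_ptr<CleanupBase>> items_;

 public:
  // Stores obj on the stack and returns a non-owning pointer to it.
  template <typename T>
  T* keep(T obj) {
    auto holder = std::make_unique<Owned<T>>(std::move(obj));
    T* raw = &holder->value;
    items_.push_back(std::move(holder));
    return raw;
  }
  void recover() { items_.clear(); }  // runs the destructors
  std::size_t size() const { return items_.size(); }
};

struct Report {
  int during;   // Tracked objects alive while the callback is usable
  int after;    // Tracked objects alive after cleanup
  double grad;  // value produced by the callback
};

Report cleanup_demo() {
  const int before = Tracked::live;
  CleanupStack stack;
  Tracked* kept = stack.keep(Tracked(3.0));
  // Only a raw pointer is captured, so the lambda needs no destructor call.
  auto callback = [kept](double adj) { return adj * kept->scale; };
  static_assert(std::is_trivially_destructible<decltype(callback)>::value,
                "callback must be safe to place in an arena");
  const double grad = callback(1.0);
  const int during = Tracked::live - before;
  stack.recover();
  return Report{during, Tracked::live - before, grad};
}
```

In this sketch exactly one `Tracked` object is alive while the callback is usable, and none remain after `recover()`.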

Is there a reason you use a custom tensor type instead of something like Eigen’s tensor type? Imo it would not be too bad, and also quite nice, for us to support those (I don’t have time tho :-/). And is there a reason your custom tensor type could not be handed memory allocated via Stan’s arena allocator? I’d check out how Eigen::Map works, as that is essentially how Stan represents the matrices in var_value<Eigen::MatrixXd>.
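To make the arena suggestion concrete, here is a minimal sketch of a non-owning view over arena memory, similar in spirit to what Eigen::Map does for a raw buffer. It does not use Eigen or Stan; `ToyArena`, `ArenaView`, and `to_arena` are hypothetical names. Because the view holds only a pointer and a size, a lambda capturing it by value is trivially destructible and nothing leaks when the arena is reset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Toy arena handing out blocks of doubles; recover() just resets the offset.
class ToyArena {
  double buf_[256];
  std::size_t used_ = 0;

 public:
  double* alloc_doubles(std::size_t n) {
    assert(used_ + n <= sizeof(buf_) / sizeof(buf_[0]));
    double* p = buf_ + used_;
    used_ += n;
    return p;
  }
  void recover() { used_ = 0; }
};

// Hypothetical non-owning view: a pointer plus a length, no destructor logic.
struct ArenaView {
  double* data;
  std::size_t size;
  double& operator[](std::size_t i) const { return data[i]; }
};
static_assert(std::is_trivially_destructible<ArenaView>::value,
              "a view must not own memory");

// Copies the values into arena memory and returns a view onto the copy.
ArenaView to_arena(ToyArena& arena, const std::vector<double>& src) {
  double* p = arena.alloc_doubles(src.size());
  std::copy(src.begin(), src.end(), p);
  return ArenaView{p, src.size()};
}

double view_demo() {
  ToyArena arena;
  ArenaView w = to_arena(arena, {1.0, 2.0, 3.0});
  // Capturing the view by value copies only the pointer and the size.
  auto callback = [w](double adj) {
    double s = 0.0;
    for (std::size_t i = 0; i < w.size; ++i) s += adj * w[i];
    return s;
  };
  static_assert(std::is_trivially_destructible<decltype(callback)>::value,
                "callback is safe to place in an arena");
  const double result = callback(2.0);  // 2 * (1 + 2 + 3)
  arena.recover();                      // releases the data, nothing to destroy
  return result;
}
```

This only works when the backend can be pointed at externally provided memory, which is exactly the condition the question above is asking about.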


make_chainable_ptr will probably also work, thanks! I will try it out. I guess it will be pretty similar to a

class make_reverse_callback_alloc_impl : public vari_base, public chainable_alloc {

}

It’s a custom class because it is not for dense tensors but for tensors that are sparse due to symmetries. It can use a custom backend for the dense part of the tensors, and when it is used with Eigen, it can be allocated in the memory arena. But there are other backends which call malloc directly and do not provide a Map class because the data is distributed.


Inheriting from multiple classes is pretty nasty imo, I wouldn’t do that. Tbh no idea what happens when one inherited type has a deleter and the other does not.

Yes, one has to be a little careful, but if I got it right, inheriting from two classes is the way types with dynamic data are handled in the reverse mode framework. vari_value<Eigen::SparseMatrix> does it, and I implemented it the same way for my class.
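For what it’s worth, the two-base pattern can be sketched in plain C++ to see what happens when only one base has a virtual destructor. Everything here is a hypothetical toy (`ToyTape`, `ToyVariBase`, `ToyAllocBase`, `CallbackAlloc`, `toy_reverse_pass_callback_alloc`), not Stan’s real classes, and the object is heap-allocated for simplicity. Deleting through the base with the virtual destructor runs the most-derived destructor, so the captured object is destroyed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type with a nontrivial destructor.
struct Tracked {
  static int live;  // number of instances currently alive
  double scale;
  explicit Tracked(double s) : scale(s) { ++live; }
  Tracked(const Tracked& o) : scale(o.scale) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

struct ToyVariBase {  // "chainable" interface, never deleted through this base
  virtual void chain() = 0;
 protected:
  ~ToyVariBase() = default;
};

struct ToyAllocBase;

struct ToyTape {
  std::vector<ToyVariBase*> chain_stack;   // non-owning, used for the reverse pass
  std::vector<ToyAllocBase*> alloc_stack;  // owning, deleted in recover()
  void reverse_pass() {
    for (auto it = chain_stack.rbegin(); it != chain_stack.rend(); ++it)
      (*it)->chain();
  }
  inline void recover();
};

struct ToyAllocBase {  // registers itself so its destructor runs at cleanup
  explicit ToyAllocBase(ToyTape& tape) { tape.alloc_stack.push_back(this); }
  virtual ~ToyAllocBase() = default;
};

inline void ToyTape::recover() {
  for (ToyAllocBase* p : alloc_stack) delete p;  // virtual dtor: full object
  alloc_stack.clear();
  chain_stack.clear();
}

template <typename F>
class CallbackAlloc final : public ToyVariBase, public ToyAllocBase {
  F f_;

 public:
  CallbackAlloc(ToyTape& tape, F f) : ToyAllocBase(tape), f_(std::move(f)) {
    tape.chain_stack.push_back(this);
  }
  void chain() override { f_(); }
};

template <typename F>
void toy_reverse_pass_callback_alloc(ToyTape& tape, F f) {
  new CallbackAlloc<F>(tape, std::move(f));  // ownership held by alloc_stack
}

struct Result {
  double adjoint;
  int live_before_recover;
  int live_after_recover;
};

Result inheritance_demo() {
  const int before = Tracked::live;
  ToyTape tape;
  double adjoint = 0.0;
  {
    Tracked t(2.0);
    toy_reverse_pass_callback_alloc(tape, [t, &adjoint]() { adjoint += t.scale; });
  }
  tape.reverse_pass();
  const int live1 = Tracked::live - before;
  tape.recover();
  return Result{adjoint, live1, Tracked::live - before};
}
```

In this toy the captured copy survives until `recover()` and is destroyed there, which is the behaviour the proposed reverse_pass_callback_alloc is after. Whether Stan’s own stacks behave identically is something to verify against the library source.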