Grainsize for hierarchical model

sonicking · April 21, 2022, 6:18pm

Hello, I am using reduce_sum and within-chain parallelization for my hierarchical model. I have a few questions.

The guide suggests:

For instance, in a model with N = 10000 and M = 4, start with grainsize = 2500, and sequentially try grainsize = 1250, grainsize = 625, etc.

My data is in the long format with I*T rows. Do I treat I*T as N?
Now, suppose I am slicing the data over people, I, in my notation. Do I still use I*T as N or should I use I?

EDIT: I have a follow-up question.

I ran my model without parallelization and it converged. I then ran it with parallelization and grainsize = 1, and the results were the same. Then I ran it again with a different grainsize. Surprisingly, there were quite a few more divergent transitions, although the magnitudes of the coefficients stayed the same. Can grainsize affect that? I use “start” and “end” quite extensively for indexing in my code.

Thanks

Bob_Carpenter · April 22, 2022, 6:19pm

Yes, treat I*T as N, in the sense that that’s the amount of data you have. Usually it’s much more efficient to put data in wide form for Stan because then it’s easier to vectorize by I or T.

You want to break the data up so as to keep each person’s rows together. I’m afraid I’m not sure how reduce_sum deals with arrays, but presumably it doesn’t matter what the elements are, so you’d use I.
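As a rough sketch of what that halving schedule looks like once N is taken to be the number of sliced elements (people, I, rather than rows, I*T): the helper below starts near N / M and halves down to 1. The function name `grainsize_schedule` and the figure of 400 people are made up for illustration; the candidates it lists are only starting points to time against each other, not a recommendation.

```python
def grainsize_schedule(n_slices, n_threads):
    """Candidate grainsizes: start near n_slices / n_threads, then keep halving.

    n_slices is the length of the argument being sliced (for example the
    number of people I when slicing over people), not the number of rows.
    """
    grainsize = max(n_slices // n_threads, 1)
    schedule = [grainsize]
    while grainsize > 1:
        grainsize = max(grainsize // 2, 1)
        schedule.append(grainsize)
    return schedule


# The guide's example: N = 10000 sliced elements, M = 4 threads.
guide_schedule = grainsize_schedule(10000, 4)

# Hypothetical hierarchical model sliced over I = 400 people, 4 threads.
people_schedule = grainsize_schedule(400, 4)
```

Each candidate would then be tried in turn, keeping the seed fixed, to see which gives the shortest run time.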

Grainsize shouldn’t affect the results if you start with the same random seed. The trajectories should be the same (up to small differences because floating point arithmetic isn’t associative). If you have an example where this happens with the same seed, it’d be great if you could share; it may be a bug somewhere.
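The floating point caveat can be illustrated outside Stan. The sketch below is plain Python, not how reduce_sum is implemented: it adds the same made-up terms in chunks of different sizes, the way partial sums would be combined, and compares the totals. The names `plain_sum`, `chunked_sum`, and `terms` are hypothetical.

```python
def plain_sum(values):
    """Left-to-right floating point sum using ordinary addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, grainsize):
    """Sum each chunk of `grainsize` values, then add up the partial sums."""
    partials = [
        plain_sum(values[i:i + grainsize])
        for i in range(0, len(values), grainsize)
    ]
    return plain_sum(partials)


# Regrouping the same three numbers changes the last bits of the result.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)

# Made-up stand-ins for per-observation log-density terms.
terms = [0.1 * k for k in range(1, 1001)]
total_g1 = chunked_sum(terms, 1)
total_g7 = chunked_sum(terms, 7)
relative_gap = abs(total_g1 - total_g7) / abs(total_g1)
```

Any gap between the two totals is at the level of rounding error, which is why a change in grainsize alone would not be expected to produce noticeably more divergences.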

sonicking · April 22, 2022, 9:04pm

Hello. Thanks for replying. I am afraid I cannot share the exact data or code because it is something I am developing for work. But I will look into it further to see what, if anything, is wrong.

This is the second time I have tried to use reduce_sum. The first time, everything was in long format and it successfully reduced the computation time.

I read that the reduction could be even greater if I slice over people, not just rows. But it is quite tedious to get the indexes right with long-format data. I am considering recoding everything to use wide-format data.
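One way to tame the long-format bookkeeping is to precompute, for each person, the first and last row they occupy, and pass those two arrays in as data. A minimal sketch, assuming the rows are sorted so that each person's rows are contiguous; the names `person_row_ranges` and `rows_for_slice` and the toy ids are invented for illustration. Indexes are 1-based to match Stan.

```python
def person_row_ranges(person_ids):
    """Return 1-based inclusive (starts, ends) row indexes for each person.

    Assumes person_ids is sorted so each person's rows are contiguous.
    """
    starts, ends = [], []
    for row, pid in enumerate(person_ids, start=1):
        if row == 1 or pid != person_ids[row - 2]:
            if row > 1:
                ends.append(row - 1)  # close the previous person's block
            starts.append(row)        # open this person's block
    if person_ids:
        ends.append(len(person_ids))  # close the last person's block
    return starts, ends


def rows_for_slice(starts, ends, start, end):
    """Rows covered by a slice of people start..end (1-based, inclusive)."""
    return starts[start - 1], ends[end - 1]


# Toy long-format id column: person 1 has 3 rows, person 2 has 2, person 3 has 4.
ids = [1, 1, 1, 2, 2, 3, 3, 3, 3]
starts, ends = person_row_ranges(ids)
first_row, last_row = rows_for_slice(starts, ends, 2, 3)
```

With these arrays, a partial sum that receives people `start` to `end` only needs one lookup to find the block of rows it owns, so the data can stay in long format.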

But if I put the data in the wide format, wouldn’t that imply I need to loop over I or T? I have always followed this example and put the data in long format. Can you please provide more insight?
