Choosing the right AWS instance

scijens · November 19, 2020, 8:45pm

Hi everyone,
I am generally interested in a procedure to choose the right instance to run Stan models on Amazon AWS. I am using Louis Aslett’s RStudio AMI and want to run RStan on RStudio Server.

My model is quite complex (a Hidden Markov Model where the transition probabilities are driven by time-varying covariates). I ran it on my computer (2 cores at 2.00 GHz each) using 1/15 of my final dataset. It consumes ~1 GB of RAM and ~50% of CPU. I am wondering which assumptions I can make to choose the right AWS instance out of all the instances listed here. And how would the computational effort increase if I want to run a 3-state model instead of a 2-state model?


Max_Mantei · December 3, 2020, 10:18pm

Hey Jens!

I think the important thing is whether you already have threading in your model or not. In terms of chains, each one needs a sufficiently long warmup plus some post-warmup draws, so spamming a lot of chains is not the right way to go. It also matters how accurate the results should be, i.e. are you interested in tail quantities etc.? So maybe you can figure out how many chains make sense for your model in this way. Then multiply that by the number of threads that make sense for your model and you get a (really rough) rule of thumb for the number of cores you want. I’d recommend the compute (CPU) optimized machines.
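As a minimal sketch of that rule of thumb (cores ≈ chains × threads per chain), the arithmetic could look like the Python below. The function names and the list of vCPU counts are hypothetical and only for illustration; they are not taken from any AWS or Stan API, and the real instance sizes should be checked against the AWS instance list.

```python
from typing import Optional, Sequence


def estimate_cores(chains: int, threads_per_chain: int = 1) -> int:
    """Rough core count: one core per thread, for every chain run in parallel."""
    if chains < 1 or threads_per_chain < 1:
        raise ValueError("chains and threads_per_chain must both be >= 1")
    return chains * threads_per_chain


def smallest_fitting_vcpus(cores_needed: int,
                           vcpu_options: Sequence[int]) -> Optional[int]:
    """Return the smallest vCPU count that covers cores_needed, or None."""
    candidates = [v for v in sorted(vcpu_options) if v >= cores_needed]
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Illustrative numbers only: 4 chains, 2 threads each.
    needed = estimate_cores(chains=4, threads_per_chain=2)
    # Hypothetical vCPU sizes; replace with the sizes of the instance family you consider.
    print(needed, smallest_fitting_vcpus(needed, [2, 4, 8, 16, 36]))
```

This only covers the CPU side. Memory has to be checked separately, since RAM use on the full dataset may be well above what a 1/15 subset needs.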

Also, you might want to check out this thread if you haven’t already:

Mitzi mentions the folk theorem there, and in my experience speeding up a model is often impossible if the model is incorrect… Just double-check everything and look out for efficiency gains in the code (and then add threading).

I don’t know enough about HMMs to make any statement regarding the compute time of 3-state vs. 2-state models, sorry.

On a final note: If you are looking for speed you should definitely check out CmdStanR! :)

Hope this was at least a bit helpful.

Cheers,
Max

scijens · December 7, 2020, 2:37pm

Hi Max,

Thanks very much, this is definitely helpful! With cmdstan I achieved a significant increase in speed and I am running a t3.large machine - thanks!

Cheers,
Jens
