I’m planning to put a Stan-based model into production (run on a schedule, serve some transformation of outputs to a website) on AWS, as close to “serverless” as possible. I was wondering what the community thinks is the best way to do this.
I’ve been customizing an Amazon Linux 2 AMI to run cmdstanpy. The production process is going to look like:
1. Start EC2 instance
2. Trigger cmdstanpy code
3. A cron job on the instance copies the Stan output to S3 using the AWS CLI
4. The instance shuts down on successful completion of (2), or on timeout
5. Downstream processes reference the Stan output on S3
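The per-instance job above (steps 2–3) might look roughly like this in Python. This is a sketch only: the bucket name, model file, data layout, and script name are all placeholders I made up, and I'm using boto3 rather than shelling out to the AWS CLI for the copy:

```python
# run_model.py -- hypothetical sketch of the on-instance batch job.
# Assumes the AMI already has CmdStan installed and the model file in place.
from pathlib import Path

BUCKET = "my-stan-outputs"  # placeholder bucket name


def s3_key(dataset: str, filename: str) -> str:
    """Build the S3 key for one output file of one dataset's run."""
    return f"irt-runs/{dataset}/{filename}"


def main(dataset: str) -> None:
    from cmdstanpy import CmdStanModel
    import boto3

    model = CmdStanModel(stan_file="irt_model.stan")     # placeholder model file
    fit = model.sample(data=f"data/{dataset}.json")      # placeholder data layout
    s3 = boto3.client("s3")
    for csv in fit.runset.csv_files:                     # upload each chain's CSV
        s3.upload_file(csv, BUCKET, s3_key(dataset, Path(csv).name))

# Usage on the instance (e.g. from cron or user data):
#   python3 -c "import run_model; run_model.main('course_algebra')"
```

The shutdown-on-completion / timeout logic (step 4) stays outside this script, so a crashed sampler still gets cleaned up.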
I’ll be running the same code against multiple independent datasets (these are IRT models on a variety of courses), so I’m hoping to scale this up by running the same image in multiple EC2 instances.
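One way to fan the same image out across datasets is a small launcher that starts one instance per dataset and passes the dataset name through EC2 user data. Another hedged sketch: the AMI ID, instance type, and on-instance script path are placeholders, not working values:

```python
# launch_runs.py -- hypothetical fan-out launcher: one EC2 instance per dataset.


def user_data(dataset: str) -> str:
    """Boot script for one instance; self-halts when the run finishes (placeholder paths)."""
    return (
        "#!/bin/bash\n"
        f"python3 /opt/stan/run_model.py {dataset}\n"
        "shutdown -h now\n"
    )


def launch_all(datasets, ami_id="ami-0123456789abcdef0", instance_type="c6g.2xlarge"):
    import boto3

    ec2 = boto3.client("ec2")
    for ds in datasets:
        ec2.run_instances(
            ImageId=ami_id,
            InstanceType=instance_type,
            MinCount=1,
            MaxCount=1,
            UserData=user_data(ds),
            # so "shutdown -h now" terminates rather than just stops the instance
            InstanceInitiatedShutdownBehavior="terminate",
        )

# Usage from a scheduler (e.g. a Lambda on an EventBridge cron):
#   launch_all(["course_algebra", "course_stats"])
```

Setting `InstanceInitiatedShutdownBehavior="terminate"` means the instances clean themselves up, which keeps the whole thing close to "serverless" without any orchestration layer.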
Things I’ve learned so far:
- httpstan is fiddly, and the zipped output it writes to its .cache directory needs custom processing (it's not valid JSON)
- pystan is tightly integrated with a local httpstan server and would need adjustment to run against a remote one. You can't just forward ports to localhost and be on your way.
- “Amazon Linux 2” on ARM means compiling your own recent Python
- I need a fairly beefy instance type to get performance comparable to my local (M1 Pro) machine. I’ve been using an a1.xlarge with ARM processors for testing and it’s not cutting it.
Things I believe to be true:
- AMIs can't be converted to Docker images very easily, but the Amazon Linux AMI specifically has a pathway
Questions:
- Is this, roughly, the pattern others have been using? Did you start from Docker images instead?
- SageMaker and its associated bells and whistles (Model Registry, etc.) look like a poor fit for an entirely batch workflow that needs a custom image. Has anybody tried running Stan inside SageMaker recently?
- Which instance types and architectures work best for running Stan code?