Question about Stan memory use during g++ compilation

Hi all,
I’m using Stan in my Introduction to Stats course, and my students run it through CoCalc. The issue I’m running into is low memory during compilation. I’ve bought 1 GB of memory for every student, so each student has 2 GB total (1 GB free, 1 GB purchased for the course). During model compilation, the g++ and clang compilers spike memory use to somewhere between 2 and 3 GB, so my students are having trouble getting their assignments done.

Example code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pystan as pyst

# this is that random code from last time we have to run to make it work.
import os
#os.environ["CC"] = "clang++"
#os.environ["CXX"] = "clang++"
#os.environ['CXXFLAGS'] = "--param ggc-min-expand=0 --param ggc-min-heapsize=524288"

t_test_mod = """
data{
    int<lower=1> N1;
    int<lower=1> N2;
    vector[N1] y1;
    vector[N2] y2;
}
parameters{
    real mu1;
    real mu2;
    real<lower=0> sd_y;
}
model{
    y1 ~ student_t(N1-1, mu1, sd_y);
    y2 ~ student_t(N2-1, mu2, sd_y);
}
generated quantities{
    real d;
    d = mu1 - mu2;
}
"""
ttest_comp = pyst.StanModel(model_code=t_test_mod)

The CoCalc devs and I have run some tests and discovered that, for this example at least, 3 GB appears to be needed to avoid memory errors. Does anyone have suggestions as to how much memory I should allocate to my students? Is 4 GB enough even for somewhat more intensive hierarchical models? Does model complexity even affect g++ memory use? I’d appreciate some guidance here, so I can get the memory purchased ASAP for my students.
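
For anyone who wants to reproduce a measurement like this, here is a minimal sketch of one way to do it on Linux (not necessarily how the numbers above were obtained). It runs a command in a subprocess and reads the peak resident memory of the largest child process from the standard library's resource module. The function name peak_child_memory_mb and the script name compile_model.py are made up for illustration; on Linux ru_maxrss is reported in kilobytes, which the conversion below assumes.

```python
import resource
import subprocess
import sys


def peak_child_memory_mb(cmd):
    """Run cmd to completion and return the peak resident set size, in MB,
    of the largest child process waited for so far.

    Assumes Linux, where ru_maxrss is in kilobytes. The value is the maximum
    over individual descendant processes (for example the compiler itself),
    not the sum across processes.
    """
    subprocess.run(cmd, check=True)
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return peak_kb / 1024.0


# Demo with a trivial child process. To measure a model build, swap in
# something like [sys.executable, "compile_model.py"], where compile_model.py
# is a hypothetical script containing only the StanModel(...) call.
demo_mb = peak_child_memory_mb([sys.executable, "-c", "pass"])
print(f"peak child RSS: {demo_mb:.0f} MB")
```

Running the compile in its own subprocess keeps the notebook's memory out of the figure, so the number reflects the compiler rather than the Python session.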

Thanks!
-Nate

I think most models can be compiled using clang++ within 2GB of RAM. You may have to use the -flto option and set up the linker accordingly.

Just a warning that PyStan (actually distutils) probably only uses the CC environment variable, so setting CXX may have no effect. StanModel also accepts extra_compile_args, and extra_link_args may be added in a future release (maybe 2.21).

So since this is all new to me, can you all help me with the appropriate code?

Would it be:

os.environ['CC'] = "clang++"

and then something like:

pyst.StanModel(model_code=t_test_mod, extra_compile_args=["-flto"])

Thanks! I’ve never had to modify the compilers before, so I’m out of my depth here.

That should work. If it doesn't, set CC to clang instead of clang++ (distutils selects the compiler automatically).
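
Putting the suggestions in this thread together, a minimal sketch might look like the following. The helper name stan_compile_settings is made up for illustration. It assumes PyStan 2, where extra_compile_args is a list of strings; how that list combines with PyStan's default flags may differ between versions, and -flto may also need linker setup as mentioned above.

```python
import os


def stan_compile_settings(compiler="clang++", extra_flags=("-flto",)):
    """Set CC so distutils picks up `compiler`, and return keyword
    arguments intended for pystan.StanModel (hypothetical helper)."""
    os.environ["CC"] = compiler
    # extra_compile_args is passed as a list of separate flag strings.
    return {"extra_compile_args": list(extra_flags)}


kwargs = stan_compile_settings()

# With PyStan 2 installed and t_test_mod defined as in the first post:
#   import pystan as pyst
#   ttest_comp = pyst.StanModel(model_code=t_test_mod, **kwargs)
```

If clang++ gives trouble, calling stan_compile_settings(compiler="clang") is the equivalent of the fallback suggested above.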