Hi all,
I’m using Stan in my Introduction to Stats course, and my students run it through CoCalc. The issue I’m running into is low memory during compilation. I’ve purchased 1 GB of memory for each student, so each has 2 GB total (1 GB free, 1 GB purchased for the course). During model compilation, the g++ and clang compilers spike memory use to somewhere between 2 and 3 GB, so my students are having trouble getting their assignments done.
Example code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pystan as pyst
# compiler workaround from the earlier thread (currently commented out):
import os
#os.environ["CC"] = "clang++"
#os.environ["CXX"] = "clang++"
#os.environ['CXXFLAGS'] = "--param ggc-min-expand=0 --param ggc-min-heapsize=524288"
t_test_mod = """
data {
  int<lower=1> N1;
  int<lower=1> N2;
  vector[N1] y1;
  vector[N2] y2;
}
parameters {
  real mu1;
  real mu2;
  real<lower=0> sd_y;
}
model {
  y1 ~ student_t(N1 - 1, mu1, sd_y);
  y2 ~ student_t(N2 - 1, mu2, sd_y);
}
generated quantities {
  real d;
  d = mu1 - mu2;
}
"""
ttest_comp = pyst.StanModel(model_code=t_test_mod)
We’ve run some tests (the CoCalc devs and I) and discovered that, for this example at least, about 3 GB appears to be needed to avoid memory errors. Does anyone have suggestions for how much memory I should allocate per student? Is 4 GB enough even for somewhat more intensive hierarchical models (does model complexity even affect g++ memory use)? I’d appreciate some guidance so I can purchase the memory for my students ASAP.
Thanks!
-Nate