Hey all, I’ve been messing with the coronavirus data, using Stan to fit sigmoid functions to the pandemic spread. Simple sigmoid curves fit the data well, as seen here: https://www.kaggle.com/peopletrees/coronavirus-stan-sigmoid-curves

I coded the sigmoid model in Stan as follows (good fit, no divergent transitions):

```
data {
int<lower=1> N; // number of rows
vector<lower=0>[N] y; // outcome
vector<lower=0>[N] t; // time in days
}
parameters {
real<lower=0> alpha;
real<lower=0> beta;
real<lower=0> M;
real<lower=0> sigma;
}
model {
vector[N] mu;
mu = M ./ (1 + exp(-beta * (t - alpha)));
alpha ~ normal(100, 100); // number of days at which expected # of counts is halfway the maximum
beta ~ normal(0.5, 5); // growth parameter
M ~ normal(1000, 10000); // maximal number of cases
y ~ normal(mu, sigma);
}
```

However, a multilevel model seems appropriate because so many countries have so few cases at this point (thankfully). To that end, I tried rewriting this as a multilevel model as follows:

```
data {
int<lower=1> N; // number of rows
int<lower=1> J; // number of locations
int<lower=1,upper=J> location[N]; // the locations themselves
vector<lower=0>[N] y; // outcome
vector<lower=0>[N] t; // time in days
}
parameters {
vector<lower=0>[J] alpha;
vector<lower=0>[J] beta;
vector<lower=0>[J] M;
real<lower=0> M_mu; // average maximal number of cases
real<lower=0> M_sigma;
real<lower=0> alpha_mu; // average number of days at which expected # of counts is halfway the maximum
real<lower=0> alpha_sigma;
real<lower=0> beta_mu; // average growth parameter value
real<lower=0> beta_sigma;
real<lower=0> sigma;
}
model {
vector[N] mu;
for ( i in 1:N ) {
mu = M[location[i]] ./ (1 + exp(-beta[location[i]] * (t - alpha[location[i]])));
}
M ~ normal(M_mu, M_sigma);
alpha ~ normal(alpha_mu, alpha_sigma); // number of days at which expected # of counts is halfway the maximum
beta ~ normal(beta_mu, beta_sigma); // growth parameter
// hyperparameters
M_mu ~ normal(1000, 10000);
alpha_mu ~ normal(100, 100);
beta_mu ~ normal(0.5, 5);
y ~ normal(mu, sigma);
}
```

However, this model crashes during sampling as it uses up >16GB of ram. That surprises me since there are only 17k rows in this dataset. Any ideas for troubleshooting this?