I’ve done a very mild generalisation of @Bob_Carpenter’s Lotka Volterra model
in order to demonstrate the power of Stan to my colleagues. There are now several colonies of hares and lynxes with a common hyper-parameter.
This is taking a lot longer to run - the first chain has taken 460 seconds to run - in the original model this was about 30 seconds. I.e. it is about 10 times slower.
-
Is this to be expected?
-
Is there anything I can do to speed things up?
Our real models have e.g. 200 variables and 50 parameters. I am worried that estimation for such models will take even longer.
Are there any case studies that I might have missed? I know of the Friberg-Karlsson model for neutrophil production https://www.metrumrg.com/wp-content/uploads/2017/10/Gillespie_MixedSolver_ACOP2017.pdf but they seem to use a modified version of Stan called Torsten.
The data for the model below is the same as the original data i.e. there is only one colony. I’ve attached the data after the code.
functions {
real[] dz_dt(real t, // time
real[] z, // system state {prey, predator}
real[] theta, // parameters
real[] x_r, // unused data
int[] x_i) {
real u = z[1];
real v = z[2];
real alpha = theta[1];
real beta = theta[2];
real gamma = theta[3];
real delta = theta[4];
real du_dt = (alpha - beta * v) * u;
real dv_dt = (-gamma + delta * u) * v;
return { du_dt, dv_dt };
}
}
data {
int<lower = 0> N; // number of measurement times
int<lower = 0> M; // number of colonies
real ts[N]; // measurement times > 0
real y_init[2]; // initial measured populations
real<lower = 0> p[N, 2, M]; // measured populations
}
parameters {
real<lower = 0> mu[4]; // mean for hares and lynxes
real<lower = 0> phi[4, M]; // { alpha, beta, gamma, delta }
real<lower = 0> z_init[2]; // initial population
real<lower = 0> sigma[2]; // measurement errors
}
transformed parameters {
real z[N, 2];
real w[N, 2, M];
real theta[4];
for (j in 1:M) {
for (l in 1:4)
theta[l] = phi[l,j];
z = integrate_ode_rk45(dz_dt, z_init, 0, ts, theta,
rep_array(0.0, 0), rep_array(0, 0),
1e-5, 1e-3, 5e2);
w[ , , j] = z;
}
}
model {
phi[1,] ~ normal(mu[1], 0.5);
phi[2,] ~ normal(mu[2], 0.05);
phi[3,] ~ normal(mu[3], 0.5);
phi[4,] ~ normal(mu[4], 0.05);
sigma ~ lognormal(-1, 1);
z_init ~ lognormal(log(10), 1);
for (k in 1:2) {
y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
for (j in 1:M)
p[ , k, j] ~ lognormal(log(w[, k, j]), sigma[k]);
}
}
generated quantities {
real y_init_rep[2];
real p_rep[N, 2, M];
for (k in 1:2) {
y_init_rep[k] = lognormal_rng(log(z_init[k]), sigma[k]);
for (n in 1:N) {
for (j in 1:M)
p_rep[n, k, j] = lognormal_rng(log(w[n, k, j]), sigma[k]);
}
}
}
The data
# Data from http://www.math.tamu.edu/~phoward/m442/modbasics.pdf
# Downloaded 15 October 2017, 4:59 PM EDT
Year, Lynx, Hare
1900, 4.0, 30.0
1901, 6.1, 47.2
1902, 9.8, 70.2
1903, 35.2, 77.4
1904, 59.4, 36.3
1905, 41.7, 20.6
1906, 19.0, 18.1
1907, 13.0, 21.4
1908, 8.3, 22.0
1909, 9.1, 25.4
1910, 7.4, 27.1
1911, 8.0, 40.3
1912, 12.3, 57.0
1913, 19.5, 76.6
1914, 45.7, 52.3
1915, 51.1, 19.5
1916, 29.7, 11.2
1917, 15.8, 7.6
1918, 9.7, 14.6
1919, 10.1, 16.2
1920, 8.6, 24.7