Please share your Stan program and accompanying data if possible.
Hi,
I have data from biological microcosms (3 replicates of population growth of unicellular organisms). They follow some sort of boom-bust dynamics, where they grow exponentially, and then instead of saturating (like a logistic model that is usually used in my field), they decay again. I want to model these dynamics with the following system of ordinary differential equations (ODE):
dN/dt = (r0_N - \\alpha_NN - d_NEnv)N
dEnv/dt = (r0\_{Env} - \\alpha\_{Env}Env)Env
where N is the population of interest, r0_N its intrinsic growth rate, \\alpha_N is the population’s self-regulation, d_N is the detrimental effect of an Environment Env, that itself grows logistically with similar parameters.
I do not observe data for the environment. What I want to do therefore is to model it as some sort of latent variable. To reduce dimensionality, I set the equilibrium density of Env r0\_{Env}/\\alpha\_{Env} = 1 and I only fit as free parameters the growth rate and the starting value of Env.
The problem that I face with that is that the value of Env obviously changes with time and therefore constitutes some sort of dynamical parameter for my ODE solver in stan. I have attempted to solve this problem with the following code:
functions{
// system of differential equations
vector odemodel(real t, vector N, vector p){
// p[1] = r0Tet, p[2] = alphaTet, p[3] = dTet, p[4] = r0Env, p[5]=t[1], p[6] = EnvMonoInit;
real Env;
Env = 1/(1+(1-p[6])/p[6] * exp(-p[4]*(t-p[5]))); // analytical solution for Env that takes the current time t as input
vector[1] dNdt;
dNdt[1] = (p[1] - p[2]*N[1] - p[3]*Env)*N[1]; // numerical solution for N
return dNdt;
}
}
data{
int n; // time dimension
int mt; // replicate dimension culture
array[n] real t; // time stamps
array[mt,n] real TetMono; // replicate time series data
int xTetMono; // number of NAs in the data
array[xTetMono,2] int missIndTetMono; // indeces of the NAs in the data
}
parameters{
real<lower=0> r0Tet;
real<lower=0> alphaTet;
real<lower=0> dTet;
real<lower=0> r0Env;
vector<lower=0>[mt] TetMonoInit;
vector<lower=0>[xTetMono] TetMonoMiss;
vector<lower=0>[mt] EnvMonoInit;
vector<lower=0>[1] sigma;
}
transformed parameters {
// parameters for ODE solver
vector[6] p;
p[1] = r0Tet;
p[2] = alphaTet;
p[3] = dTet;
p[4] = r0Env;
p[5] = t[1];
// ODE solutions
array[n-1] vector[2] Nsim;
for(i in 1:mt){
// integrate the ODE
p[6] = EnvMonoInit[i];
Nsim = ode_rk45(odemodel,[TetMonoInit[i]]',t[1],t[2:n],p);
}
// imputing NAs
array[mt,n] real TetMonoImputed = TetMono;
for(i in 1:xTetMono){
TetMonoImputed[missIndTetMono[i,1],missIndTetMono[i,2]] = TetMonoMiss[i];
}
}
model{
// priors
r0Tet ~ lognormal(log(0.1),1);
alphaTet ~ lognormal(log(0.00001),1);
dTet ~ lognormal(log(0.01),1);
r0Env ~ lognormal(log(0.01),1);
TetMonoInit ~ lognormal(log(100),1);
TetMonoMiss ~ lognormal(log(100),1);
EnvMonoInit ~ lognormal(log(0.1),1);
sigma ~ gamma(2,0.1);
// Tetrahymena monoculture likelihood evaluation
for(i in 1:mt){
TetMonoImputed[i,1] ~ normal(TetMonoInit[i],sigma[1]);
for(j in 2:n){
TetMonoImputed[i,j] ~ normal(Nsim[j-1,1],sigma[1]);
}
}
}
generated quantities {
// pointwise log-likelihood to use for LOOCV and WAIC
vector[n*mt] log_lik; // dimension: time points x replicates
int k = 0; // initialize running variable (required for vector)
for(i in 1:mt){
for(j in 1:n){
k = k+1; // two species
if(j == 1){
log_lik[k] = normal_lpdf(TetMonoImputed[i,j] | TetMonoInit[i], sigma[1]);
} else{
log_lik[k] = normal_lpdf(TetMonoImputed[i,j] | Nsim[j-1,1], sigma[1]);
}
}
}
}
My input data looks as follows:
> dataStan
$n
[1] 14
$mt
[1] 3
$t
[1] 1.5 7.0 21.5 31.0 45.5 55.0 69.5 79.0 103.0 165.5 219.5 269.0 332.5 387.5
$TetMono
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,] 46.77989 153.08508 536.4281 4023.490 9361.860 13908.05 11640.77 12211.79 16765.83 8998.686 4688.913 4756.142 3205.403 1912.0929
[2,] 79.69388 62.88673 357.5720 2087.447 6511.928 11045.94 17051.13 17528.59 18832.27 8365.056 2128.625 3454.289 3117.165 850.7217
[3,] 47.48019 1.00000 402.9513 1580.852 4827.993 10110.34 16180.38 16636.97 18639.68 8424.862 1563.625 3867.044 3098.957 1040.6424
$xTetMono
[1] 1
$missIndTetMono
[,1] [,2]
[1,] 3 2
When I compile and run this model, it compiles without error, and it runs perfectly (and quickly) throughout the warmup phase. However, when it reaches the sampling phase, it throws the following error (across chains):
Chain 2 latent: stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/DenseCoeffsBase.h:410: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator[](Eigen::Index) [with Derived = Eigen::Matrix<double, -1, 1>; Scalar = double; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.
When I ran this model without the addition of the Env variable, it ran perfectly fine. It seems like the addition of that step in the ODE solver creates some sort of indexing issue, but I cannot figure out what it is.
I would appreciate help with this - thank you very much in advance!