Dear all,
After a hard time, I have implemented in STAN a bioenergetic model for the size and energy dynamics of an animal (a fish in our case). The model (dynamic energy budged model, or DEB) has theoretical support (Dynamic energy budget theory - Wikipedia). It essentially consists in an ODE system of four coupled differential equations with many parameters. Temperature and food are forcing variables. In addition, none of the state variables is directly observable, thus additional equations (and parameters) are needed for connecting state variables with observations (which typically are length, weight, fecundity, ā¦).
I am interested in estimating the DEB parameters at the fish level. We have several measures along the lifespan of each fish, and many fish (aquaculture data). Thus, the objective is a hierarchical model (i.e., the fish-level DEB parameters are distributed in some way).
The code is at the end.
After simulating data for ten fish (length, weight and fecundity; one simulated observation per year; 12 years lifespan; parameters values, between-fish variability and measurement error were all realistic), I can recover three theoretically relevant DEB parameters when fixing (=assuming that they are known) all the other parameters. No divergent transitions; Rhat close to 1. Accuracy and precision of the estimates for these three target parameters at the fish level are satisfactory four our purpose (excepting for the measurement error of fecundity, for which a bug may remain).
Initial values and priors can be defined based on previous results (About AmP).
The advice we would like to ask today to the forum is how to improve the computation time: 24 hours in a device without memory limitations (10 fishes, 12 observations per fish of three observable variables = 360 observations). Now I would like to start with real data and to play with the many details of the model (e.g., to decide if it is worth to measure additional variables), to include more fish (hundreds) or more observations per fish⦠but the long computational time is discouraging.
So, are there drawbacks in the code? Are there any chance to move from days to minutes or at least to a few hours (e.g., changing tolerance or max number of steps in the numerical integration function)?
We recognize that identifiability is also a concern when dealing with this large number of parameters. In our case, after adding a fourth parameter, the posteriors of two parameters became highly correlated. Fortunately, most of the parameters can be assumed to be invariant at the between-fish level and/or can be estimated after ad-hoc experiments. Thus, any comment on identifiability will be very welcome but I would like to focus first on the computation time technical problem.
I run the code with Rstan, with
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
Thank you
m
The code:
code="
functions {
real[] DEB(
// input
real t, // time
real[] z, // state variables
real[] theta, // parameters to be estimated
real[] x_r, // for passing the fixed parameters (=assumed to be known) to the numerical integrator
int[] x_i // not used but needed for the sintaxis of the numerical integrator
){
// Derivatives
real dEdt; // Reserve Energy (j)
real dVdt; // Structural length (cm)
real dHdt; // Maturation Energy (j)
real dURdt; // Reproduction Energy (j)
// state variables
real E = z[1];
real V = z[2];
real H = z[3];
real UR = z[4];
// Fixed parameters (assumed to be known)
real Kelvin = x_r[1];
real Temp_mean = x_r[2];
real Temp_amp = x_r[3];
real pi2f = x_r[4];
real Temp_phi = x_r[5];
real TA = x_r[6];
real T1 = x_r[7];
real f = x_r[8];
real EG = x_r[9];
real kJ = x_r[10];
//real kappa = x_r[11]; //Caution: the blocked parameters are those to be estimated
//real pAm = x_r[12];
real kM = x_r[13];
//real v = x_r[14];
real Hp = x_r[15];
// parameters to be estimated
real pAm = theta[1];
real v = theta[2];
real logit_kappa = theta[3];
real kappa;
// Auxiliary variables
real Temp; // Temperature
real cT; // Arrhenius temperature correction
real E_V; // energy density
real V23; // surface
real pAm_T; // model paramteters after temperature correction
real v_T;
real kM_T;
real kJ_T;
// fluxes (j/day)
real pA; // energy assimilation
real pM; // simatic maintenace
real pJ; // maturity maintenance
real pC; // mobilization
// temperature correction and auxiliary variables
Temp = Kelvin + Temp_mean + Temp_amp*sin(pi2f*t + Temp_phi); // sinusoidal function for temperature
cT = exp(TA/T1 - TA/Temp); // temperature correction (Arrhenius)
E_V = E/V;
V23 = pow(V,2.0/3.0); // surface
pAm_T=cT*pAm; // paramters after temperature correction
v_T=cT*v;
kM_T=cT*kM;
kJ_T=cT*kJ;
kappa = inv_logit(logit_kappa);
// fluxes
pA=f*pAm_T*V23; // assimilation rate
pM=kM_T*V; // somatic maintenance rate
pJ=kJ_T*H; // maturity maintenance rate
pC=E_V*((EG*v_T*V23+pM)/(kappa*E_V+EG)); // mobilization rate
// derivatives
dEdt=pA-pC;
dVdt=(kappa*pC-pM)/EG ;
dHdt=((1-kappa)*pC-pJ)*(H<Hp);
dURdt=(((1-kappa)*pC-pJ)*(H>=Hp));
return {
dEdt,
dVdt,
dHdt,
dURdt
};
}
}
data {
int<lower=0> N; // number of fish
int<lower=0> n; // number of replicated measures per fish
real ts[n]; // times at which the system is observed
real z0[4]; // initial values of the state variables (assumed to be the same for all the fish)
row_vector[n] length_obs[N]; // simulated length-at-age
row_vector[n] weight_obs[N]; // simulated weigth-at-age
row_vector[n] log_fecundity_obs[N]; // simulated (cumulated) fecundity-at-age
int n_fixed; // number of fixed parameters
real fixed[n_fixed]; // fixed parameters
}
transformed data {
real x_r[n_fixed]; // for passing the fixed parameters to the numerical integrator
int x_i[0]; // not used but needed for the sintaxis of the numerical integrator
x_r = fixed;
}
parameters {
// fish level parameters
real <lower = 50.0, upper = 250.0> pAm_i[N];
real <lower = 0.0, upper = 0.05> v_i[N];
real <lower = 2.0, upper = 5.0> logit_kappa_i[N];
// population level parameters
real <lower = 50.0, upper = 250.0> pAm;
real <lower = 0.0, upper = 0.05> v;
real <lower = 2.0, upper = 5.0> logit_kappa;
// between-fish variability
real <lower = 0.0, upper = 30.0> pAm_sd;
real <lower = 0.0, upper = 1.0> v_sd;
real <lower = 0.0, upper = 3> logit_kappa_sd;
// measurement error
real <lower = 0, upper = 3> sd_length;
real <lower = 0, upper = 10> sd_weight;
real <lower = 0, upper = 1> sd_fecundity;
}
transformed parameters {
real yhat[N,n,4];
for (i in 1:N){
yhat[i,,] = integrate_ode_rk45(DEB, z0, 0.0 , ts, {pAm_i[i],v_i[i],logit_kappa_i[i]}, x_r, x_i);
}
}
model {
// auxiliary fixed parameters
real rho = fixed[16];
real wE = fixed[17];
real muE = fixed[18];
real cw = fixed[19];
real kapR = fixed[20];
real egg_energy = fixed[21];
// priors parameters
// population level
pAm ~ normal(130,100);
v ~ normal(0.02,0.05);
logit_kappa ~ normal(3,2);
// between-fish variability
pAm_sd ~ normal(10,10);
v_sd ~ normal(0.2,0.5);
logit_kappa_sd ~ normal(0.5,1);
// fish level
pAm_i ~ normal(pAm,pAm_sd);
v_i ~ lognormal(v,v_sd);
logit_kappa_i ~ normal(logit_kappa,logit_kappa_sd);
// measurement errors
sd_length ~ normal(0.3,1);
sd_weight ~ normal(1,2);
sd_fecundity ~ normal(0.05,0.1);
// Likelihood
for (i in 1:N){
length_obs[i,] ~ normal((to_row_vector(cbrt(yhat[i,,2]))/rho),sd_length);
weight_obs[i,] ~ normal((to_row_vector(yhat[i,,2]) + to_row_vector(yhat[i,,1])*cw*wE/muE),sd_weight);
log_fecundity_obs[i,] ~ normal((log(1e-20 + to_row_vector(yhat[i,,4])*kapR/egg_energy)),sd_fecundity);
}
}
generated quantities {
}
"