Background
I am coding a von Bertalanffy model (a model that fits a growth curve relating animal age to length; see here for a simple R example). The model was previously published, and the authors used a JAGS implementation. There is one other post on von Bertalanffy models here, but that model is simpler than mine.
The model as implemented has multivariate priors. I was able to code it using example 9.13, “Multivariate Priors for Hierarchical Models,” from the Stan User's Manual (v2.17.0).
The model seems to perform well. The corresponding JAGS model took my collaborators ~4 hours to run, and this one takes ~5 to 30 minutes depending on how many iterations I'm running.
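For readers unfamiliar with the model, the growth curve itself is just L(t) = Linf * (1 - exp(-K * (t - t0))). A minimal Python sketch of that function (the parameter values are made up for illustration, not taken from any fitted model):

```python
import math

def von_bertalanffy(age, Linf, K, t0):
    """Predicted length at a given age under the von Bertalanffy growth model."""
    return Linf * (1.0 - math.exp(-K * (age - t0)))

# Illustrative values only: Linf = 500 mm, K = 0.3 / year, t0 = -0.5 years
lengths = [von_bertalanffy(a, 500.0, 0.3, -0.5) for a in range(1, 6)]
```

Predicted lengths increase monotonically with age and approach the asymptote Linf, which is why Linf is called the theoretical maximum length.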
Questions
- I could not figure out how to apply “Optimization through Vectorization” and “Optimization through Cholesky Factorization” to the model. Are these optimizations possible with my model?
- Are there any obvious errors that would be slowing down the model or other ways to optimize this code?
data {
  int<lower = 0> nFish;             // number of fish
  int<lower = 0> nSites;            // number of sites
  int<lower = 0> poolID[nFish];     // dummy variable identifying each fish's pool
  vector<lower = 0>[nFish] age;     // age of fish
  real<lower = 0> length[nFish];    // length of fish
}
parameters {
  vector<lower = 0>[nSites] Linf;   // theoretical maximum length
  vector[nSites] t0;                // hypothetical age at which fish's size = 0
  vector[nSites] K;                 // growth coefficient
  corr_matrix[3] Omega;             // prior correlation
  vector<lower = 0>[3] tau;         // prior scale
  real<lower = 0> sigmaLength;      // observation error
  row_vector[3] mu_gamma;           // group coefficients
}
model {
  vector[nFish] vonBplaceholder;
  vector[3] mu_beta[nSites];        // site coefficients on the log scale

  for (fish in 1:nFish) {
    vonBplaceholder[fish] = Linf[poolID[fish]]
      * (1.0 - exp(-K[poolID[fish]] * (age[fish] - t0[poolID[fish]])));
  }
  for (site in 1:nSites) {
    mu_beta[site, 1] = log(Linf[site]);
    mu_beta[site, 2] = log(K[site]);
    mu_beta[site, 3] = log(t0[site] + 10.0);
  }

  tau ~ cauchy(0, 2.5);
  Omega ~ lkj_corr(2);
  mu_gamma ~ normal(1, 1);
  mu_beta ~ multi_normal(mu_gamma, quad_form_diag(Omega, tau));
  length ~ normal(vonBplaceholder, sigmaLength);
}
generated quantities {
  real<lower = 0> Linf_bar;
  real t0_bar;
  real K_bar;

  Linf_bar = exp(mu_gamma[1]);
  K_bar = exp(mu_gamma[2]);
  t0_bar = exp(mu_gamma[3]) - 10.0;
}
Given that you’re actually putting the multinormal prior on mu_beta, it would make sense to declare it in the parameter block and define Linf, K, and t0 as transformations of it. This would also let you use the non-centered parameterization of the multivariate normal distribution, which could increase efficiency.
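To make the non-centered parameterization concrete, here is a hedged Python sketch of the deterministic transform it relies on (the mu, tau, L, and z values are made up for illustration): a correlated draw beta is built from an uncorrelated standard-normal draw z via beta = mu + diag(tau) * L * z, where L is the Cholesky factor of the correlation matrix Omega, so diag(tau) * L is the Cholesky factor of the covariance quad_form_diag(Omega, tau). In the model, only z then needs a normal(0, 1) prior.

```python
import math

mu  = [1.0, 2.0, 3.0]   # group-level means (illustrative values)
tau = [0.5, 0.5, 0.5]   # prior scales (illustrative values)
# Cholesky factor of a 3x3 correlation matrix with all pairwise correlations = 0.5
L = [[1.0, 0.0,                0.0],
     [0.5, math.sqrt(0.75),    0.0],
     [0.5, math.sqrt(1 / 12),  math.sqrt(2 / 3)]]
z = [1.0, -1.0, 0.5]    # one standard-normal draw, fixed for illustration

# Non-centered transform: beta = mu + diag(tau) * L * z
beta = [mu[i] + tau[i] * sum(L[i][j] * z[j] for j in range(3)) for i in range(3)]
```

The sampler only ever sees the well-behaved z, which is what typically improves efficiency for hierarchical models.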
Also, the Cauchy prior on tau and the implicit uniform prior on sigmaLength both have very heavy tails. You would probably be better off with something like an exponential or a half-Student-t.
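The difference in tail weight is easy to quantify. A quick Python check using the closed-form tail probabilities (the scale 2.5 matches the original code; this is illustrative arithmetic, not part of the model):

```python
import math

def half_cauchy_tail(x, scale):
    # P(X > x) for a half-Cauchy(0, scale) variable
    return 1.0 - (2.0 / math.pi) * math.atan(x / scale)

def exponential_tail(x, mean):
    # P(X > x) for an exponential variable with the given mean
    return math.exp(-x / mean)

# Probability of a scale parameter more than 10x the prior scale:
hc = half_cauchy_tail(25.0, 2.5)   # ~0.063
ex = exponential_tail(25.0, 2.5)   # ~4.5e-05
```

So the half-Cauchy puts over a thousand times more mass on extreme values than an exponential with the same scale, which is exactly the region where the sampler tends to struggle.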
I’ve made some modifications to the code and included them below. Note that the model now takes three additional hyperparameters as data; this should make it easier to evaluate prior sensitivity without having to recompile.
data {
  int<lower = 0> nFish;                 // number of fish
  int<lower = 0> nSites;                // number of sites
  int<lower = 0> poolID[nFish];         // dummy variable identifying each fish's pool
  row_vector<lower = 0>[nFish] age;     // age of fish
  row_vector<lower = 0>[nFish] length;  // length of fish
  // Hyperparameters below:
  real hp_tau;                          // default 2.5
  real hp_sigma;                        // not sure what default is
  real hp_omega;                        // default 2
}
parameters {
  cholesky_factor_corr[3] L_Omega;      // prior correlation, Cholesky scale
  matrix[3, nSites] mu_beta_raw;        // will be transformed into mu_beta
  vector<lower = 0>[3] tau;             // prior scale
  real<lower = 0> sigmaLength;          // observation error
  vector[3] mu_gamma;                   // group coefficients
}
transformed parameters {
  // Part of the MVN non-centered parameterization:
  // correlated site-level variation, without mu_gamma.
  matrix[3, nSites] mu_beta_cor = diag_pre_multiply(tau, L_Omega) * mu_beta_raw;
  // The mean vector (mu_gamma) is added back in below.
  row_vector[nSites] Linf = exp(mu_beta_cor[1] + mu_gamma[1]);       // theoretical maximum length
  row_vector[nSites] K = exp(mu_beta_cor[2] + mu_gamma[2]);          // growth coefficient
  row_vector[nSites] t0 = exp(mu_beta_cor[3] + mu_gamma[3]) - 10.0;  // hypothetical age at which size = 0
}
model {
  row_vector[nFish] vonBplaceholder =
    Linf[poolID] .* (1.0 - exp(-K[poolID] .* (age - t0[poolID])));

  L_Omega ~ lkj_corr_cholesky(hp_omega);
  to_vector(mu_beta_raw) ~ normal(0, 1);
  tau ~ exponential(1 / hp_tau);
  sigmaLength ~ exponential(1 / hp_sigma);
  length ~ normal(vonBplaceholder, sigmaLength);
}
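Since the hyperparameters are now data, a prior-sensitivity run only requires changing the data list passed to the sampler, not recompiling. A hypothetical Python sketch of such a data dictionary (the field names match the revised data block; the fish/site values and the hp_sigma choice are made up):

```python
# Data for the revised model; hyperparameter values can be varied per run
# without recompiling the Stan program.
data = {
    "nFish": 4,
    "nSites": 2,
    "poolID": [1, 1, 2, 2],                   # pool membership for each fish
    "age": [1.0, 2.0, 1.5, 3.0],              # made-up example values
    "length": [120.0, 200.0, 150.0, 260.0],   # made-up example values
    "hp_tau": 2.5,    # exponential prior mean for tau
    "hp_sigma": 5.0,  # exponential prior mean for sigmaLength (assumed value)
    "hp_omega": 2.0,  # LKJ shape parameter
}
```

Running the model again with, say, a halved hp_tau is then just a change to this dictionary.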
@Christopher-Peterson Thank you for the code suggestions. I always wondered whether priors had to be hard-coded in.
Also, I’m doing more experimentation. Specifying priors helps the models get consistent results. However, sometimes the original model runs in half the time of the new version but doesn’t converge; in other runs, both take longer. I’m fairly sure this has to do with the priors, and I’m investigating it further.
I’ll post what I figure out.
As a follow-up, adjusting the priors to be reasonable made the optimized model behave well and run more quickly than the un-optimized model.
One strange observation is that the un-optimized model would sometimes have a chain get stuck on an unreasonable parameter combination. Given that the von Bertalanffy model is non-linear, this is not surprising.
I have posted the code to my GitHub page.
@Christopher-Peterson Is it okay if I acknowledge your contribution in my manuscript?
We thank Christopher R Peterson for his help optimizing our code on the Stan User forum \url{Von Bertalanffy model with Multivariate priors}.