Hi all,
I’m fitting a bernoulli Gaussian process model for species distribution modeling, and it’s quite slow. I was wondering if anyone had any tips to help speed it up.
Essentially, I have species occurrences (presence only) downloaded from GBIF. For each species there’s probably 200-500 presences (1’s). To generate ‘pseudo-absences’, I randomly sample 1000 background points from the map (i.e. North America) and define them all as absences (0’s). So I have somewhere between 1200-2000 points for fitting a bernoulli GPM.
I’ve followed the code in the STAN manual, and have my model as:
Please share your Stan program and accompanying data if possible.
data{
int<lower=1> N;
row_vector[2] x[N];
int<lower=0,upper=1> y[N];
}
transformed data{
real delta = 1e-9;
}
parameters{
real<lower=0> rho;
real<lower=0> alpha;
real a;
vector[N] eta ;
}
transformed parameters{
vector[N] f;
{
matrix[N,N] L_K;
matrix[N,N] K = cov_exp_quad(x, alpha, rho);
for(n in 1:N){
K[n,n] = K[n,n] + delta ;
}
L_K = cholesky_decompose(K);
f = L_K * eta;
}
}
model{
eta ~ normal(0,1);
rho ~ inv_gamma(5, 5);
alpha ~ normal(0,1);
a ~ normal(0,1);
y ~ bernoulli_logit(a+f);
}
It works fine, the area-under-curves are good, etc. But it is extremely slow:
Gradient evaluation took 0.910985 seconds
1000 transitions using 10 leapfrog steps per transition would take 9109.85 seconds.
Adjust your expectations accordingly!
I don’t remember it being this slow when I first ran the models a few years ago (I put this project aside and came back to it). I tried to speed it up by first optimizing the model, then using the optimized parameters as the starting values, but I’m not sure that helps too much. Does anyone have any advice for speeding this one up?
Thanks
FWIW, here’s the optimizer:
Initial log joint probability = -1736.59
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
Exception: cholesky_decompose: Matrix m is not positive definite (in 'unknown file name' at line 22)
19 -704.81 0.492961 41.2802 1 1 28
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
39 -526.872 0.789057 15.5561 0.5514 0.5514 52 [150/527]
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
59 -482.263 0.117319 11.0234 1 1 73
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
79 -467.499 0.0949339 17.2224 1 1 93
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
99 -463.204 0.176057 7.71779 1 1 114
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
119 -459.926 0.280257 8.86097 1 1 138
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
139 -451.633 0.368367 22.6551 1 1 160
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
159 -432.905 0.120548 18.6509 0.4404 0.4404 181
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
179 -413.768 0.179617 22.0664 0.4095 1 204
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
199 -396.866 0.150389 33.9249 0.1559 1 227
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
219 -374.362 0.675426 31.9265 1 1 248
239 -353.261 0.278901 22.4105 1 1 269 [130/527]
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
259 -334.69 0.139741 27.4685 0.7923 0.7923 290
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
279 -315.248 0.111221 19.985 1 1 311
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
299 -295.522 0.192555 32.7448 0.377 1 334
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
319 -280.72 0.0874414 10.7889 1 1 355
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
339 -271.754 0.0669797 6.88682 1 1 375
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
359 -269.35 0.061617 2.52318 1 1 396
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
379 -268.969 0.0416614 1.23482 1 1 420
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
399 -268.936 0.00309279 0.364431 1 1 442
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
419 -268.932 0.00455218 0.22627 1 1 464
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
439 -268.931 0.000199942 0.0384264 0.5265 0.5265 488
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
452 -268.931 0.000113569 0.0131722 0.3661 1 502
New update.
I moved the entire transformed parameters section into the model block, and that speed things up so much. The iterations have been cut down by an order of magnitude, and the leapfrog estimates are half the time:
Initial log joint probability = -1839.18
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
19 -698.413 0.829057 13.1064 1.274 0.1274 23
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
39 -697.594 0.00118772 0.00106796 0.7247 0.7247 50
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
Gradient evaluation took 0.436195 seconds
1000 transitions using 10 leapfrog steps per transition would take 4361.95 seconds.
Adjust your expectations accordingly!
How is it that it’s so much faster inside the model block than inside transformed parameters?