Hi Stan developers and users,
I have some issues with fitting a Gaussian process model to my data. I am defining a Gaussian process to model SCMdata variable in response to 9 input variables, dealing with 120 data points to fit my model. The linear mean and squared exponential covariance functions are defined for my GP model. You could see the data that I am using from model.data.R file attached.
I am providing initial values for parameters and very concentrated prior for my \sigma^2. However, it takes ages (at least 3 hours) to run the model. Also, I keep getting the message from Stan about leapfrog steps and adjusting expectations, that I don’t really understand.
I am attaching my R code together with Stan file and I would really appreciate any suggestions.
Since your length scale is fixed, move the GP stuff into transformed data.
So do this in transformed data:
matrix[N1, N1] Sigma;
matrix[N1, N1] L;
Sigma = cov_exp_quad(XScaled, 1.0, length_scale);
for(k in 1:N1) Sigma[k, k] = Sigma[k, k] + nugget;
L = cholesky_decompose(Sigma);
We have to change how the variance term is handled. The covariance matrix scales linearly with it so we just bring it out (I think it’s included in this function for autodiff efficiency, not math reasons, but if we can avoid autodiffing the Cholesky that’s worth more).
Instead of modeling sigma_sq, change that to just sigma, and use this for your model line:
y1 ~ multi_normal_cholesky(Mu, sigma * L);
This should be faster since you aren’t doing the Cholesky factorization every iteration.
Hope that helps!
The leapfrog message is a very rough estimate of the time your chains will take to run. In a complex model like a Gaussian process is can be a substantial underestimate of the true running time.
For context, depending on the size of the data and number of parameters three hours can be pretty state of the art in terms of performance. Always a good idea to separate out ideal performance verses reasonable performance.
@bbbales2’s suggestions will make a drastic improvement to your model, but also keep in mind that Gaussian processes with more than a few input dimensions, really any interpolation method in a space with more than a few dimensions, will typically perform poorly away from the input data. Very long HMC trajectories, and slow chains, can be a symptom of the model trying to do too much.