I won’t post the whole thing because there’s some quite lengthy functions, but here’s the core code:
data{
int<lower=1> ntrain; // Number of data points
int<lower=1> npred; // Number of predictions to make
vector[ntrain] y; // Data
int<lower=1> nx; // Number of x predictors
matrix[ntrain,nx] Xtrain;
matrix[npred,nx] Xpred;
int<lower=0,upper=1> enableRegressors; // Should separate ARD kernel be used for regressors
vector[ntrain] t; // Time periods for the training data
real<lower=0> timeLengthPrior; // Guess at the temporal autocorrelation length scale
// Outputs from fitted GP model
int nsim; // Number of simulations in the input data
real sigma[nsim]; // Observational noise
vector[nx] L_X[nsim]; // Length scale of the X variables
real A_ARD_X[nsim]; // Amplitude of the GP for the predictors
vector[nx] beta_X[nsim]; // Linear regression coefficients
real A_time[nsim]; // Amplitude of the temporal fluctuations
real l_time[nsim]; // Length scale of the temporal fluctuations
real A_RQ[nsim]; // Amplitude
real l_RQ[nsim]; // Length scale
real alpha_RQ[nsim]; // Alpha
real A_RQ_2[nsim]; // Amplitude
real l_RQ_2[nsim]; // Length scale
real alpha_RQ_2[nsim]; // Alpha
}
// ############################# Transformed data ##########################
transformed data{
real nugget = 1e-5; // Nugget to be added to covariance matrices to prevent ill conditioning
matrix[ntrain+npred, nx] X; // Combined Xfit and xpred
matrix[ntrain+npred, nx] X_rescaled; // Combined Xfit and xpred
vector[nx] Xtrain_colmin; // Means of columns of Xtrain - used for rescaling
vector[nx] Xtrain_colmax; // Means of columns of Xtrain - used for rescaling
vector[ntrain] y_standardised; // Normalise y to have mean zero and standard deviation one - will accelerate convergence
real ysd; // Standard deviation of the data
real ymean; // Mean of the data - used for mildly informative prior
// Combine Xtrain and Xpred into one matrix
for(k in 1:nx){
X[1:ntrain,k] = Xtrain[,k]; // Xfit
X[(ntrain+1):(ntrain+npred),k] = Xpred[,k]; // Xpred
}
// Calculate the scale of the X predictors
for(k in 1:nx){
Xtrain_colmin[k] = min(Xtrain[,k]);
Xtrain_colmax[k] = max(Xtrain[,k]);
}
// Rescale X to based on the Xfit min and max so that predictors lie in the range [-0.5, 0.5]
for(k in 1:nx){
for(i in 1:(ntrain+npred)){
X_rescaled[i,k] = (X[i,k]-Xtrain_colmin[k]+1e-9)/(Xtrain_colmax[k]-Xtrain_colmin[k]+1e-9)-0.5; // Need small number to avoid divide by zero errors if data all constant
}
}
ymean = mean(y);
ysd = sd(y);
// Standardise y data
for(i in 1:ntrain){
y_standardised[i] = (y[i] - ymean)/ysd;
}
print("...Done with transformed data...")
}
model{
}
generated quantities{
matrix[nsim, npred] ypred; // Predictions
print("Beginning generated quantities...")
{
for(p in 1:nsim){
{
vector[ntrain+npred] meanfunc; // Include the predictions for simplicity now
real sigma_sq_use; // Square of observational noise
matrix[ntrain, ntrain] K; // Covariance matrix for the training data
matrix[ntrain, npred] K_fitpred; // Covariance between the training data and preidctions
vector[npred] ypred_standardised;
print("Prediction iteration ",p," of ",nsim)
// Calculate the mean function
sigma_sq_use = square(sigma[p]); // Squared observational noise
meanfunc = X_rescaled*beta_X[p];
// Calculate covariance matrix for fit
K = calc_covariance(1, ntrain, 1, ntrain, L_X[p], X_rescaled[1:ntrain,],
A_time[p], l_time[p], t,
A_RQ[p], alpha_RQ[p], l_RQ[p],
A_RQ_2[p], alpha_RQ_2[p], l_RQ_2[p],
A_ARD_X[p], enableRegressors, 0);
// Calculate covariance for the cross terms
K_fitpred = calc_covariance(1, ntrain, ntrain+1, ntrain+npred, L_X[p], X_rescaled,
A_time[p], l_time[p], t,
A_RQ[p], alpha_RQ[p], l_RQ[p],
A_RQ_2[p], alpha_RQ_2[p], l_RQ_2[p],
A_ARD_X[p], enableRegressors, 0);
// Now predict the expected value of the GP
ypred_standardised = gp_pred_nonoise(K, K_fitpred, y_standardised,
meanfunc[1:ntrain], meanfunc[(ntrain+1):(ntrain+npred)], sigma[p], nugget);
for(i in 1:npred){
ypred[p,i] = ypred_standardised[i]*ysd+ymean; // Get back onto the original scale
}
}
}
}
}
The functions that sit under calc_covariance()
calculate the various covariance terms in the GP.
Anything with nsim
on it relates to parameters that are fitted through a separate model that shares very similar code. That is, I’m passing in the MCMC iterations from the same model fitted in different Stan code and then using the loop over nsim
to perform predictions at new points based on the posterior.
For reference, nsim=100
, ntrain=150
and npred=5000
.
(The same code and inputs with npred=5000
when run through the original NUTS sampling has no issues)