Dear community,
Sorry for writing so long. Maybe this is not a GP question but rather due to my poor Stan coding ability. I am playing with GP following this great case study from @betanalpha https://betanalpha.github.io/assets/case_studies/gaussian_processes.html; stan manual 10.3 Fitting a Gaussian process | Stan User’s Guide and @avehtari’s HSGP paper [2004.11408] Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming.
My practice is to estimate a 1D function f(x_{i}), i = 1,2..N_{f} which isconvolved (i.e., *) with a vector event_{j}, j = 1,2,...N_{obs}, so that the observed data y_{obs_{j}} = event *f + \sigma. Size N_{f} for function f is just 100, so not computationly expensive to model it using a exponentiated quadratic covariance function. Vector event_{j} has the same size as y_{obs_{j}}, N_{obs} is around 2500. Values in event_{j} is just binarized, so either 0 or 1. The apperaiance of 1 follows a known pattern (event[j==k]= 1; otherwise, event[j!=k] = 0, k is predefined). The idea is I simulated y_{obs} using y_{obs} = event *f + \sigma, with known f, event and \sigma. Then I’d like to model f by GP and estimate it from y_{obs} and event. Since I didn’t find a convolution fucntion that can be used inside Stan, I just toeplitzd vector event with an all 0 value vector that has the same size of f. This was done outside Stan code using Toeplitz function from the “pracma” package as following R code, the resultant matrix evt is just considered as data input to Stan model.
evt = t(Toeplitz(event, c(event[1], matrix(0, nrow = 1, ncol = length(f)-1))))
Then I can just do matrix multiplication of evt and f instead of convolution in the transformed parameter block. The model below (model #1) worked with good diagnostics and estimated well the shape of f. It took around 100s to finish the computation on my laptop. I can also mimic HSGP paper’s code (just for practicing to understand the method) to approximate the covariance function of f with m around 30 to speed up, although speed is not a big issue considering the size of f is just 100.
model #1
data{
int<lower=1> N_obs;
int<lower=1> N_f;
int<lower=1> N_noise;
vector[N_obs] y_obs;
matrix[N_f,N_obs] evt;
real x_f[N_f];
}
transformed data {
real delta = 1e-10;
}
parameters {
real<lower=0> rho;
real<lower=0> alpha;
real<lower=0> sigma_obs;
vector[N_f] eta;
}
transformed parameters {
vector[N_f] f;
vector[N_obs] pred_obs;
{
matrix[N_f, N_f] L_K;
matrix[N_f, N_f] K = cov_exp_quad(x_f, alpha, rho);
for (n in 1:N_f)
K[n, n] = K[n, n] + delta;
L_K = cholesky_decompose(K);
f = L_K * eta ;
pred_obs = evt' * f;
}
}
model {
//inv_gamma(9, 173); [10,50] , inv_gamma(4.6, 55.2); [5,50],
// inv_gamma(4.6, 22); [2,20], inv_gamma(7.3, 75);; [5,30], inv_gamma(4.6, 33.1); [3,30],
rho ~ inv_gamma(4.6, 33.1);
alpha ~ normal(0, 2);
sigma_obs ~ std_normal();
eta ~ std_normal();
y_obs ~ normal(pred_obs, sigma_obs);
}
However, for some reason, I tried to take this event as the input data and create this Toeplitz matrix evt inside the Stan code as shown in model #2, but the computation time got almost 10 times longer with the same results estimated by model #1. Toeplitz function was copied from the same function in the “pracma” package. The input and output should be exactly the same as I did outside Stan code evt = t(Toeplitz(event, c(event[1], matrix(0, nrow = 1, ncol = length(f)-1))))
. It also doesn’t involve any parameter in Toeplitz function, but just predefined values.
My question is do you think the reason for slowing down is the multiplication pred_obs = generate_evt(event, N_f, N_obs) * f
called Toeplitz function so many times? I meant every draw for f
will call this function once, although the function itself doesn’t involve any parameter to estimate. Is there any trick to only calculate evt once? I guess I should not put it in the transformed parameters block. However, the spectral density functions of different covariance functions are also called in transformed parameters in the birthday data basis_functions_approach_to_GP/CaseII_Birthday_Approximate-GP.stan at master · gabriuma/basis_functions_approach_to_GP · GitHub.
The reason why I’d like to do this Toeplitz function inside Stan is that in the next step I would simulate data with a new event vector in which event[k] = alpha[k] instead of event[k] = 1, then estimate both f and alpha[k] as parameters.
model #2
functions {
matrix Toeplitz(vector a, vector b){
int n = rows(a);
int m = rows(b);
matrix[n, m] T;
T[1, ] = b';
T[, 1] = a;
for (i in 2:n) {
T[i, 2:m] = T[i - 1, 1:(m - 1)];
}
return(T);
}
matrix generate_evt(vector event, int N1, int N2, int rec) {
matrix[N2, N1] evt;
evt = Toeplitz(event, append_row(event[1], rep_vector(0, N1-1)));
return evt;
}
}
data{
int<lower=1> N_obs;
int<lower=1> N_f;
vector[N_obs] y_obs;
vector[N_obs] event;
real x_f[N_f];
}
transformed data {
real delta = 1e-10;
}
parameters {
real<lower=0> rho;
real<lower=0> alpha;
real<lower=0> sigma_obs;
vector[N_f] eta;
vector[N_noise] cof_noise;
}
transformed parameters {
vector[N_f] f;
vector[N_obs] pred_obs;
{
matrix[N_f, N_f] L_K;
matrix[N_f, N_f] K = cov_exp_quad(x_f, alpha, rho);
// diagonal elements
for (n in 1:N_f)
K[n, n] = K[n, n] + delta;
L_K = cholesky_decompose(K);
f= L_K * eta ;
pred_obs = generate_evt(event, N_f, N_obs) * f;
}
}
model {
//inv_gamma(9, 173); [10,50] , inv_gamma(4.6, 55.2); [5,50],
// inv_gamma(4.6, 22); [2,20], inv_gamma(7.3, 75);; [5,30], inv_gamma(4.6, 33.1);; [3,30],
rho ~ inv_gamma(4.6, 33.1);
alpha ~ normal(0, 2);
sigma_obs ~ std_normal();
eta ~ std_normal();
cof_noise ~ normal(0, 10);
y_obs ~ normal(pred_obs, sigma_obs);
}
Thanks a lot,
zcai