I’ve been working on and off on a project that involves inferring a latent time series. Since the observations come from several different “sensors” (binomial likelihood), I model potential group-specific effects as random intercepts. A really naive approach would be to fit the model with lme4::glmer using the formula (1|group) + (1|measurement_time).
Since the latent time series is presumed to be smooth, I instead want to use a Gaussian Process prior so that information is shared across time. There are several R libraries that can fit this type of model, typically using a Laplace approximation to the log marginal posterior.
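Concretely, the model I have in mind is roughly

$$
y_i \sim \mathrm{Binomial}\!\left(n_i,\ \operatorname{logit}^{-1}\!\big(\mu + f(t_i) + b_{g(i)}\big)\right), \qquad
f \sim \mathcal{GP}\!\left(0,\ \tau^2 \exp\!\left\{-\frac{(t-t')^2}{2\rho^2}\right\}\right), \qquad
b_g \sim \mathrm{N}(0,\ \lambda^2),
$$

where $t_i$ is the measurement time of observation $i$ and $g(i)$ is its group.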
Since my number of measurement times is relatively large, I am working with a Gaussian Predictive Process; the approximation I’m using is written out just after the list below. Here are a few relevant papers (for now, I am not using the Finley correction):
- Gaussian predictive process models for large spatial data sets - PMC
- GitHub - mbjoseph/gpp-speed-test
- Improving the performance of predictive process modeling for large datasets - PMC
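The piece of those papers that I rely on is the basic projection: with knot locations $z^*_1, \dots, z^*_m$ and knot values $f^* = f(z^*)$, the predictive process replaces $f(t)$ by its conditional mean given the knots,

$$
\tilde f(t) = c(t)^\top C_*^{-1} f^*, \qquad c(t)_j = k(t, z^*_j), \qquad (C_*)_{jk} = k(z^*_j, z^*_k),
$$

which corresponds to the mdivide_right_spd(C_predict, C_knot) * knot term in the model below. The Finley correction would add back the conditional variance that this projection drops, which I am skipping for now.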
What confuses me about all the relevant Stan examples that I have seen is the computation of the kernel. Every time cov_exp_quad is called, we end up rebuilding the distance matrix between all points. Is this not inefficient? Why not feed the matrix of squared distances D in through the data block and then build the kernel covariance matrix as K = alpha^2 * exp(-0.5 * D / rho^2)?
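For concreteness, the kind of thing I am imagining looks like this (just a sketch of the relevant blocks, not tested; D_knot would be the matrix of squared distances between knot locations, computed once outside Stan and passed in as data):

data {
  int<lower=1> n_knots;
  matrix[n_knots, n_knots] D_knot; // precomputed squared distances between knots
}
parameters {
  real<lower=0> tau;
  real<lower=0> rho;
}
transformed parameters {
  // same covariance as cov_exp_quad(z_knot, tau, rho), built elementwise from D_knot
  matrix[n_knots, n_knots] C_knot
      = add_diag(square(tau) * exp(-0.5 * D_knot / square(rho)), 1e-4);
}
model {
  tau ~ std_normal();
  rho ~ inv_gamma(5, 5);
}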
For reference, the model below (using the standard procedure) seems to fit very slowly. Perhaps this is actually a more difficult model than I assumed, or perhaps there are some code inefficiencies? I have 1127 observations measuring the process on 324 different dates. There are also 55 “groups” and I am using 48 predictive “knots”.
// (data block declaring n_obs, n_knots, n_groups, z_knot, z_predict, y, n, and group_index omitted)
parameters {
  vector[n_groups] bias;      // group-specific random intercepts
  vector[n_knots] knot_tilde; // non-centered knot values
  real mu;
  real<lower=0> lambda;       // sd of the group intercepts
  real<lower=0> tau;          // GP marginal sd
  real<lower=0> rho;          // GP length-scale
}
transformed parameters {
  matrix[n_knots, n_knots] C_knot = add_diag(cov_exp_quad(z_knot, tau, rho), 1e-4);
  matrix[n_obs, n_knots] C_predict = cov_exp_quad(z_predict, z_knot, tau, rho);
  matrix[n_knots, n_knots] L = cholesky_decompose(C_knot);
  vector[n_knots] knot = L * knot_tilde; // knot ~ multi_normal(0, C_knot)
}
model {
  // predictive-process projection of the knot values onto the observation times
  y ~ binomial_logit(n, mu + mdivide_right_spd(C_predict, C_knot) * knot + bias[group_index]);
  bias ~ normal(0, lambda);
  knot_tilde ~ normal(0, 1);
  rho ~ inv_gamma(5, 5);
  tau ~ std_normal();
  lambda ~ std_normal();
}
Any thoughts besides reducing the size of mdivide_right_spd(C_predict, C_knot) * knot by building C_predict on the 324 unique dates rather than the 1127 observations and indexing, i.e. mdivide_right_spd(C_predict, C_knot)[date_index] * knot? A sketch of what I mean is below.
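Here is that variant written out (again only the changed blocks; z_date would hold the 324 unique dates, n_dates = 324, and date_index maps each observation to its date):

transformed parameters {
  matrix[n_knots, n_knots] C_knot = add_diag(cov_exp_quad(z_knot, tau, rho), 1e-4);
  matrix[n_dates, n_knots] C_date = cov_exp_quad(z_date, z_knot, tau, rho);
  matrix[n_knots, n_knots] L = cholesky_decompose(C_knot);
  vector[n_knots] knot = L * knot_tilde;
  // latent process evaluated once per unique date, expanded per observation in the model block
  vector[n_dates] f_date = mdivide_right_spd(C_date, C_knot) * knot;
}
model {
  y ~ binomial_logit(n, mu + f_date[date_index] + bias[group_index]);
  // priors unchanged from the model above
  bias ~ normal(0, lambda);
  knot_tilde ~ normal(0, 1);
  rho ~ inv_gamma(5, 5);
  tau ~ std_normal();
  lambda ~ std_normal();
}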