Great work! Thanks for the visualizations, too! I really appreciate this!
Yeah, this is as expected. Did you calculate the ratio of the total run time of each of these functions to the total program time? I'm curious: usually the Cholesky decomposition takes up a lot of the time, but it looks like in our Stan programs a lot more time is being spent elsewhere, i.e. in matrix multiplication and some of the RNGs.
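Just to be concrete about what I mean by "ratio": given per-function timings from the profiler, report each function's share of total program time. A minimal sketch (the function name `time_shares` and the numbers are made up for illustration, this isn't our harness):

```cpp
#include <string>
#include <utility>
#include <vector>

// Convert (function, seconds) pairs into (function, percent-of-total) pairs.
std::vector<std::pair<std::string, double>> time_shares(
    const std::vector<std::pair<std::string, double>>& secs) {
  double total = 0.0;
  for (const auto& p : secs) total += p.second;
  std::vector<std::pair<std::string, double>> out;
  for (const auto& p : secs)
    out.emplace_back(p.first, 100.0 * p.second / total);
  return out;
}
```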
In the meantime, I’m attaching an ARD model for the exponentiated quadratic kernel, as in this paper: Model selection for Gaussian processes utilizing sensitivity of posterior predictive distribution.
I think they use a reduced-size dataset for some of the experiments. Sometimes with ARD the sampling is faster because the model is more flexible (kind of the same effect on sampling as adding flexibility with hierarchical priors), so I’d be interested in seeing total execution time. Also, the ARD kernel will probably be slower because it’s not specialized for reverse-mode autodiff, so I’d be interested in comparing with a non-ARD model.
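For reference, the ARD version of the exponentiated quadratic just has one length-scale per input dimension: k(x, x') = sigma^2 * exp(-0.5 * sum_d (x_d - x'_d)^2 / l_d^2). A plain scalar sketch of a single kernel evaluation (`cov_exp_quad_ard` is an illustrative name here, not the prim signature):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// ARD exponentiated quadratic: separate length-scale per dimension.
double cov_exp_quad_ard(const std::vector<double>& x1,
                        const std::vector<double>& x2,
                        double magnitude,
                        const std::vector<double>& length_scale) {
  assert(x1.size() == x2.size() && x1.size() == length_scale.size());
  double q = 0.0;
  for (std::size_t d = 0; d < x1.size(); ++d) {
    double z = (x1[d] - x2[d]) / length_scale[d];
    q += z * z;  // scaled squared distance, accumulated per dimension
  }
  return magnitude * magnitude * std::exp(-0.5 * q);
}
```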
I’m attaching…
- the ARD model, concrete.stan, that works for all of these…
- 4 datasets: concrete, Boston housing, automobile, crime, and then some …
- prim code for the ARD (that isn’t merged into dev yet and is queued up), and then the…
- function signatures code. I can format the data for you, if you really want; just ping me.
Any reason this hasn’t been added? I know you asked for a distance-matrix computation that only calculates the lower triangular (because distance is being called over and over), but the PR was up for a long time and I was just trying to push it through. Is this not up to standards? Honestly, as I’m writing this, it’s probably best to implement a distance function, because it’s easier to edit the kernels now…
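For example, a lower-triangular-only squared-distance precompute might look like this (just a sketch of the idea, not the PR code; since the matrix is symmetric with a zero diagonal, only the strict lower triangle needs filling):

```cpp
#include <vector>

// Squared Euclidean distances between rows of x; fills only the strict
// lower triangle, leaving the rest zero (recoverable by symmetry).
std::vector<std::vector<double>> sq_dist_lower(
    const std::vector<std::vector<double>>& x) {
  std::size_t n = x.size();
  std::vector<std::vector<double>> d(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 1; i < n; ++i)
    for (std::size_t j = 0; j < i; ++j) {
      double s = 0.0;
      for (std::size_t k = 0; k < x[i].size(); ++k) {
        double z = x[i][k] - x[j][k];
        s += z * z;
      }
      d[i][j] = s;  // d[j][i] is the same value by symmetry
    }
  return d;
}
```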
concrete.stan (447 Bytes)
Concrete_Readme.txt (3.7 KB)
concrete.hpp (16.6 KB)
format_data.R (610 Bytes)
concrete_input.data.R (48.5 KB)
Concrete_Data.csv (56.9 KB)
housing_input.data.R (40.7 KB)
housing_readme.txt (2.0 KB)
imports-85_readme.txt (4.6 KB)
imports-85.txt (25.3 KB)
housing.txt (47.9 KB)
cov_exp_quad.hpp (8.1 KB)
temp.txt (2.0 KB)
I think the data is formatted for two of these datasets…
and then next I’ll send you the largest run with N=7200 that I can get to work, after I instrument the code…
Is there any way you can parsimoniously share some of the code, and how you’re testing run time?
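In case it helps to compare apples to apples, the kind of minimal wall-clock harness I have in mind is just std::chrono (a sketch, not your actual instrumentation; `time_seconds` is a name I made up):

```cpp
#include <chrono>

// Average wall-clock seconds per call of f over `reps` repetitions.
template <class F>
double time_seconds(F&& f, int reps = 1) {
  auto t0 = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) f();
  auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(t1 - t0).count() / reps;
}
```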
Another interesting case would be a Student-t likelihood (sampling from it); a model would look something like this:
data {
  int<lower=1> N_tot;
  real x_tot[N_tot];
  vector[N_tot] y;
}
parameters {
  real<lower=0> magnitude;
  real<lower=0> length_scale;
  vector[N_tot] eta;
}
transformed parameters {
  vector[N_tot] f;
  {
    matrix[N_tot, N_tot] L_K;
    matrix[N_tot, N_tot] K;
    K = cov_exp_quad(x_tot, magnitude, length_scale);
    for (n in 1:N_tot)
      K[n, n] = K[n, n] + 1e-12;
    L_K = cholesky_decompose(K);
    f = L_K * eta;
  }
}
model {
  magnitude ~ normal(0, 1);
  length_scale ~ inv_gamma(5, 5);
  eta ~ normal(0, 1);
  y ~ student_t(3, f, 2);
}
Not sure if this is accurate or will compile, but I can put something together.