I am trying to measure the time it takes for one gradient evaluation of my model. I tried to use the profiling functionality and measure the time for the whole model evaluation as follows:
data {
  ...
}
parameters {
  ...
}
transformed parameters {
  ...
}
model {
  profile("model") {
    ...
  }
}
However, this does not seem to measure the computations in the transformed parameters block. When executing the model with CmdStan I get:
Gradient evaluation took 0.003032 seconds
1000 transitions using 10 leapfrog steps per transition would take 30.32 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
...
Elapsed Time: 148.001 seconds (Warm-up)
196.874 seconds (Sampling)
344.875 seconds (Total)
So the total sampling time is around 344 seconds, but profile.csv shows that the time spent in the model block was only around 10 seconds, which makes me assume it doesn't measure the whole log-joint computation. Also, for this specific model I would expect the most expensive computations to be in the transformed parameters block, because that is where I compute the covariance matrices of a sparse GP.
What is the most straightforward way to measure the time for one gradient evaluation? I know CmdStan prints an estimate at the beginning of the terminal output, but I’d like to measure the time over multiple executions and then save the measurements to a file. Is there a way to do that?
Hey Tim,
there is no reason you could not measure time in the transformed parameters block as well. You do have to declare the variables before the profile block, which is slightly annoying, but it can be done.
Example:
transformed parameters {
  matrix[N, N] L;
  profile("cholesky") {
    L = cholesky_decompose(...);
    ...
  }
}
Cool, thanks! Just to make sure I understand correctly: I should then simply add up the timings for model and cholesky from the profile.csv file?
One reason for asking was to find out whether there is a “simpler” solution I wasn’t aware of, and to make sure I get the correct measurements. Thanks a lot!
Yeah, or just name both profile sections the same; that way their timings will be added up automatically.
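For example, here is a minimal sketch with hypothetical data names (y for the observations, K for a precomputed covariance matrix) standing in for your actual model, in which both blocks reuse the profile name log_joint so their timings accumulate under a single row in profile.csv:

data {
  int<lower=1> N;
  vector[N] y;
  matrix[N, N] K;  // hypothetical: a fixed, positive-definite covariance
}
parameters {
  vector[N] mu;
}
transformed parameters {
  matrix[N, N] L;
  profile("log_joint") {
    L = cholesky_decompose(K);  // timed under the same name as the model block
  }
}
model {
  profile("log_joint") {
    mu ~ normal(0, 1);
    y ~ multi_normal_cholesky(mu, L);
  }
}

The profile file also records how many autodiff calls each profile saw, so dividing the total time of log_joint by that count should give you an average time per gradient evaluation, which sounds like what you were after.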
A minor note here is that in addition to the gradient evaluations, there is one additional non-gradient evaluation of the transformed parameters block per sample.
So if you produce 1000 samples there will be 1000 additional evaluations in which only the function values are computed (no gradients). But there are typically 10k–100k or more gradient evaluations, so that time is usually negligible.
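To put numbers on it with the run above: 2000 iterations at roughly 10 leapfrog steps per transition is about 20,000 gradient evaluations, against 1000 extra function-only evaluations, so the overhead is at most a few percent, and in practice less, since a function-only pass is cheaper than a gradient pass.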
Yeah, or just name both profile sections the same; that way their timings will be added up automatically.
Perfect!
A minor note here is that in addition to the gradient evaluations, there is one additional non-gradient evaluation of the transformed parameters block per sample.
Out of interest, is that because the leapfrog integrator only uses the gradients, but the accept/reject step needs the actual value of the log joint?
The actual values are always computed; they are just not stored for each HMC iteration. Instead, they are re-evaluated once a new sample is “selected”. See Request for final feedback: User controlled unnormalized (propto) distribution syntax - #12 by betanalpha
So this just trades one additional function evaluation (without gradient evaluations) for a lot of memory savings.