Precision of Timing Estimates

@rok_cesnovar, in the recent changes to how time differences are computed in the services did you end up using a new base unit of time? The gradient evaluate time estimates are now returning zero (CmdStan, develop) which makes me think that the time differences are being computed in a unit without enough precision (we’d need at least microseconds for the gradient evaluation time estimates). Just wanted to check before creating an issue. Thanks!

Indeed, it was set to miliseconds for all timings. Will switch to a smaller unit for the estimator or maybe even all timings. Its definitely bad for the estimator, that did not occur to me.

Thanks for spotting this!

We went with millisecond accuracy. The rounding can be changed to microseconds, but I doubt that there is any guarantee for all of that to work. It will likely work on most of our platforms though.

What is the rationale for this level of accuracy? Maybe we can avoid the rounding for whatever you want to do.

I am still not sure why you think that there is no guarantee that it wont work.

From cppreference:

    class Rep,
    class Period = std::ratio<1>
> class duration;

It consists of a count of ticks of type Rep and a tick period, where the tick period is a compile-time rational constant representing the number of seconds from one tick to the next.

So (count of ticks) * (number of seconds for one tick) = time in seconds. Why do you think there is no guarantee?

The other point is that std::chrono::steady_clock is part of C++ since C++11 and if this would not work we would at least see some stack overflow questions or what not in the ~10 years. I searched a ton when you mentioned this on the PR.

I think not rounding the duration in all places would be the best solution.

I also searched a lot and what I found is that people say that what exactly is one “tick” is simply not defined… and I also can never find this clearly defined in the C++ standard docs.

The only reference to that a tick is a second is the default argument to the template parameter which you quote - but I don’t see that as a proper definition in the sense of spelling out a standard.

The solution we have is good as it gives us well defined accuracy. We should just go to a higher precision of microseconds if that solves the issue. That’s my take on this (I can be wrong, of course).

Does it really matter what t is in:

X t * \frac{Y s}{t}

We are interested in X*Y in seconds. What we currently do is:

double sample_delta_t
      = std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
        / 1000.0;


std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()

gets us X*Y*1000.

std::chrono::milliseconds is just std::ratio (std::milli) that multiplies by 1000. And we then do X*Y*1000/1000.0. It does spell out milliseconds in code though. That is a plus, I guess.

I am fine with replacing 1000 with 1000000, but feel its unecessary and would rather not use casting at all. But just want to fix this either way.

Not sure what X and Y is… but we have the fundamental problem that the standard does not define what 1 tick is equal to. The C++ docs are only implicitly saying that the physical time unit used for counting is 1 second - it is not explicitly saying so. Thus, the safest way is to convert to a well defined time unit using the pre-defined facilities of the standard.

EDIT: We could change how we report things. Instead of writing “seconds” we write “ticks”. That would be the unit which we are guaranteed to get - whatever those ticks are on the platform is unspecified.

X is the number of ticks we measured and Y is the compile time constant representing the number of seconds in a tick.

So if we measure 20 ticks and we know Y is 0.0005 s per tick not sure why it matters what a tick is. X*Y is what duration returns.

We are getting into a discussion we already had, I think…

anyway, I just checked and. @betanalpha is right… as we multiply by 1E3 the time difference and want to present sub-second precision for that number, we just need micro seconds and then we are all set.

Let’s just do that - this will solve the matter just fine… and if it looks so ugly in the code we can introduce a utility function handling it?

EDIT: Though we should check if we don’t overflow when Stan runs for days…

@rok_cesnovar is absolutely correct here regarding the implementation of std::chrono::duration – the documentation is clear and unambiguous. The whole point of std::chrono is to abstract away the concept of “ticks” or “cycles” or however time is actually tracked on the hardware. Instead it defines clocks that manage that progression behind the scenes and durations that translate to differences in times into the number of periods of a certain length that have elapsed.

In particular

std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();

returns the number of millisecond-long periods that have elapsed between start and end. Exactly how that is calculated by the standard library is irrelevant.

If we need sub-second precision then we just use the pattern @rok_cesnovar already uses. For millisecond precision we could the number of millisecond periods that have elapsed and then convert back to seconds,

std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() / 1000.0;

For microsecond precision we would use

std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() / 1e6;

Millisecond precision is fine for the total run times, but we should use microsecond precision for the gradient evaluation time estimates. In other words changing just one line.

@rok_cesnovar do you want me to create an issue?

No, we (@rok_cesnovar and myself) debate about a different point here - what is a “tick” is ill defined. The C++ specs ARE totally ambiguous about what a tick is. The docs suggest that one tick corresponds to the physical unit of a second, but that is not written in the docs at all (only implicitly to my eyes).

Anyway, we do seem to agree about the suggested path forward here. I do fully support that we go with

std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() / 1e6;

That will do it in a well defined way… just one more sanity check should be done if this way of calculating may overflow when Stan runs for days.

Just create the issue please with the link to here, so I dont forget. And assign me please. will get on this by the end of the week. Thanks!

1 Like

As I wrote above there is no concept of a “tick” in the std::chrono library.

Instead there is a period defined as a template parameter in the duration type and duration_cast function which defines the unit of the intervals returned by the count method (the template parameter being a ratio of seconds, with std::milliseconds and std::microseconds being template aliases to ratio<1:1000> and ratio<1:1000000> respectively).
Once a duration type has been defined there is no ambiguity in what count returns.

That is all unambiguously defined by the interface. How the intervals are counted (say as a function of processor clock cycles) is considered as an implementation detail and won’t matter unless we start pushing precisions near the clock speed of the processor.

yup, perdios are defined in terms of seconds which you get when multiplying the count with the rep thing. The rep thing relates to ticks. What a tick is is left unspecified. It’s implicitly only defined to be a second.

Anyway, if we cast into the pre-defined std::microseconds or whatever thing, then everything is just fine.

You run into trouble when you use duration<double> as you then get vanilla ticks in a unit not defined as to what physical unit it is.

Fine @wds15 ,we will go with chrono::microseconds. It doesnt make sense to drag this anymore. Its such a small detail I dont want to waste more time on this since both fix this issue.

you are essentially saying that X is not fine for you, but X*1e6/1e6 is.

if someone else wants to review the PR and take the accountability (to some extent) for it… fine by me.

changing as discussed above is all fine for me and I am happy to approve.

again, sorry for my persistency, but that’s what is my job as reviewer as I understand it… and we found a solution here which is fine for all of us, I think.

@betanalpha no need for the issue. PR is open and approved. Thanks again for the report.

1 Like

Thanks for the lightning fast resolution!