Specify absolute deadlines, not relative timeouts, part 2
In the last post, I advocated the use and design of blocking interfaces based on deadlines rather than timeouts. I finally found the POSIX snippet that I was thinking of: it’s in the specification for pthread_cond_timedwait. The post also got me two very interesting comments.
One reader noted that they had had a lot more trouble understanding and fixing performance issues in a highly-threaded program when they used deadlines instead of timeouts. I suppose that using deadlines based on something like wall-clock time can introduce yet another variable in an already hard-to-understand system. In the end, I still feel that’s more than outweighed by the consistency with respect to the rest of the world, especially given that timeouts can be constructed so easily on top of deadlines.
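To make "constructed so easily" concrete, here is a minimal sketch in C of turning a relative timeout into an absolute deadline. The helper name deadline_after and the NSEC_PER_SEC macro are mine, not part of any POSIX interface; the only assumption is that both inputs are normalized struct timespec values.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper (not a POSIX function): add a relative timeout to
   a clock reading to obtain an absolute deadline on that same clock.
   Both arguments must have tv_nsec in [0, NSEC_PER_SEC). */
static struct timespec deadline_after(struct timespec now,
                                      struct timespec timeout)
{
    struct timespec deadline;

    assert(now.tv_nsec >= 0 && now.tv_nsec < NSEC_PER_SEC);
    assert(timeout.tv_nsec >= 0 && timeout.tv_nsec < NSEC_PER_SEC);

    deadline.tv_sec = now.tv_sec + timeout.tv_sec;
    deadline.tv_nsec = now.tv_nsec + timeout.tv_nsec;
    /* Carry into the seconds field so the result stays normalized. */
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }
    return deadline;
}
```

A timeout-based wrapper then reads the clock once, calls deadline_after, and hands the result to the deadline-based primitive. Going the other way (deadline to timeout) needs a fresh clock reading before every blocking call.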
Another raised an important issue: which time should our deadlines be based on? The regular system clock, like UTC time, is affected by many events: leap seconds, NTP drift, the user changing the time to fool Farmville, and so on. In fact, it’s not even guaranteed to be monotonic: the system time can be moved backward. (Bonus points if you can determine how hardware suspend should be detected and/or handled.)
Of course, timeouts share all the issues regarding nonmonotonic clocks, but I suppose most of us don’t think about them. POSIX usually specifies that timeouts must ignore time adjustments, or at least backward ones, and Linux/glibc seem to mostly achieve that by using a monotonic clock; it’s not clear how a portable program can achieve the same effect, though.
This blog post has an interesting overview of the issue, and a list of buggy programs and APIs that misuse wall-clock time.
The realtime POSIX extension includes a partial solution to this non-monotonic clock issue: clock_gettime, along with CLOCK_MONOTONIC, gives access to time values that never go backward. Unfortunately, that’s not always available (in particular, it’s absent on Solaris, OS X and Windows); the author of the previously-mentioned blog post also has a tiny portability wrapper to provide such time values on POSIX platforms with CLOCK_MONOTONIC, and on Solaris and OS X as well.
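For reference, reading the monotonic clock looks roughly like this on a platform that does provide it. This is only a sketch: monotonic_now and timespec_cmp are hypothetical helper names of mine, and there is no fallback for platforms without CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock into *out.
   Returns 0 on success and -1 on failure, like clock_gettime itself. */
static int monotonic_now(struct timespec *out)
{
    assert(out != NULL);
    return clock_gettime(CLOCK_MONOTONIC, out);
}

/* Three-way comparison of two normalized timespecs:
   negative if a < b, zero if equal, positive if a > b. */
static int timespec_cmp(struct timespec a, struct timespec b)
{
    if (a.tv_sec != b.tv_sec)
        return a.tv_sec < b.tv_sec ? -1 : 1;
    if (a.tv_nsec != b.tv_nsec)
        return a.tv_nsec < b.tv_nsec ? -1 : 1;
    return 0;
}
```

The values returned are only meaningful relative to each other: the origin of CLOCK_MONOTONIC is unspecified, so a deadline computed from it can't be compared with wall-clock time.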
Still, platforms without clock_gettime don’t necessarily expose blocking calls with deadlines based on a monotonic clock either (e.g. OS X doesn’t have one for its Mach semaphores), so only having a sane clock isn’t that useful.
This isn’t only a theoretical or highly-improbable issue either; SBCL has had a bug caused by timeouts for quite a while. The internals include this function, which resumes nanosleeping when interrupted by signal handling.
    (defun nanosleep (secs nsecs)
      (with-alien ((req (struct timespec))
                   (rem (struct timespec)))
        (setf (slot req 'tv-sec) secs
              (slot req 'tv-nsec) nsecs)
        (loop while (and (eql sb!unix:eintr
                              (nth-value 1
                                         (int-syscall ("nanosleep" (* (struct timespec))
                                                                   (* (struct timespec)))
                                                      (addr req) (addr rem))))
                         ;; KLUDGE: [...]
                         #!+darwin
                         (let ((rem-sec (slot rem 'tv-sec))
                               (rem-nsec (slot rem 'tv-nsec)))
                           (when (or (> secs rem-sec)
                                     (and (= secs rem-sec) (>= nsecs rem-nsec)))
                             ;; Update for next round.
                             (setf secs rem-sec
                                   nsecs rem-nsec)
                             t)))
              do (setf (slot req 'tv-sec) (slot rem 'tv-sec)
                       (slot req 'tv-nsec) (slot rem 'tv-nsec)))))
On OS X, when nanosleep is interrupted by a signal, the second argument is updated by computing the time remaining in the timeout, once the signal handler returns. Of course, that leads to an interesting situation when the subtracted time is greater than the time to nanosleep for (e.g. a signal handler consumes two seconds before returning to a 1-second nanosleep): the “remaining” timeout underflows into a very long timeout.
Other platforms only subtract the time elapsed from the execution of nanosleep until the signal is received. At least, there’s never any underflow in the “remaining” timeout, but that value, while always sane, is still pretty much useless. If the loop is executed 5 times (i.e. nanosleep is interrupted 5 times), and each signal takes 1 second to handle, the function will return 5 seconds late.
So, outside OS X, the “remaining” timeout computed by nanosleep is subtly useless. On OS X, it’s only even more subtly useless: we have the same problem when a signal hits us between two calls to nanosleep.
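One way to sidestep the “remaining” value entirely is to fix a deadline up front and recompute the time left from a monotonic clock before each call. The following C sketch assumes CLOCK_MONOTONIC is available; remaining_until and sleep_for are hypothetical names of mine, and this is not what SBCL does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Time left until deadline, clamped to zero so that a late wake-up
   can never underflow into a huge timeout. */
static struct timespec remaining_until(struct timespec deadline,
                                       struct timespec now)
{
    struct timespec left = {0, 0};

    if (now.tv_sec > deadline.tv_sec ||
        (now.tv_sec == deadline.tv_sec && now.tv_nsec >= deadline.tv_nsec))
        return left;                    /* deadline already passed */
    left.tv_sec = deadline.tv_sec - now.tv_sec;
    left.tv_nsec = deadline.tv_nsec - now.tv_nsec;
    if (left.tv_nsec < 0) {             /* borrow from the seconds */
        left.tv_sec -= 1;
        left.tv_nsec += NSEC_PER_SEC;
    }
    return left;
}

/* Sleep for secs + nsecs, ignoring nanosleep's own "remaining" output.
   Returns 0 on success, -1 on error (errno set by the failing call). */
static int sleep_for(time_t secs, long nsecs)
{
    struct timespec now, deadline, left;

    assert(nsecs >= 0 && nsecs < NSEC_PER_SEC);
    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return -1;
    deadline.tv_sec += secs;
    deadline.tv_nsec += nsecs;
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }
    for (;;) {
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        left = remaining_until(deadline, now);
        if (left.tv_sec == 0 && left.tv_nsec == 0)
            return 0;
        /* A signal landing between the clock read above and this call
           can still make us oversleep by up to the handler's runtime. */
        if (nanosleep(&left, NULL) != 0 && errno != EINTR)
            return -1;
    }
}
```

Time spent in signal handlers no longer accumulates across iterations, since each round starts from a fresh clock reading. The window between reading the clock and calling nanosleep is still there, though, so this narrows the race without closing it.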
POSIX recommends the use of clock_nanosleep if the issue above with interruptions matters. In addition to being based on a deadline rather than a timeout, it lets us specify which clock the deadline is based on. As usual, that’s not available everywhere, so we’ll likely be stuck with a hard-to-trigger race condition in SLEEP on some platforms.
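Where clock_nanosleep does exist, the retry loop becomes trivial. Here is a sketch in C, assuming both clock_nanosleep and CLOCK_MONOTONIC are available; the wrapper name sleep_via_deadline is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Sleep for secs + nsecs by computing an absolute deadline once and
   waiting on it. Returns 0 on success or an error number; note that
   clock_nanosleep returns the error directly instead of setting errno. */
static int sleep_via_deadline(time_t secs, long nsecs)
{
    struct timespec deadline;
    int err;

    assert(nsecs >= 0 && nsecs < NSEC_PER_SEC);
    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return errno;
    deadline.tv_sec += secs;
    deadline.tv_nsec += nsecs;
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }
    /* With TIMER_ABSTIME the request is a point in time, so restarting
       after a signal needs no arithmetic: just wait on the same value. */
    do {
        err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                              &deadline, NULL);
    } while (err == EINTR);
    return err;
}
```

However long the signal handlers take, the deadline itself never moves, so there is nothing to accumulate and nothing to underflow.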