| RFC-0009: Edge triggered async_wait | |
|---|---|
| Status | Accepted |
| Areas |
|
| Description | The proposal is that, if the ZX_WAIT_ASYNC_EDGE flag is specified, the initial check is omitted and the signal set added to the interest list of the DispatchObject regardless of the initial signal state. In this mode of operation, one of the signals must transition from inactive to active for a packet to be queued on the supplied port (possibly requiring a signal to become inactive in the process). |
| Issues | |
| Gerrit change | |
| Authors | |
| Reviewers | |
| Date submitted (year-month-day) | 2020-10-24 |
| Date reviewed (year-month-day) | 2020-11-06 |
Summary
Waiting for signals to be asserted on an object is usually level-triggered and
a check is done at the start of zx_object_wait_async in case the signal is
already active, in which case a packet is immediately sent to the port.
This RFC concerns adding an option to zx_object_wait_async, ZX_WAIT_ASYNC_EDGE,
which does not perform that initial check and thus will only produce a packet
when the signal transitions from inactive to active after the call.
It may be that zx_object_wait_async is called with ZX_WAIT_ASYNC_EDGE with
the signals on the object already active. In this case, a packet will be queued
on the port only after the signal on the object becomes inactive and then
subsequently is asserted. In fact, this is how the ZX_WAIT_ASYNC_EDGE is commonly
used.
Motivation
The epoll polling mechanism in Linux can function in two modes - level-triggered
and edge-triggered. Fuchsia's waiting features, particularly zx_object_wait_async
and zx_port_wait already make level-triggered polling possible. However, edge-triggered
polling requires the ability to wait on a signal on an object that is already
active is expected (through I/O) to become inactive and subsequently
active again, queuing a packet on the port on this subsequent signal transition.
This is the intent of ZX_WAIT_ASYNC_EDGE.
Design
Implementation of the ZX_WAIT_ASYNC_EDGE flag of zx_object_wait_async is
fortunately quite simple.
At present, if one of the signal set is already active, the observer's OnMatch
method is called directly without any further action. Otherwise, if none of the
signal set is active, the set is added to the interest list of the DispatchObject
via the supplied SignalObserver.
The proposal is that, if the ZX_WAIT_ASYNC_EDGE flag is specified, the initial check
is omitted and the signal set added to the interest list of the DispatchObject
regardless of the initial signal state. In this mode of operation, one of the
signals must transition from inactive to active for a packet to be queued on
the supplied port (possibly requiring a signal to become inactive in the process).
Use of ZX_WAIT_ASYNC_EDGE in epoll with EPOLLET edge triggering
The main use of this change is to enable edge-triggered polling with the EPOLLET flag
in epoll. Waiting in Zircon differs from polling in epoll in that file descriptors are
added to an epoll file descriptor using epoll_ctl/EPOLL_CTL_ADD and are continually
monitored until removed with epoll_ctl/EPOLL_CTL_DEL. Zircon waiting, especially
with zx_object_wait_async, is always one-shot and the file object must be "re-armed"
by calling zx_object_wait_async again after a signal has become active.
Because epoll use must operate by repeatedly calling epoll_wait (without necessarily
calling epoll_ctl), this re-arming call to zx_object_wait_async must occur somewhere
in epoll_wait.
For the default level-triggered polling, in epoll_wait once zx_port_wait returns with
a signalled file object, we cannot call zx_object_wait_async before returning, because
the signal on that object is actve and will generate a duplicate packet on the port. Therefore,
a list of active level-triggered file descriptors is maintained and zx_object_wait_async
is called on file descriptors in this list on entering epoll_wait prior to waiting in
zx_port_wait.
For edge-triggered polling, after epoll_wait returns, non-blocking I/O should be
performed until EWOULDBLOCK is returned. At that point the signal on the file object will
be inactive. At this point epoll_wait should be called. However, if the signal on the file
object becomes active between the I/O operation returning EWOULDBLOCK and epoll_wait being
called, that event will be lost unless zx_object_wait_async has already been called.
It follows that, in edge-triggered mode, the call to zx_object_wait_async to re-arm
the file object must be called before epoll_wait returns. This is where ZX_WAIT_ASYNC_EDGE
is necessary. The call to zx_object_wait_async can be called with this flag between
zx_port_wait returning and epoll_wait returning, because although the signal is active
at this point, the ZX_WAIT_ASYNC_EDGE skips the check that the signals are active
(which they are at this point), so no packet is immediately queued on the port.
This means that when the I/O occurs until EWOULDBLOCK, the file object is already being
monitored by zx_object_wait_async and there is no gap in coverage.
Implementation
The addition of the ZX_WAIT_ASYNC_EDGE option to zx_object_wait_async has
already been implemented in fxr/438521
and its use in epoll has been implemented in
fxr/438656.
Performance
The performance impact will be negligible as only an extra added method parameter and a check for that parameter is added to the existing code.
Security considerations
N/A
Privacy considerations
N/A
Testing
Additional unit tests have been added.
Documentation
Documentation has been added to zx_object_wait_async in the implementing CL.
zx_object_wait_async
Drawbacks, alternatives, and unknowns
This appears to be the simplest way of implementing this feature and is analogous to how edge-triggered polling is implemented in other operating systems.
Care should be taken to not miss I/O events when using this flag.
A signal may become active after performing I/O and before making
a call to zx_object_wait_async, in which case the transition from
unsignalled to signalled may be missed. In practice, the ZX_WAIT_ASYNC_EDGE
flag is used immediately zx_port_wait has returned indicating that
the signal is active on the object. In this way, after non-blocking
I/O is performed on the object until the signal is inactive (usually
until ZX_ERR_SHOULD_WAIT is returned) zx_port_wait can be called to
wait until additional I/O is ready.
Because of the pattern in edge-triggered epoll_wait with EPOLLET of (1)
epoll_wait (2) non-blocking I/O until fd is not-ready (3) epoll_wait again,
an alternative to ZX_WAIT_ASYNC_EDGE would be to perform a check in every
I/O operation to see if the file descriptor has been added to an epoll file
descriptor via epoll_ctl with EPOLLET (and has ceased to be ready) and
re-arms a wait. This would require considerable modification to zxio and
fdio for a somewhat rare use case.
Prior art and references
The purpose of this change is to emulate the EPOLLET flag in Linux:
https://man7.org/linux/man-pages/man7/epoll.7.html