RFC-0240: Asynchronous operations are on objects | |
---|---|
Status | Accepted |
Areas |
|
Description | Defines how asynchronous operations in Zircon interact with object and handle operations. |
Gerrit change | |
Authors | |
Reviewers | |
Date submitted (year-month-day) | 2024-02-06 |
Date reviewed (year-month-day) | 2024-03-11 |
Summary
This proposes that asynchronous operations in the Zircon kernel are performed on kernel objects and are not tied to the handle used to register those operations.
Motivation
Zircon defines asynchronous waiting operations on kernel objects. We would like to add asynchronous operations that have side effects such as asynchronous channel reads. In order to do so, we need to nail down the semantics of these operations and their interactions with other operations on the handles and/or objects involved.
object_wait_async is an example of an asynchronous operation that generates a port packet when an object's state changes. The current API tries to split the difference between the operations being performed on the underlying object - as the name would suggest - and the handle identifying the operation.
Stakeholders
Zircon
Facilitator:
The person appointed by FEC to shepherd this RFC through the RFC process.
Reviewers:
- maniscalco@google.com
- cpu@google.com
- abarth@google.com
Consulted:
Socialization:
Requirements
Define the relationship of asynchronous kernel operations including object_wait_async and handle interactions such as transferring and closing handles.
Design
Overview
Asynchronous operations in Zircon operate on objects (not handles). Changes in the object state may influence these operations. Changes in the handles used to refer to the objects do not influence the behavior of asynchronous operations, with a carveout for channels specifically described below. Once registered, an asynchronous operation continues until it completes, is canceled explicitly, or all handles to the object are closed. The facility for canceling asynchronous operations on a port is updated to reflect the new semantics.
Syscall changes
object_wait_async
:
- Remove the sentence "If handle is closed, the operations associated with it will also be terminated, but packets already in the queue are not affected." from the documentation and backing code.
Once registered, an asynchronous wait remains valid until the object state changes to match the observed condition, the wait is explicitly canceled, or all handles to the object are closed.
port_cancel
:
- Deprecate
port_cancel
and provideport_cancel_key
as a replacement that cancels all operations on the specified port using the specified key. Signature:
zx_port_cancel_key(zx_handle_t port, uint32_t options, uint64_t key)
This is also an opportunity to add an options
parameter which port_cancel
currently lacks.
An alternative would be to update the implementation of port_cancel
to ignore
the source
parameter. In practice cancelable port operations are registered
with a unique key per operation. Typically this key is the address of a data
structure allocated to handle the operation itself. There is evidence to suggest
that this change would be safe (see the "Backwards Compatibility" section) but
it would be challenging to deploy this change safely and reclaim the options
slot while using the same name. Using a new name for the operation will allow
us to update code incrementally and track callers of the old operation.
object_wait_one
and object_wait_many
with handle_close
:
- No change in explicitly documented feature set. Document that closing a handle during a synchronous wait is deprecated.
These are synchronous waits on one or more objects identified by handle. We
currently document that closing a handle cancels any currently pending waits
through these operations, but this is inherently racy since there is no way for
one thread closing a handle to know if another thread calling
object_wait_{one,many}
has resolved that handle yet. A program racing
handle_close
and object_wait_{one,many}
against each other could observe
either ZX_ERR_CANCELED
or ZX_ERR_BAD_HANDLE
. We can provide a race-free way
to cancel these synchronous wait operations by dissociating the handle from the
object (https://fxrev.dev/949517) or by making it easier for programs to wait
for multiple conditions so that they can synchronize access to the handle in
userspace. Since it is currently ambiguous when in the operation the handle
needs to be valid, this proposal does not change the behavior.
channel_call
and channel_call_etc
:
- No change in explicitly documented feature set.
These currently have implicit documentation of the same cancelation behavior as
object_wait_{one,many}
in the ZX_ERR_CANCELED
return value. They also have
the same sort of race issue as those operations when using handle_cancel
in
the general case, although since this operation exposes more internal details it
is possible to concoct scenarios where close will not race in practice. The
internal wait of channel_call
cannot be parameterized by callers beyond a
timeout so programs cannot use multiple conditions to coordinate shutdown. To
provide race-free cancelation we'll need to define a separate operation such as
the one proposed in https://fxrev.dev/949517. As with object_wait_{one,many}
while the current behavior is suspect this RFC does not propose any changes to
the behavior of these operations.
Channel ordering
Channels serve a unique role in the Fuchsia system architecture and currently have special case logic in the kernel to verify that operations on a channel handle only succeed when the handle is owned by the calling process. Channels are used extensively in the system to exchange data and capabilities. The purpose of these checks is to maintain confidentiality and integrity on channels when handles are passed between processes operating at different trust levels. Once a process gains possession of a handle to a channel it is promised exclusive access to read and write messages from that endpoint. To preserve this property with the semantics proposed in this RFC and allowing for future asynchronous channel operations we can take advantage of another property of channels which is that channel endpoints have exactly one handle. A transfer operation on a handle to a channel can be considered an operation on the object itself and we can either cause pending mutation operations on the channel or the transfer attempt to fail.
Observation of object state such as the READABLE and WRITABLE signals does not
violate properties that we care about. It is acceptable if a program can observe
when a channel is readable or writable even after transferring this handle away
as this information does not carry capabilities or important confidential
information about the state of the system. This means that the
object_wait_async
operation does not need to behave differently for channels
with this proposal. Future operations on channels such as asynchronous reads
will need to consider interactions with transfer.
Implementation
The main practical impact of this change on userspace is on cancelation of
asynchronous object waits on port objects. Asynchronous dispatcher libraries
will need to allocate and track unique keys for cancelable operations, which
they already do, and will no longer need to track handle values if they are
not using the objects. Calls to port_cancel
will be translated mechanically
to port_cancel_key
by changing the second parameter from a handle value to
the literal 0
.
On the kernel side, the relationship between observers, ports, and the object
representation is very subtle. This change will alleviate some complexity by
removing a dependency on the handle table lock during the cancelation
procedure and enable future refactors of the handle managing logic. In the
short term, the most straightforward way to implement this change is to
register asynchronous waits with a null Handle
and perform asynchronous
wait cancelation by marking the registered waits as canceled and cleaning
up the observer state when the observer fires.
Performance
This change reduces the amount of kernel bookkeping on asynchronous operations and the stress on the handle table lock which may help under certain types of load. Our current microbenchmarks show no significant change with a simple prototype implementation. Userspace libraries can also store less information in order to cancel a registered asynchronous operation.
Ergonomics
This requires applications track key values if they wish to cancel individual asynchronous operations. In practice dispatcher implementations do this already (in addition to keeping a handle value).
Backwards Compatibility
This changes the documented and actual semantics for object_wait_async
and
port_cancel
in subtle ways. The changes do not impact programs that maintain
these properties:
- Asynchronous waits are registered with unique keys per wait
- Handles to objects being waited on are not closed or transferred with registered waits still pending
These are generally true today. Asynchronous waits are usually associated with
an allocation of state to handle the result of the wait and the key
for these
waits is usually computed based on the address of this allocation which is
necessarily unique within the address space or upon another form of unique
identifier. Handles are usually stored in owning data structures and closing the
handle requires going through some sort of shutdown procedure that clears
references to it.
https://fxrev.dev/984701 is a prototype that ignores the source
parameter to
port_cancel
. There is one Zircon core test that reuses keys intentionally,
all other test cases in the tree pass unmodified. This suggests that the
changes to port_cancel
are compatible with current code.
https://fxrev.dev/986494 is a related prototype that decouples the lifetime of a registered handle with the observer. This evaluates whether we rely on closing a handle to cancel outstanding asynchronous waits. All of our existing tests pass unmodified with this change suggesting that this change in behavior is compatible with current code.
Security considerations
This proposal allows processes to observe changes in object state that they no longer have a handle to in some situations. In this model, a handle represents the capability to initiate an asynchronous operation. We do not in general promise that a handle represents exclusive access to an object outside the special case of objects with non-duplicatable handles (channels, most notably).
For most object types that grant duplicatible handles a process that possesses a handle cannot reason about what other handles and operations on that handle exist unless it has knowledge of the full history of that handle back to creation or transfer from a trusted source. This analysis does not change significantly if we allow asynchronous operations to persist on an object after a handle to it is transferred as the registrar could just as easily have retained a duplicate handle.
For channels specifically as we always grant non-duplicatible handles we do promise exclusive access to a process possessing a handle to a channel. Allowing a previous owner of a channel handle to observe when a channel first becomes readable after transferring it away is likely not security sensitive. We'll need to be careful when introducing operations that interact with the contents of messages sent through channels such that only current handle owners are permitted to perform these operations.
Testing
The changes to port and asynchronous object behavior will be tested with
Zircon's core test suite. Library compatibility - particularly in dispatcher
libraries - will be evaluated through inspection of the libraries and their test
suites. Our existing integration suites pass with a prototype of the new
behavior (see the "Backwards Compatibility" section) and we can monitor these
closely when deploying the change to object_wait_async
.
Documentation
System call documentation will need to be added for the new port_cancel_key
and documentation for existing operations will need to be updated as described
in the design section.
Drawbacks, alternatives, and unknowns
Alternative semantics based around handles
One alternative is to continue to tie asynchronous operations to their initiating handle (as some of the current documentation suggests) and maintain this relationship as we add new asynchronous operations. This will add complexity to these new operations and make it more difficult to refactor and optimize the kernel's handle managing logic while providing little utility to application logic.
port_cancel
reuse
Instead of defining a new port_cancel_key
syscall we could replace the
source
parameter on the existing port_cancel
with an options
parameter of
the same size. For a transition period, no options would be supported and
handle values would also be permitted in this field. During this transition
period we would find and update all callers to port_cancel
to pass a 0 value
in this field instead of the handle value they currently supply. At the end of
the transition period we would start enforcing that the field be zero, at which
point we could start introducing option values. A challenge with this approach
is that it may be difficult to detect when all callers are updated to the new
semantics. It is rare in practice to cancel an asynchronous wait before it fires
and so dynamic analysis of port_cancel
may not find callers that are only
reached in unusual error handling or timing situations.
Prior art and references
Zircon handle resolution model - draft RFC from 2020 exploring some of these issues.
zx_handle_cancel - draft RFC defining a mechanism for canceling synchronous operations on a handle avoiding handle resolution races.