Introduction
When a thread encounters a fault condition, for example a segfault, execution is paused and the thread enters exception handling. Handlers that have registered to receive these exceptions are notified and given a chance to inspect or correct the condition.
This functionality is commonly used by debuggers or crash loggers, which want to have a chance to interact with threads before they would otherwise crash. For applications that just want to track task lifecycles without needing to intercept crashes, signals may be a better choice.
The Basics
Exceptions are handled from userspace by creating an exception channel on a
task (thread, process, or job) with the zx_task_create_exception_channel()
system call. The created handle is a standard Zircon
channel, but is created read-only so can only be used
for receiving exception messages.
When an exception occurs, the thread is paused and a message containing a
zx_exception_info_t
and an exception handle is sent to the channel. The
lifetime of the exception is bound to the lifetime of this exception handle, so
when the receiver is done processing, closing this exception handle will resume
the exception. This exception handle is non-copyable, meaning that at any given
time, there is only one handler for this exception.
By default, closing an exception handle will keep the thread paused and send
the exception to the next handler. If the receiver has corrected the exception
and wants the thread to resume execution instead, it can change the exception
state to ZX_EXCEPTION_STATE_HANDLED
via zx_object_set_property()
before
closing.
Exception Handles
Exception handles behave similarly to suspend tokens by keeping the thread paused until they are closed. Additionally, exception handles have functions to help receivers process the exception:
zx_object_set_property()
withZX_PROP_EXCEPTION_STATE
to set behavior on handle closezx_exception_get_thread()
to get a handle to the exception threadzx_exception_get_process()
to get a handle to the exception process (process or job exception channels only)
Task handles retrieved from exceptions will have the same rights as the task
originally passed into zx_task_create_exception_channel()
.
Example
This simple example creates an exception channel and loops reading exceptions until the task closes.
void ExceptionHandlerLoop(zx_handle_t task) {
// Create the exception channel.
uint32_t options = 0;
zx_handle_t channel;
zx_status_t status = zx_task_create_exception_channel(task, options,
&channel);
// ... check status ...
while (true) {
// Wait until we get ZX_CHANNEL_READABLE (exception) or
// ZX_CHANNEL_PEER_CLOSED (task terminated).
zx_signals_t signals = 0;
status = zx_object_wait_one(channel,
ZX_CHANNEL_READABLE | ZX_CHANNEL_PEER_CLOSED,
ZX_TIME_INFINITE, &signals);
// ... check status ...
if (signals & ZX_CHANNEL_READABLE) {
// Read the exception info and handle from the channel.
zx_exception_info_t info;
zx_handle_t exception;
status = zx_channel_read(channel, 0, &info, &exception, sizeof(info), 1,
nullptr, nullptr);
// ... check status ...
// Send the exception out to some other function for processing, which
// returns true if the exception has been handled and we can resume the
// thread, or false to pass the exception to the next handler.
bool handled = process_exception(info, exception);
if (handled) {
uint32_t state = ZX_EXCEPTION_STATE_HANDLED;
status = zx_object_set_property(exception, ZX_PROP_EXCEPTION_STATE,
&state, sizeof(state));
// ... check status ...
}
// Close the exception to finish handling.
zx_handle_close(exception);
} else {
// We got ZX_CHANNEL_PEER_CLOSED, the task has terminated.
zx_handle_close(channel);
return;
}
}
}
Exception Types
At a high level there are two types of exceptions: architectural and synthetic. Architectural exceptions are things like a segfault (e.g., dereferencing the NULL pointer) or executing an undefined instruction. Synthetic exceptions are things like thread start/stop notifications or policy violations.
Architectural and policy exceptions are considered fatal, and will cause the process to be killed if they are unhandled. Debugger-only exceptions - thread start/stop and process start - are informational and will continue execution normally even if the thread isn't explicitly resumed. These exceptions are meant to give a debugger a chance to react to these lifetime events correctly, as the corresponding thread will be paused until the exception is resumed.
Exception types are defined in
<zircon/syscalls/exception.h>
.
Exception Channel Types
Exception channels have different characteristics depending on the task type and
whether the ZX_EXCEPTION_CHANNEL_DEBUGGER
flag is passed to
zx_task_create_exception_channel()
. The table below summarizes the
differences between the various channel types:
Channel Type | get_thread |
get_process |
Architectural & Policy Exceptions | Thread Start/Stop Exceptions | Process Start Exception |
---|---|---|---|---|---|
Thread | X | X | |||
Process | X | X | X | ||
Process Debugger | X | X | X | X | |
Job | X | X | X | ||
Job Debugger | X | X | X |
The channel type also determines the order in which exception channels will be given the chance to handle an exception:
- process debugger
- thread
- process
- process debugger (optionally, if the exception is
'second-chance'
) - job (parent job -> grandparent job -> etc)
If there are no remaining exception channels to try, the kernel terminates the
process as if zx_task_kill()
was called. The return code of a process
terminated by an exception is ZX_TASK_RETCODE_EXCEPTION_KILL
, and can be
obtained with zx_object_get_info()
using ZX_INFO_PROCESS
.
Each task only supports a single exception channel per type, so for example given a process with a debug exception channel attached, trying to create a second debug exception channel will fail, but creating a non-debug channel will succeed.
ZX_EXCP_PROCESS_STARTING
and Job Debugger Channels
The ZX_EXCP_PROCESS_STARTING
behaves differently than other exceptions. It is
only sent to job debugger exception channels, and is always sent to all found
handlers, essentially assuming ZX_EXCEPTION_STATE_TRY_NEXT
regardless of the
actual handler behavior. This is also the only kernel-defined exception that
job debugger channels receive, making them a special-case handler for detecting
new processes. Job debuggers can also receive ZX_EXCP_USER
exceptions, which
can be raised using the zx_thread_raise_exception()
syscall.
Since job debugger channels are considered as "read-only", multiple job debugger
channels (up to ZX_EXCEPTION_CHANNEL_JOB_DEBUGGER_MAX_COUNT
) may be created on
a single job. When multiple debug channels are created on one job, a
ZX_EXCP_PROCESS_STARTING
event will be sent to all channels sequentially, with
the earlier created channels being notified before the later created channels.
User-defined Exceptions
The zx_thread_raise_exception()
syscall can be used to raise a user-defined
exception. These exceptions have the type ZX_EXCP_USER
, with user-defined
values delivered in the synth_code
and synth_data
fields of
zx_exception_context_t
. Currently, user-defined exceptions can be delivered
only to a job debugger exception channel.
Process Debugger First... and Possibly Again Later
In Zircon the process debugger exception channel is tried first. This is useful for at least a few reasons:
- Allows "fix and continue" debugging, e.g. if a thread gets a segfault, the debugger user can fix the segfault and resume the thread without any non-debugger channels seeing the exception.
- Ensures debugger breakpoints get sent directly to the debugger without other handlers having to explicitly pass them along.
If an exception has ZX_EXCEPTION_STRATEGY_SECOND_CHANCE set and remains unhandled after the process exception channel is tried, the process debugger exception channel will be given a second chance. The utility of this lies in the case in which a process listens to its own exceptions and uses that information for correct functioning; in this case, it serves to have debugger inspect in the event of a failed correction.
Interaction with Task Suspension
Exceptions and thread suspensions are treated separately. In other words, a thread can be both in an exception and be suspended. This can happen if the thread is suspended while waiting for a response from an exception handler. The thread stays paused until it is resumed for both the exception and the suspension:
zx_handle_close(exception);
zx_handle_close(suspend_token);
The order does not matter.
Interaction with Task Kill
zx_task_kill()
stops any exception handling on the task. If it is called on
a thread (or its parent process/jobs) while the thread is in an exception:
- the thread will stop waiting for the current exception handler
- no further exception handlers will receive the exception
zx_exception_get_thread()
andzx_exception_get_process()
on the outstanding exception handle will continue to provide valid task handleszx_object_set_property()
to set the exception's state will still returnZX_OK
, though the state won't have any effect since the thread is no longer blocking on the handler
Additionally, a killed thread will still send a ZX_EXCP_THREAD_EXITING
exception (if a process debug handler is registered), but as above will not
wait for a response from the handler.
Although zx_task_kill()
is generally asynchronous, meaning the thread may
not finish terminating by the time the syscall returns, it does synchronously
stop exception handling such that once it returns, closing an exception handle
will not resume the thread or pass the exception to another handler.
Signals
Signals are the core Zircon mechanism for observing state changes on kernel objects (a channel becoming readable, a process terminating, an event becoming signaled, etc).
Unlike exceptions, signals do not require a response from an exception handler. On the other hand signals are sent to whomever is waiting on the thread's handle, instead of being sent to the exception channel that could be bound to the thread's process.
A common pattern in Zircon is to have a message loop that waits for signals on
one or more objects and handles them as they come in. To incorporate exception
handling into this pattern, use zx_object_wait_async()
to wait for
ZX_CHANNEL_READABLE
(and optionally ZX_CHANNEL_PEER_CLOSED
) on the
exception channel:
zx_handle_t port;
zx_status_t status = zx_port_create(0, &port);
// ... check status ...
// Start waiting on relevant signals on the exception channel.
status = zx_object_wait_async(exception_channel, port, kMyExceptionKey,
ZX_CHANNEL_READABLE | ZX_CHANNEL_PEER_CLOSED, 0);
// ... check status ...
// ... add other objects to |port| with wait_async() ...
while (1) {
zx_port_packet_t packet;
status = zx_port_wait(port, ZX_TIME_INFINITE, &packet);
// ... check status ...
if (packet.key == kMyExceptionKey) {
if (packet.signal.observed & ZX_CHANNEL_READABLE) {
// ... extract exception from |exception_channel| and process it ...
// wait_async() is one-shot so we need to reload it to continue
// receiving signals.
status = zx_object_wait_async(
exception_channel, port, kMyExceptionKey,
ZX_CHANNEL_READABLE | ZX_CHANNEL_PEER_CLOSED, 0);
// ... check status ...
} else {
// Got ZX_CHANNEL_PEER_CLOSED, task has terminated.
zx_handle_close(exception_channel);
}
} else {
// ... handle other objects added to |port| ...
}
}
Comparison with Posix (and Linux)
This table shows equivalent terms, types, and function calls between Zircon and Posix/Linux for exceptions and the kinds of things exception handlers generally do.
Zircon | Posix/Linux |
---|---|
Exception/Signal | Signal |
ZXEXCP* | SIG* |
zx_task_create_exception_channel() | ptrace(ATTACH,DETACH) |
zx_task_suspend() | kill(SIGSTOP),ptrace(KILL(SIGSTOP)) |
zx_handle_close(suspend_token) | kill(SIGCONT),ptrace(CONT) |
zx_handle_close(exception) | kill(SIGCONT),ptrace(CONT) |
zx_task_kill() | kill(SIGKILL) |
zx_thread_raise_exception() | kill(SIGUSR1) |
N/A | kill(everything_else) |
TBD | signal()/sigaction() |
zx_port_wait() | wait*() |
various | W*() macros from sys/wait.h |
zx_exception_info_t | siginfo_t |
zx_exception_context_t | siginfo_t |
zx_thread_read_state() | ptrace(GETREGS,GETREGSET) |
zx_thread_write_state() | ptrace(SETREGS,SETREGSET) |
zx_process_read_memory() | ptrace(PEEKTEXT) |
zx_process_write_memory() | ptrace(POKETEXT) |
Zircon does not have non-fatal asynchronous signals like SIGINT
, SIGQUIT
, SIGTERM
,
SIGUSR1
, SIGUSR2
, and so on.
Another significant difference from Posix is that in Zircon a thread cannot handle its own exceptions, since Zircon exception handling is a synchronous operation driven by userspace rather than an asynchronous callback invoked by the kernel.
Examples
Zircon code that uses exceptions can be viewed for further examples, including:
src/bringup/bin/pwrbtn-monitor/crashsvc
: system-level crash handlersystem/utest/exception
: exception unit testssystem/utest/debugger
: debugger-related functionality unit tests
See Also
zx_task_create_exception_channel()
zx_exception_get_thread()
zx_exception_get_process()
zx_object_set_property()
zx_port_wait()