Zircon has a microkernel-style design. One complication of microkernel designs is how to bootstrap the initial userspace processes. Often this is accomplished by having the kernel implement minimal versions of filesystem reading and program loading just for the purpose of bootstrapping, even when those kernel facilities are never used after boot time. Zircon takes a different approach.
Boot loader and kernel startup
A boot loader loads the kernel into memory and transfers control to the kernel's startup code. The details of the boot loader protocols are not described here. The boot loaders used with Zircon load both the kernel image and a data blob in Zircon Boot Image format. The ZBI format is a simple container format that embeds items passed by the boot loader, including hardware-specific information, the kernel "command line" giving boot options, and RAM disk images (which are usually compressed). The kernel extracts some essential information for its own use in the early stages of booting.
BOOTFS
One of the items embedded in the Zircon Boot Image is an initial RAM disk filesystem image. The image is usually compressed using the zstd format. Once decompressed, the image is in BOOTFS format. This is a trivial read-only filesystem format that simply lists file names and, for each file, the offset and size within the BOOTFS image (both values must be page-aligned and are limited to 32 bits).
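To make the idea of a name-to-(offset, size) directory concrete, here is a minimal sketch of a toy image in that spirit. The entry layout (`ENTRY`), the 4096-byte `PAGE_SIZE`, and all function names are assumptions made for illustration; this is not the real BOOTFS on-disk layout. The sketch checks alignment on the offset and records each file's exact byte length, which is a simplification.

```python
import struct

PAGE_SIZE = 4096  # assumed page size for this sketch

# Hypothetical directory entry: name length, data offset, data size (all
# 32-bit), followed by the name bytes. NOT the real BOOTFS layout.
ENTRY = struct.Struct("<III")


def page_align(n):
    """Round n up to the next multiple of PAGE_SIZE."""
    return (n + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def build_image(files):
    """Pack {name: bytes} into a directory followed by page-aligned file data."""
    names = sorted(files)
    dir_len = 4 + sum(ENTRY.size + len(n.encode()) for n in names)
    offset = page_align(dir_len)
    directory = struct.pack("<I", len(names))
    blobs = b""
    for name in names:
        raw = name.encode()
        data = files[name]
        # struct refuses values that do not fit in 32 bits.
        directory += ENTRY.pack(len(raw), offset, len(data)) + raw
        padded = data.ljust(page_align(len(data)), b"\0")
        blobs += padded
        offset += len(padded)
    return directory.ljust(page_align(dir_len), b"\0") + blobs


def read_directory(image):
    """Return {name: (offset, size)} from the toy directory."""
    (count,) = struct.unpack_from("<I", image, 0)
    pos = 4
    entries = {}
    for _ in range(count):
        name_len, offset, size = ENTRY.unpack_from(image, pos)
        pos += ENTRY.size
        name = image[pos:pos + name_len].decode()
        pos += name_len
        if offset % PAGE_SIZE != 0:
            raise ValueError(f"{name}: offset is not page-aligned")
        entries[name] = (offset, size)
    return entries


def read_file(image, entries, name):
    """Return the contents of one file by slicing the image."""
    offset, size = entries[name]
    return image[offset:offset + size]
```

Because every file starts on a page boundary, a reader can hand out each file as a range of whole pages without copying, which is what makes such a format cheap to serve.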
The primary BOOTFS image contains everything that the userspace system needs to run: executables, shared libraries, and data files. These include the implementations of device drivers and more advanced filesystems that make it possible to read more code and data from storage or network devices.
After the system has bootstrapped itself, the files in the primary BOOTFS
become the read-only filesystem tree rooted at /boot (and served by
component manager).
Kernel loads userboot
The kernel does not include any code for decompressing zstd format, nor
any code for interpreting the BOOTFS format. Instead, all of this work
is done by the first userspace process, called userboot.
userboot is a normal userspace process. It can only make the standard
system calls through the vDSO like any other process would, and is
subject to the full vDSO enforcement regime. What's special about
userboot is the way it gets loaded.
userboot is built as an ELF dynamic shared object, using the same RODSO
layout as the vDSO. Like the vDSO, the userboot ELF image is embedded in
the kernel at compile time. Its simple layout means that loading it does
not require the kernel to interpret ELF headers at boot time. The kernel
only needs to know three things: the size of the read-only segment, the
size of the executable segment, and the address of the userboot entry
point. At compile time, these values are extracted from the userboot ELF
image and used as constants in the kernel code.
Like any other process, userboot must start with the vDSO already mapped
into its address space so it can make system calls. The kernel maps both
userboot and the vDSO into the first user process, and then starts it
running at the userboot entry point.
Kernel sends processargs message
In normal program loading, a bootstrap message is sent to each new process. The process's first thread receives a channel handle in a register. It can then read data and handles sent by its creator.
The kernel uses the exact same protocol to start userboot. The kernel
command line is split into words that become the environment strings in the
bootstrap message. All the handles that userboot itself will need, and
that the rest of the system will need to access kernel facilities, are
included in this message. Following the normal format, handle info
entries describe the purpose of each handle. These include the PA_VMO_VDSO
handle.
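As a rough illustration of the message contents described above, the sketch below splits a command line into environment strings and pairs each handle with an info entry. The dictionary shape, the function names, and the use of plain strings and integers for handle info and handles are all assumptions for illustration; the real processargs message is a binary format.

```python
def cmdline_to_environ(cmdline):
    """Split a kernel command line on whitespace into environment strings."""
    return cmdline.split()


def build_bootstrap_message(cmdline, handles):
    """Assemble a toy bootstrap message.

    `handles` is a list of (info, handle) pairs. In this sketch the info
    entry is just a label such as "PA_VMO_VDSO" and the handle is an
    integer placeholder.
    """
    return {
        "environ": cmdline_to_environ(cmdline),
        # Info entries and handles stay in matching order, so entry i
        # describes the purpose of handle i.
        "handle_info": [info for info, _ in handles],
        "handles": [handle for _, handle in handles],
    }
```

Keeping the info entries parallel to the handles lets the receiver discover what each handle is for without relying on a fixed ordering.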
userboot finds system calls in the vDSO
The standard convention for informing a new process of its vDSO mapping
requires the process to interpret the vDSO's ELF headers and symbol table
to locate system call entry points. To avoid this complexity, userboot
finds the entry points in the vDSO in a different way.
When the kernel maps userboot into the first user process, it chooses
a random location in memory, just as normal program loading does.
However, when it maps the vDSO in, it doesn't choose another random
location as is normal. Instead, it places the vDSO image immediately
after the userboot image in memory. This way, the vDSO code is always
at fixed offsets from the userboot code.
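The placement rule can be sketched as a small address calculation. The page size, the address-space bounds, and the `place_images` helper are assumptions for illustration and do not reflect how the kernel actually picks addresses.

```python
import random

PAGE_SIZE = 4096  # assumed page size for this sketch


def place_images(userboot_size, vdso_size, aspace_base, aspace_size, rng=random):
    """Pick one random page-aligned base for userboot and put the vDSO
    immediately after it, so the distance between them is fixed."""
    for size in (userboot_size, vdso_size, aspace_base, aspace_size):
        if size % PAGE_SIZE != 0:
            raise ValueError("sizes and base must be page-aligned")
    span = userboot_size + vdso_size
    if span > aspace_size:
        raise ValueError("images do not fit in the address space")
    # Number of page-aligned positions where the combined span still fits.
    slots = (aspace_size - span) // PAGE_SIZE + 1
    userboot_base = aspace_base + rng.randrange(slots) * PAGE_SIZE
    vdso_base = userboot_base + userboot_size  # not randomized separately
    return userboot_base, vdso_base
```

Only one random choice is made, so `vdso_base - userboot_base` always equals the userboot image size regardless of where the pair lands.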
At compile time, the symbol table entries for all the system call entry
points are extracted from the vDSO ELF image. These are then massaged
into linker script symbol definitions that use each symbol's fixed
offset into the vDSO image to define that symbol at that fixed offset
from the linker-provided _end symbol. In this way, the userboot code can
make direct calls to each vDSO entry point in the exact location it will
appear in memory after the userboot image itself.
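A build step of this kind could look roughly like the following sketch, which turns a table of symbol offsets into linker script assignments relative to _end. The function name, the exact output syntax, and the offsets used in the usage note are assumptions; the real build tooling may differ, and the sketch assumes _end coincides with the end of the mapped userboot image.

```python
def linker_script_symbols(vdso_symbols):
    """Render {symbol name: offset into vDSO image} as linker script
    assignments that place each symbol at `_end + offset`."""
    lines = []
    # Emit in address order so the output is stable and easy to diff.
    for name, offset in sorted(vdso_symbols.items(), key=lambda kv: kv[1]):
        lines.append(f"{name} = _end + {offset:#x};")
    return "\n".join(lines)
```

For instance, passing `{"zx_channel_read": 0x5c0}` (an invented offset) yields the line `zx_channel_read = _end + 0x5c0;`, which the linker resolves once the size of the userboot image is known.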
userboot decompresses BOOTFS
The first thing userboot does is to read the bootstrap message sent by
the kernel. Among the handles it gets from the kernel is one with
handle info entry PA_HND(PA_VMO_BOOTDATA, 0). This is a VMO containing
the ZBI from the boot loader. userboot reads the ZBI headers from this VMO
looking for the first item with type ZBI_TYPE_STORAGE_BOOTFS. That
contains the BOOTFS image. The item's ZBI header indicates if it's
compressed, which it usually is. userboot maps in this portion of the VMO.
userboot contains zstd format support code, which it uses to decompress
the item into a fresh VMO.
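The scan-then-decompress step can be sketched with a toy container. The header layout (`ITEM_HEADER`), the type and flag values, and the 8-byte padding are hypothetical and are not the real ZBI definitions. The Python standard library has no zstd module in most versions, so zlib stands in for zstd here.

```python
import struct
import zlib

# Hypothetical item header: type, payload length, flags (32-bit each).
ITEM_HEADER = struct.Struct("<III")
ALIGN = 8            # assumed item alignment
TYPE_CMDLINE = 1     # placeholder type values, not real ZBI constants
TYPE_BOOTFS = 2
FLAG_COMPRESSED = 1  # placeholder flag bit


def round_up(n):
    return (n + ALIGN - 1) // ALIGN * ALIGN


def pack_item(item_type, payload, flags=0):
    """Serialize one item: header, payload, padding up to ALIGN."""
    body = ITEM_HEADER.pack(item_type, len(payload), flags) + payload
    return body.ljust(round_up(len(body)), b"\0")


def find_first(container, wanted_type):
    """Walk the headers and return (flags, payload) of the first match."""
    pos = 0
    while pos + ITEM_HEADER.size <= len(container):
        item_type, length, flags = ITEM_HEADER.unpack_from(container, pos)
        start = pos + ITEM_HEADER.size
        if start + length > len(container):
            raise ValueError("truncated item")
        if item_type == wanted_type:
            return flags, container[start:start + length]
        pos = round_up(start + length)  # skip payload and padding
    return None


def extract_bootfs(container):
    """Return the BOOTFS payload, decompressing it if flagged."""
    found = find_first(container, TYPE_BOOTFS)
    if found is None:
        raise LookupError("no BOOTFS item in container")
    flags, payload = found
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(payload)  # zlib stands in for zstd
    return payload
```

Only headers are inspected while scanning, so items of other types are skipped without touching their payloads.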
userboot loads the first "real" user process from BOOTFS
Next, userboot examines the environment strings it received from the
kernel, which represent the kernel command line. If there is a string
userboot.next=file+optional_arg1+optional_arg2=foo+... then file
will be loaded as the first real user process with the '+' separated
arguments passed to it. If no such option is present, the default file is
bin/component_manager+--boot. The files are found in the BOOTFS image.
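The option parsing described above can be sketched as follows. The `parse_next` name is hypothetical, and the choice that the last occurrence wins when the option is repeated is an assumption the text does not confirm.

```python
PREFIX = "userboot.next="
DEFAULT_NEXT = "bin/component_manager+--boot"


def parse_next(environ):
    """Return (file, args) for the first real user process."""
    value = DEFAULT_NEXT
    for entry in environ:
        if entry.startswith(PREFIX):
            # Assumption: a later occurrence overrides an earlier one.
            value = entry[len(PREFIX):]
    # Only '+' separates arguments, so an argument may itself contain '='.
    filename, *args = value.split("+")
    return filename, args
```

With no option present, the result is `("bin/component_manager", ["--boot"])`, matching the default given in the text.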
To load the file, userboot implements a full-featured ELF program loader.
Usually the file being loaded is a dynamically-linked executable with a
PT_INTERP program header. In this case, userboot looks for the file
named in PT_INTERP and loads that instead.
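The PT_INTERP check can be illustrated with a few lines that read the ELF program headers. This is a minimal sketch that assumes a 64-bit little-endian ELF file held in memory; the `find_interp` and `file_to_load` names and the dictionary model of BOOTFS are hypothetical, and a real loader does far more validation.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header
PT_INTERP = 3


def find_interp(elf):
    """Return the PT_INTERP path of a 64-bit little-endian ELF, or None."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("sketch handles only ELFCLASS64, little-endian")
    fields = EHDR.unpack_from(elf, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        phdr = PHDR.unpack_from(elf, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz = phdr[0], phdr[2], phdr[5]
        if p_type == PT_INTERP:
            # The segment holds a NUL-terminated path string.
            return elf[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None


def file_to_load(bootfs, name):
    """Pick the file to load first: the interpreter if one is named,
    otherwise the executable itself. `bootfs` maps names to bytes."""
    interp = find_interp(bootfs[name])
    return interp if interp is not None else name
```

A statically linked executable has no PT_INTERP header, so `file_to_load` returns the executable's own name in that case.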
Then userboot loads the vDSO at a random address. It starts the new
process with the standard conventions, passing it a channel handle and the
vDSO base address. On that channel, userboot sends the standard
processargs messages. It passes on all the important handles it received
from the kernel (replacing specific handles such as the process-self and
thread-self handles with those for the new process rather than for
userboot itself).
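The handle forwarding rule amounts to a substitution over the received handle table. In this sketch the info entries are plain string labels and the handles are integer placeholders; the labels "PA_PROC_SELF" and "PA_THREAD_SELF" and the `handles_for_child` helper are assumptions for illustration.

```python
# Assumed labels for the process-self and thread-self info entries.
PROC_SELF = "PA_PROC_SELF"
THREAD_SELF = "PA_THREAD_SELF"


def handles_for_child(received, child_process, child_thread):
    """Forward (info, handle) pairs, swapping in the child's own
    process and thread handles for the self-referential entries."""
    replacements = {PROC_SELF: child_process, THREAD_SELF: child_thread}
    return [(info, replacements.get(info, handle)) for info, handle in received]
```

Every other entry passes through unchanged, so the new process sees the same kernel-provided handles that userboot received.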
userboot loader service
Following the standard program loading protocol, when userboot loads a
program via PT_INTERP, it sends an additional processargs message
before the main message, intended for the use of the dynamic linker. This
message includes a PA_LDSVC_LOADER handle for a channel on which userboot
provides a minimal implementation of the standard loader service.
userboot has only a single thread, which remains in a loop handling
loader service requests until the channel is closed. When it receives a
LOADER_SVC_OP_LOAD_OBJECT request, it looks up the object name prefixed
by lib/ as a file in BOOTFS and returns a VMO of its contents. Thus, the
first "real" user process can be (and usually is) a dynamically linked
executable needing various shared libraries. The dynamic linker, the
executable, and the shared libraries are all loaded from the same BOOTFS
pages that will later appear as files in /boot.
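The request loop can be modeled as follows. The channel is represented by an iterable of (operation, name) pairs whose exhaustion stands for the channel being closed, BOOTFS is a dictionary, and the operation and status strings are placeholders; none of this reflects the real wire protocol.

```python
def serve_loader(requests, bootfs):
    """Answer loader requests one at a time until the request stream ends.

    `requests` yields (op, name) pairs; `bootfs` maps file names to bytes.
    Yields (status, contents) replies in request order.
    """
    for op, name in requests:
        if op != "LOAD_OBJECT":
            # The minimal service handles only object loading.
            yield ("unsupported", None)
            continue
        contents = bootfs.get("lib/" + name)  # object names live under lib/
        if contents is None:
            yield ("not_found", None)
        else:
            yield ("ok", contents)
```

Because the loop handles one request at a time on a single thread, the service needs no locking, and it ends naturally when the client closes its end.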
An executable that will be loaded by userboot (i.e., component manager)
should normally close its loader service channel once it's completed
startup. That lets userboot know that it's no longer needed.
userboot rides off into the sunset
When the loader service channel is closed (or if the executable had no
PT_INTERP and so no loader service was required, then as soon as the
process has been started), userboot no longer has anything to do.
If the userboot.shutdown option was given on the kernel command line,
then userboot waits for the process it started to exit, and then shuts
down the system (as if by the dm shutdown command). This can be useful
to run a single test program and then shut down the machine (or emulator).
For example, the command line userboot.next=bin/core-tests userboot.shutdown
runs the Zircon core tests and then shuts down.
Otherwise, userboot does not wait for the process to exit. userboot
exits immediately, leaving the first "real" user process in charge of
bringing up and taking down the rest of the system.
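The two endings can be summarized in a short decision function. The `after_launch` name and the callback parameters are hypothetical, and treating both a bare userboot.shutdown word and a userboot.shutdown=... form as setting the option is an assumption.

```python
def after_launch(environ, wait_for_exit, power_off):
    """Decide what userboot does once it has nothing left to serve.

    `wait_for_exit` blocks until the started process exits and
    `power_off` shuts the system down; both are caller-supplied stubs.
    """
    shutdown_requested = any(
        entry == "userboot.shutdown" or entry.startswith("userboot.shutdown=")
        for entry in environ
    )
    if shutdown_requested:
        wait_for_exit()
        power_off()
        return "shutdown"
    # Default: exit right away and leave the new process in charge.
    return "exit"
```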