Why Fuchsia devices reboot

This document lists the reasons a Fuchsia device may reboot. Some reasons are self-explanatory, while others require additional context.

Terminology

Ungraceful reboot

An ungraceful reboot is a reboot that is either initiated by the kernel in response to an error, such as a kernel panic, or performed by the hardware without software intervention, such as a hardware watchdog timeout.

Graceful reboot

A graceful reboot is a reboot that is initiated by a userspace process. The process may initiate the reboot in response to an error, like when a device’s temperature is too high, but Fuchsia should have the opportunity to undergo an orderly shutdown.
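The distinction between the two terms can be sketched as a classification by who initiated the reboot. This is a minimal illustration of the definitions above, not platform code; the `Initiator` enum and the `is_graceful` function are hypothetical names introduced here.

```python
from enum import Enum


class Initiator(Enum):
    """Hypothetical labels for who initiated a reboot."""
    KERNEL = "kernel"        # e.g. in response to a kernel panic
    HARDWARE = "hardware"    # e.g. a hardware watchdog timeout
    USERSPACE = "userspace"  # e.g. a process reacting to high temperature


def is_graceful(initiator: Initiator) -> bool:
    """Return True only when a userspace process initiated the reboot."""
    return initiator is Initiator.USERSPACE


print(is_graceful(Initiator.USERSPACE))  # True
print(is_graceful(Initiator.KERNEL))     # False
```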

Reboot reasons listed

Kernel panic

If the kernel is unable to recover from an internal error, that error is considered fatal and the system will reboot.

The system runs out of memory

If the kernel detects that the amount of free physical memory falls below a threshold, the system will reboot. The kernel does not kill processes to try to reclaim memory before rebooting, meaning a single process could cause a system-wide shortage of memory and force the device to reboot.

Cold boot

If a device loses power for long enough between when it shuts down and when it boots back up, the system treats the boot as a cold boot.

Brownout

A device browns out when its voltage dips below an acceptable threshold. This should only occur when there is an issue with a device’s power supply or its power related hardware.

Hardware watchdog timeout

Zircon sets up a hardware watchdog timer that will reboot the device if it is not reset within a specified period of time.
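The reset-or-reboot behavior of a watchdog timer can be illustrated with a small simulation. This is only a toy model under stated assumptions: the `SimulatedWatchdog` class, its method names, and the 10-second timeout are hypothetical and do not reflect how Zircon or any particular hardware implements the timer.

```python
class SimulatedWatchdog:
    """Toy model of a watchdog timer; timestamps are passed in explicitly."""

    def __init__(self, timeout_s: float, now_s: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self.last_reset_s = now_s

    def reset(self, now_s: float) -> None:
        """Record that the system reset the timer at now_s."""
        self.last_reset_s = now_s

    def should_reboot(self, now_s: float) -> bool:
        """Return True once timeout_s has elapsed since the last reset."""
        return now_s - self.last_reset_s >= self.timeout_s


watchdog = SimulatedWatchdog(timeout_s=10.0)
watchdog.reset(now_s=4.0)
print(watchdog.should_reboot(now_s=9.0))   # False: 5 s since the last reset
print(watchdog.should_reboot(now_s=14.0))  # True: 10 s since the last reset
```

As long as `reset` is called more often than the timeout period, `should_reboot` stays false; a hung system stops calling it and the timer eventually fires.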

Software watchdog timeout

A software watchdog timer, if one has been set up, may reboot the device when it times out.

Brief loss of power

If a device loses power for a short period of time, like when a user unplugs a device and rapidly plugs it back in, it may be unable to determine that the reboot was cold and will consider the reboot a result of a brief power loss. Note that there is no quantitative measure of what counts as brief; it is hardware dependent.

User request

A user or a component acting on behalf of a user, such as SL4F or RCS, determines a reboot is necessary.

System update

A component responsible for system updates must update a package, or multiple packages, that cannot be updated ephemerally. These packages are canonically known as base packages.

Retry system update

A component responsible for system updates fails to apply an update, so the device reboots to try again (or possibly revert the update).

ZBI swap

If the Zircon boot image is swapped, the device reboots to apply the change.

High temperature

A component responsible for power management detects that a device's temperature is too high and the system cannot adequately reduce the device's temperature by throttling the CPU or reducing the audio volume.

Session failure

If the session manager is unable to restart a crashed session or a session determines it has failed in an unrecoverable manner, the device reboots.

Critical component failure

If a component marked "on_terminate": "reboot" crashes, the device reboots.

Factory data reset

Following a data reset to the factory defaults, the device reboots.

Root job termination

If the userspace root job is terminated, e.g., because one of its critical processes crashes, the device reboots.

Generic graceful

The platform can determine that the reboot was graceful, but cannot distinguish between a system update, a user request, or a higher-level component detecting that the device is overheating.

Unknown

In some scenarios, the platform can determine neither the specific reboot reason nor whether the reboot was graceful or ungraceful.

Where to find reboot reasons

Fuchsia exposes the reason a device last (re)booted through FIDL and tracks it on Cobalt and the crash server.

Culprits

Reboots that are the result of an error in a specific component have crash signatures that identify that component as the cause of the reboot. They follow a general pattern of combining the reboot reason and the component deemed responsible for the reboot, a.k.a. the culprit.
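The pattern of combining a reboot reason with a culprit can be sketched as a lookup plus string substitution. The signature templates are copied from the table in this section; the `crash_signature` function, the dictionary names, and the culprit name `example_component` are hypothetical, and the exact form that culprit names take in real signatures is not specified here.

```python
from typing import Optional

# Signatures used when no culprit is attributed (from the table in this section).
GENERIC_SIGNATURE = {
    "SYSTEM_OUT_OF_MEMORY": "fuchsia-oom",
    "CRITICAL_COMPONENT_FAILURE": "fuchsia-critical-component-failure",
    "ROOT_JOB_TERMINATION": "fuchsia-root-job-termination",
}

# Templates used when a culprit is known; "{culprit}" stands in for $CULPRIT.
CULPRIT_SIGNATURE = {
    "SYSTEM_OUT_OF_MEMORY": "fuchsia-oom-{culprit}",
    "CRITICAL_COMPONENT_FAILURE": "fuchsia-reboot-{culprit}-terminated",
    "ROOT_JOB_TERMINATION": "fuchsia-reboot-{culprit}-terminated",
}


def crash_signature(reason: str, culprit: Optional[str] = None) -> str:
    """Build a signature for reasons that can carry a culprit."""
    if culprit:
        return CULPRIT_SIGNATURE[reason].format(culprit=culprit)
    return GENERIC_SIGNATURE[reason]


print(crash_signature("SYSTEM_OUT_OF_MEMORY"))
# fuchsia-oom
print(crash_signature("ROOT_JOB_TERMINATION", "example_component"))
# fuchsia-reboot-example_component-terminated
```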

| Reboot reason | FIDL | Cobalt event | Crash signature |
| --- | --- | --- | --- |
| Kernel panic | KERNEL_PANIC | KernelPanic | Function responsible for the crash, exactly like a userspace crash report |
| System running out of memory | SYSTEM_OUT_OF_MEMORY | SystemOutOfMemory | fuchsia-oom or fuchsia-oom-$CULPRIT |
| Cold boot | COLD | Cold | N/A* |
| Brownout | BROWNOUT | Brownout | fuchsia-brownout |
| Hardware watchdog timeout | HARDWARE_WATCHDOG_TIMEOUT | HardwareWatchdogTimeout | fuchsia-hw-watchdog-timeout |
| Software watchdog timeout | SOFTWARE_WATCHDOG_TIMEOUT | SoftwareWatchdogTimeout | fuchsia-sw-watchdog-timeout |
| Brief power loss | BRIEF_POWER_LOSS | BriefPowerLoss | fuchsia-brief-power-loss |
| User request | USER_REQUEST | UserRequest | N/A* |
| System update | SYSTEM_UPDATE | SystemUpdate | N/A* |
| Retry system update | RETRY_SYSTEM_UPDATE | RetrySystemUpdate | fuchsia-retry-system-update |
| ZBI swap | ZBI_SWAP | ZbiSwap | N/A* |
| High temperature | HIGH_TEMPERATURE | HighTemperature | fuchsia-reboot-high-temperature |
| Session failure | SESSION_FAILURE | SessionFailure | fuchsia-session-failure |
| Critical component failure | CRITICAL_COMPONENT_FAILURE | CriticalComponentFailure | fuchsia-critical-component-failure or fuchsia-reboot-$CULPRIT-terminated |
| Factory data reset | FACTORY_DATA_RESET | FactoryDataReset | N/A* |
| Root job termination | ROOT_JOB_TERMINATION | RootJobTermination | fuchsia-root-job-termination or fuchsia-reboot-$CULPRIT-terminated |
| Generic graceful | graceful field set to true | GenericGraceful | fuchsia-undetermined-userspace-reboot |
| Unknown | graceful field not set | Unknown | fuchsia-reboot-log-not-parseable |

* Not a crash.
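The last two rows of the table describe cases where no specific reason is available and only the graceful field distinguishes them. The sketch below models that interpretation, assuming a record with an optional reason and an optional graceful flag. `LastRebootRecord` and `cobalt_event_for_undetermined` are hypothetical names for illustration and are not the FIDL API itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastRebootRecord:
    """Hypothetical stand-in for the last-reboot data exposed over FIDL."""
    reason: Optional[str] = None     # e.g. "KERNEL_PANIC"; None if undetermined
    graceful: Optional[bool] = None  # None models "graceful field not set"


def cobalt_event_for_undetermined(record: LastRebootRecord) -> str:
    """Map a record without a specific reason to its Cobalt event name."""
    if record.reason is not None:
        raise ValueError("record has a specific reason; see the table")
    if record.graceful is True:
        return "GenericGraceful"
    if record.graceful is None:
        return "Unknown"
    # graceful=False with no reason is not listed in the table.
    raise ValueError("combination not listed in the table")


print(cobalt_event_for_undetermined(LastRebootRecord(graceful=True)))  # GenericGraceful
print(cobalt_event_for_undetermined(LastRebootRecord()))               # Unknown
```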