Out-of-tree component testing support

Project leads: shayba@google.com, crjohns@google.com
Area: Testing

Problem statement

Missing platform surface for testing

Software developers targeting Fuchsia can write components, build them, and test them. However, these critical developer journeys are fully and continuously tested only for in-tree developers, and not at all for out-of-tree developers.

Today there are several teams that develop and test components out-of-tree. We sometimes refer to these teams as “partners”, because the Fuchsia team maintains a close engagement with them. This arrangement is costly to maintain and impossible to scale for the following reasons:

Out-of-tree testing relies on deprecated platform features, protocols, and tools. Most notably: using the SSH protocol to issue commands on the target (such as with fx shell or fssh), dash as a system interface, fx log to collect system-wide logs during the test, and using the SCP protocol (such as with fx scp) to collect other test artifacts as side effects on a global mutable filesystem. These produce unreliable behavior that leads to flakes and is difficult to troubleshoot, which can be at least partially attributed to the brittle nature of text-based protocols. Furthermore, the transport affords only a single character stream, which is a good fit for a system-wide log (e.g. serial log or syslog) but not for multiple log streams or large binary artifacts such as state dumps and screenshots from tests.
Various text-based protocols that lack methods and practices for ensuring ABI stability or orchestrating ABI evolution.
Tests written as legacy components (aka CFv1), which enjoy a lesser degree of isolation, suffer from a greater degree of flakiness, and don’t benefit from new testing tools.
Bespoke, inconsistent rules and scripts for defining tests and capturing their results and associated diagnostics.
Inconsistent support for various test frameworks. For instance, while all out-of-tree partners support C++ and GoogleTest, only some partners support Dart, and none support Rust despite its popularity for in-tree component development.
Inconsistent support for additional instrumentation that adds more value to tests, such as sanitizers and coverage.
Some tools and source code are distributed to partners not via the Fuchsia IDK, the set of SDK tools. For instance, the TestWithEnvironment helper class is manually copied to the Flutter repository on GitHub to unblock integration testing needs.

To overcome these issues, the Fuchsia team has offered dedicated support to key partners. This arrangement often produces tailored solutions that aren’t portable between different customers. Furthermore, support for customer issues often happens inside the customer’s source repository, which may be convenient for the customer but doesn’t scale to supporting a general public audience of developers.

We now have a breadth of experiences and observations from the wild to inform us on how to create more generalized testing solutions. The time is right to take these insights and build out platform support for these use cases, thus creating a more capable SDK as well as reducing and removing bespoke solutions that carry a fixed maintenance cost over a diminishing value proposition.

Platform/Infra surface for testing

Fuchsia’s in-house testing infrastructure (aka “infra”) exhibits most of the same problems listed above and is affected in similar ways to Fuchsia partners. Since Fuchsia’s infra doesn’t continuously exercise platform solutions and SDK tools for testing, there is a missed opportunity for continuous quality assurance for said solutions and tools.

A growing need for out-of-tree testing

There are several current and upcoming projects that are expected to increase the scope of out-of-tree development & testing targeting Fuchsia. These include:

Compatibility Test Suite (CTS) tests will be able to run outside of Fuchsia’s in-tree build & test system, though their source code will be hosted on fuchsia.git.
Flutter-on-Fuchsia Velocity expects to build and test a Flutter embedder on Fuchsia and a Flutter runner for components and their tests out-of-tree, with at least some integration tests being upstreamed to the Flutter project.
Drivers as Components will include a demonstration of a driver built and tested out-of-tree, to realize the promise of robust hardware support on Fuchsia via driver ABI stability.
Support for running existing tests on Fuchsia in LLVM and Rust projects will require out-of-tree C++ and Rust testing support.

Currently these projects cannot be successfully completed as they depend on missing support for out-of-tree testing.

Solution statement

We will create a platform solution for testing that works exclusively based on tools and protocols that are publicly available in the Fuchsia SDK.

Host-side

We will use FFX as the entry point for out-of-tree testing. We will finish developing ffx test to handle all host-side aspects of testing. We will rely on the established FFX technologies and practices, such as configuration management, target device discovery, and the Overnet communication suite.

We will replace existing host tools with ffx tools. Tools such as testrunner and Botanist currently perform tasks in Fuchsia CI/CQ - such as device discovery, device setup, test orchestration, test artifact collection - that can be incrementally handed off to ffx. Some of these handoffs will require building equivalent ffx plugins up to parity, for instance bringing up ffx test support for running Bringup tests over a serial connection. The payoff is that we'd get to continuously verify our work on modern tools that are portable between in-tree and out-of-tree use cases using our existing rich and robust corpus of in-tree tests and in-tree automation.

We will port aspects of working with sanitizers and test coverage from tools that are only available in-tree such as tefmocheck and covargs to ffx plugins.

Target-side

We will grow the Test Runner Framework (TRF) to accommodate the needs of out-of-tree testing.

TRF includes an on-device Overnet daemon, a component to manage/schedule tests, an isolated realm for hermetic testing, a selection of test runners that support a variety of languages and frameworks for writing tests, and FIDL protocols to connect all of the above. TRF supports both in-tree and out-of-tree testing workflows. It replaces a test runtime that only worked in-tree and only supported CFv1 components.

The priority customer for TRF so far has been in-tree testing, with success measured in terms of the portion of tests that run on TRF. At the time of writing more than 70% of Fuchsia in-tree tests have been migrated to TRF, with modern (CFv2) tests running exclusively on TRF. By the end of 2021 we expect all remaining tests except ZBI tests to run under TRF, thanks to an upcoming compatibility layer.

Once all component tests are migrated to TRF, we will turn down the legacy target-side v1-only in-tree test runtime. This will allow us to focus on improving the new testing runtime, improvements that will benefit both in-tree and out-of-tree developers.

To improve the developer experience, out-of-tree developers will be able to include test runners from the SDK. Through this mechanism, out-of-tree developers will have access to the existing inventory of test runners (gtest, Rust, Go, and arbitrary ELF binaries) as well as the upcoming Dart and Flutter test runners. TRF will also provide the foundation for developing more advanced testing strategies, such as stress tests and CTS tests. These testing strategies will be expressed as runners that may also be provided in the SDK.

In addition, out-of-tree developers will be able to create and use test runners of their own. This is thought to be feasible today, but it has not been demonstrated yet. We should create a first out-of-tree test runner so we can speak about this workflow with more confidence.

Test execution control

We will finish the development and rollout of new protocols to control test execution.

The new protocols are defined in FIDL (allowing for ABI stability and evolution, which is crucial for out-of-tree use) and are natively carried by Overnet. The new protocols don’t have specific knowledge of Fuchsia infra, thereby sharpening the so-called platform/infra contract.

The new protocols allow for better layering and separation of concerns. For instance the host side is responsible for test selection and requesting execution on the target. The target-side test manager takes responsibility for actually running the tests, as opposed to the legacy system in which a host tool manually executes each test individually on a target device. The responsibility of parallelizing test execution to maximize resource utilization is transferred to the target, which is better equipped to handle this responsibility.
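The division of responsibilities described above can be sketched in a few lines. This is a minimal illustration only, not the real protocol or any ffx or test manager API: the names select_tests and TargetTestManager, the glob-style patterns, and the max_parallel parameter are all hypothetical. The host only decides which tests to request, while the target-side stand-in decides how to schedule them.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatch


def select_tests(available, patterns):
    """Host side (hypothetical): choose which tests to request. Runs nothing."""
    return [t for t in available if any(fnmatch(t, p) for p in patterns)]


class TargetTestManager:
    """Target side (hypothetical stand-in): runs whatever the host requested."""

    def __init__(self, runners, max_parallel=2):
        # runners maps a test name to a callable that returns True on pass.
        self._runners = runners
        self._max_parallel = max_parallel

    def run(self, requested):
        # The target, not the host, decides how much to parallelize.
        with ThreadPoolExecutor(max_workers=self._max_parallel) as pool:
            outcomes = pool.map(lambda name: self._runners[name](), requested)
            return dict(zip(requested, outcomes))


if __name__ == "__main__":
    runners = {"net/a": lambda: True, "net/b": lambda: False, "ui/c": lambda: True}
    requested = select_tests(sorted(runners), ["net/*"])
    print(TargetTestManager(runners).run(requested))
```

In this sketch the host never iterates over individual tests to execute them, which is the contrast with the legacy system described above.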

Lastly, the new protocols are not based on a character stream (SSH). This allows for information to flow both ways with multiple streams of instructions, results, and diagnostics, as well as large test inputs and outputs that may be in binary format.
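To illustrate why a multiplexed transport handles this better than a single character stream, here is a minimal sketch of length-prefixed framing with stream identifiers. It is a generic illustration and makes no claim about Overnet's actual wire format: the header layout, encode_frame, and demux are all assumptions made for this example.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: 1-byte stream id, 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its stream id and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def demux(data: bytes) -> dict:
    """Split interleaved frames back into one byte string per stream."""
    streams = defaultdict(bytes)
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        streams[stream_id] += data[offset:offset + length]
        offset += length
    return dict(streams)


if __name__ == "__main__":
    wire = (
        encode_frame(1, b"case started\n")
        + encode_frame(2, b"\x89PNG\x00\xff")  # binary artifact bytes
        + encode_frame(1, b"case passed\n")
    )
    for stream_id, content in demux(wire).items():
        print(stream_id, content)
```

Because each frame carries its own length, binary payloads can be interleaved with text without escaping or delimiter scanning, which a single text stream cannot offer.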

Test results

Test results will no longer be constrained to a suite-level pass/fail outcome, but will be enumerated in fine detail and with structured format. Diagnostics collected during the test, such as logs captured from the test realm throughout the duration of the test, will be organized in a manner associated with the tests that produced them. There will be standard support for other artifacts from tests, such as profiles collected during test runtime or large test outputs such as screenshots taken during the test. The schema for the result format will be published in the SDK to support processing by out-of-tree tools.
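As a rough picture of what such structured results could look like, the sketch below models a suite with per-case outcomes and artifacts attached to the case that produced them. The schema itself is not specified in this document, so every type name, field name, and the example URL here are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class Outcome(str, Enum):
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


@dataclass
class Artifact:
    kind: str  # e.g. "log", "profile", "screenshot" (hypothetical kinds)
    path: str  # location relative to a results directory


@dataclass
class CaseResult:
    name: str
    outcome: Outcome
    artifacts: List[Artifact] = field(default_factory=list)


@dataclass
class SuiteResult:
    url: str
    cases: List[CaseResult] = field(default_factory=list)

    @property
    def outcome(self) -> Outcome:
        # The suite-level outcome is derived from the cases, not stored alone.
        if any(c.outcome is Outcome.FAILED for c in self.cases):
            return Outcome.FAILED
        if self.cases and all(c.outcome is Outcome.SKIPPED for c in self.cases):
            return Outcome.SKIPPED
        return Outcome.PASSED

    def to_json(self) -> str:
        data = asdict(self)
        data["outcome"] = self.outcome.value
        return json.dumps(data, indent=2)


if __name__ == "__main__":
    suite = SuiteResult(
        url="fuchsia-pkg://example.com/demo#meta/demo_test.cm",  # made-up URL
        cases=[
            CaseResult("Login.Succeeds", Outcome.PASSED),
            CaseResult(
                "Login.RendersForm",
                Outcome.FAILED,
                [Artifact("screenshot", "Login.RendersForm/screen.png")],
            ),
        ],
    )
    print(suite.to_json())
```

An out-of-tree tool consuming output of this shape can find the screenshot for a failing case by walking the structure rather than scraping a log.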

Documentation

Developer guides such as this will be reviewed, edited, and simplified to be useful to out-of-tree developers. In-tree and out-of-tree testing workflows will be unified such that these guides won’t have information specific to Fuchsia in-tree/out-of-tree developers, or separate sections for such different audiences.

A new onboarding guide detailing the "Journey of Testing" will be developed. This guide will provide an entry point for developers who need to ensure their code is properly tested with both unit and integration tests. The goals of this guide are to 1) aid developers in making the correct choice about which type of test to write, and 2) quickly bring developers to a point where their tests are prepared to take advantage of more advanced testing strategies (such as CTS and Stress Testing).

Virtualization support

Virtual targets such as emulators are very useful and popular for testing. Fuchsia currently offers to download a distribution of qemu that has been tested to work with Fuchsia. However there are additional tools for working with virtualized targets, such as fx qemu and fx gce, that are only available in-tree.

We will map the gap between in-tree and out-of-tree support for running Fuchsia on virtualized targets, and address it to close testing workflow gaps as needed.

Dependencies

The FFX tool and associated stack.
The Fuchsia IDK, and any SDK frontends used by out-of-tree developers.
Making RealmBuilder available out-of-tree.
Exposing RealmBuilder via the SDK. This includes the underlying protocol, and at least one client library.
Extending RealmBuilder to support additional languages.

Risks and mitigations

N/A

Not in scope

End-to-end testing (aka system testing)

This proposal focuses on component testing, which takes the form of unit testing a single component or integration testing that spans multiple components. System tests, also known as end-to-end (or e2e) tests, don't exercise specific component instances but rather the entire system under test. As such, they differ from component tests in many ways, including developer needs and use cases, and the platform's ability to offer isolation between e2e tests.

The current popular solution for e2e tests is SL4F (Scripting Layer for Fuchsia). The implementation includes a target-side daemon that is included in some Fuchsia images and can perform a documented set of system automation tasks, and a client library written in Dart that is available in the Fuchsia SDK.

Additionally, there exist CFv1 component tests that are arguably system tests. This is possible because the legacy CFv1 test runtime allows accessing real system services, as a legacy compromise that is intentionally not supported in CFv2 testing. For the purpose of this discussion we consider such strictly non-hermetic CFv1 component tests to be system tests as well.

Current challenges in scaling e2e test development include:

Adding facades to SL4F requires changing platform code and redistributing Fuchsia system images. Out-of-tree developers cannot extend SL4F's capabilities to automate a system.
SL4F does not rely on ffx, but rather uses its own transport layer, protocols, target discovery, and configuration. These differences add to the ongoing maintenance cost and introduce inconsistencies and friction in the developer experience.
Only a Dart client library is provided.

Since system tests are so different from component tests, this topic is covered in a separate roadmap document.