Documenting performance tests

Each performance test should come with an English-language description of what the test measures, typically in a single sentence.

For example, "Measure the time taken for an IPC round trip between processes, using Zircon channels".

Trivial tests (primarily one-liners) whose code does not need summarizing are an exception. For a family of very similar tests, a single description is enough.

It may also be useful to say what the intended use cases for the test are. See the list of potential use cases for performance tests.

Rationale

A description matters more for performance tests than for correctness tests (pass-or-fail tests), because performance results are more subtle to interpret than pass-or-fail results.

For example, if your CL breaks a correctness test so that the test always fails, it is unambiguous that you must either find a way to make the test pass or remove the test. In contrast, if your CL makes a test 10% slower, it is much less clear whether that matters. If a change has made a test 50% slower 1% of the time, it is even less clear whether that should be considered important.
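The arithmetic behind the "50% slower 1% of the time" case can be sketched as follows. This is an illustration only: the timings are made-up numbers, and `percent_change` is a hypothetical helper, not part of any Fuchsia tooling. It shows how the same change looks very different depending on which summary statistic is reported.

```python
import statistics

# Hypothetical timings in microseconds; these numbers are illustrative only.
baseline = [100.0] * 100
# After the change, 1 run in 100 (1%) is 50% slower; the rest are unchanged.
changed = [100.0] * 99 + [150.0]


def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return 100.0 * (new - old) / old


mean_change = percent_change(statistics.mean(baseline), statistics.mean(changed))
median_change = percent_change(statistics.median(baseline), statistics.median(changed))
max_change = percent_change(max(baseline), max(changed))

print(f"mean: {mean_change:+.1f}%, median: {median_change:+.1f}%, max: {max_change:+.1f}%")
```

With these inputs the mean moves by 0.5%, the median does not move at all, and the maximum moves by 50%. Whether the change counts as a regression therefore depends on what the test is meant to measure, which is what the description should state.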

Furthermore, more kinds of change to the codebase can affect the results of a performance test than can affect the pass-or-fail result of a correctness test. To be more precise, many changes that would not affect correctness would still affect performance.

This tends to mean that more people need to interpret performance results than pass-or-fail results of a given test. For example, if a change to component A causes a regression in component B's performance tests, the meaning of those performance tests may need to be interpreted by the maintainers of component A and the maintainers of component B, and by other people triaging post-commit performance regression issues.

As a result, there should be a higher bar for documenting performance tests than for correctness tests.

The description of what a test measures will often be much shorter than the test code, so providing one can save developers considerable time that they would otherwise spend reading the code to work out what it is intended to measure.

Location

The description of a performance test can be put in comments in the test code, or in nearby Markdown files.
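As a minimal sketch of a description placed in the test code, the hypothetical benchmark below carries its description as a docstring. The function name, the use of Python's standard `queue.Queue` as a stand-in for a channel, and the stated use case are all assumptions made for illustration; this is not a Fuchsia test or a Fuchsia API.

```python
import queue
import time


def queue_write_read_benchmark(iterations: int = 1000) -> list:
    """Measure the time taken to enqueue and then dequeue one message
    from a queue.Queue, on a single thread. This does not involve any
    cross-thread wakeups.

    Intended use: detecting regressions in single-threaded queue overhead.
    """
    q = queue.Queue()
    times = []
    for _ in range(iterations):
        start = time.perf_counter()
        q.put(b"x")  # enqueue one message
        q.get()      # dequeue it again on the same thread
        times.append(time.perf_counter() - start)
    # One elapsed time in seconds per iteration.
    return times
```

The first sentence of the docstring says what is measured, the second rules out a plausible misreading, and the last line records an intended use case, so a reader triaging a regression does not need to read the loop body.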

We currently do not have a browsable list of performance tests and their descriptions, or a way to extract test descriptions from the code, but we might add one of these in the future.

Examples

Examples from Fuchsia microbenchmarks:

"Test IPC round trips using Zircon channels where the client and server both use Zircon ports to wait."

(Source: microbenchmarks for IPC round trip times, src/tests/microbenchmarks/round_trips.cc)

"Measure the times taken to enqueue and then dequeue a message from a Zircon channel, on a single thread. This does not involve any cross-thread wakeups."

(Source: microbenchmark for channels, src/tests/microbenchmarks/channels.cc)

Example from storage performance tests:

"Measures the time to write blocks of size 8KiB in turn to a newly-created file, using the system filesystem, without any explicit syncing or flushing. The resulting metric is the time for each per-block write to return."

(Source: old version of garnet/bin/odu/README.md; the current description is longer)