To test for flakiness in CQ, the infrastructure can run a test multiple times and fail the overall build if there is a single failure. This happens automatically when the infrastructure determines there's a small number of tests affected by the commit being tested (according to the build graph).
Format
A change author can tell the infrastructure to run a specific test many times by adding a Multiply footer to the commit message:
Multiply: test_selector
test_selector can be a test name, a substring of a test name, or an RE2 regular expression that matches a test name.
Fuchsia component tests are referenced by package URL:
Multiply: fuchsia-pkg://fuchsia.com/foo_tests#meta/foo_tests.cm
Host tests are referenced by path:
Multiply: host_x64/obj/src/bar_tests.sh
Substrings of test names are also accepted:
Multiply: foo_tests
Multiply: bar_tests
Multipliers may be combined into a single comma-separated line:
Multiply: foo_tests, bar_tests
The all-caps form MULTIPLY is also accepted.
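The selector rules in this section can be sketched in a few lines. This is an illustration only, not the infrastructure's matching code: the name `selector_matches` is hypothetical, Python's `re` module stands in for RE2 (the two syntaxes overlap for simple patterns but are not identical), and letting the pattern match anywhere in the test name is an assumption.

```python
import re


def selector_matches(selector: str, test_name: str) -> bool:
    """Return True if the selector picks out test_name (hypothetical helper)."""
    # A full test name, or any substring of one, matches literally.
    if selector in test_name:
        return True
    # Otherwise try the selector as a regular expression.
    # Assumption: the pattern may match anywhere in the name.
    try:
        return re.search(selector, test_name) is not None
    except re.error:
        # Not a valid pattern, and it did not match literally either.
        return False


tests = [
    "fuchsia-pkg://fuchsia.com/foo_tests#meta/foo_tests.cm",
    "host_x64/obj/src/bar_tests.sh",
]
print([t for t in tests if selector_matches("foo_tests", t)])
print([t for t in tests if selector_matches(r"host_x64/.*_tests\.sh", t)])
```

Trying the literal substring first keeps names that contain regex metacharacters, such as the `.` in a package URL, from being misread as patterns.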
Example uses of Multiply from real changes:
Multiply: driver_development_test
Multiply: ffx_daemon_target_lib_test
Multiply: virtual-keyboard-test
Multiply: text_manager_integration_test: 10
Run count
By default, the infrastructure uses historical test duration data to calculate a number of runs. The number of runs is chosen to produce a single multiplied test shard whose duration is similar to the expected duration of the other shards, up to a maximum of 2000 test runs. Slower tests will run fewer times, while faster tests will run more times.
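As a rough illustration of that calculation, the sketch below divides a target shard duration by the test's historical duration and clamps the result. The exact formula the infrastructure uses is not given here, so the function name, the rounding, and the minimum of one run are all assumptions; only the 2000-run cap comes from the text above.

```python
MAX_RUNS = 2000  # Upper bound stated in the text above.


def default_run_count(test_duration_s: float, target_shard_duration_s: float) -> int:
    """Estimate a run count that fills roughly one shard (hypothetical helper)."""
    if test_duration_s <= 0:
        raise ValueError("test duration must be positive")
    # How many back-to-back runs fit in the target shard duration.
    runs = int(target_shard_duration_s // test_duration_s)
    # Assumption: always run at least once; never exceed the cap.
    return max(1, min(MAX_RUNS, runs))


# With a hypothetical 600-second target shard duration:
print(default_run_count(30.0, 600.0))   # slow test: 20 runs
print(default_run_count(0.1, 600.0))    # fast test: capped at 2000 runs
```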
It's sometimes desirable to override the default number of runs (for example, because the default is too high and causes timeouts). In this case you can explicitly specify a number of runs. For example:
Multiply: foo_tests: 100
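To make the footer syntax concrete, here is a minimal sketch of how such a line could be parsed into selectors and optional run counts. It is not the infrastructure's parser: `Multiplier` and `parse_multiply_footer` are hypothetical names, and applying a run count per entry within a comma-separated line is an assumption. Splitting on commas also means a regex selector that itself contains a comma would be mishandled by this sketch.

```python
from typing import List, NamedTuple, Optional


class Multiplier(NamedTuple):
    selector: str
    run_count: Optional[int]  # None means "use the default run count".


def parse_multiply_footer(line: str) -> List[Multiplier]:
    """Parse one 'Multiply: ...' footer line (hypothetical helper)."""
    key, sep, value = line.partition(":")
    if not sep or key.strip() not in ("Multiply", "MULTIPLY"):
        return []
    multipliers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # Only a trailing ': <digits>' is a run count, so the colon
        # inside a package URL is left alone.
        head, colon, tail = item.rpartition(":")
        if colon and tail.strip().isdigit():
            multipliers.append(Multiplier(head.strip(), int(tail)))
        else:
            multipliers.append(Multiplier(item, None))
    return multipliers


print(parse_multiply_footer("Multiply: foo_tests, bar_tests: 100"))
print(parse_multiply_footer(
    "Multiply: fuchsia-pkg://fuchsia.com/foo_tests#meta/foo_tests.cm"))
```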
Limitations
Validation
If there is a typo in your Multiply clause, or if your Multiply selector doesn't match any tests on any builders, the infrastructure silently multiplies nothing and reports no error.
Therefore, it's important to manually verify that the Multiply took effect.
For every builder on which your Multiply takes effect, a comment of the following form will be added to your change in Gerrit:
A builder created multiplier shards. Click the following link for more details:
The comment will include a link to the build that runs the multiplied tests.
If no such comment appears, there is probably a syntax error in the footer, or the test does not run in any of the regular CQ builders. In the latter case, either add the test to the build graph so that one of the builders runs it, or, if the test runs in an optional builder, manually choose the tryjob that runs it.
If the linked build is completed, you should see a step like multiplied:<shard name>-<test name> under one of the passes, flakes, or failures steps. If the build is not yet completed, you can click on the link under the build step named <builder name>-subbuild, which will take you to the subbuild build page where you should see a similar multiplied step. Since the comment doesn't specify which tests were multiplied, you can look at the build pages to confirm (in case you multiplied more than one test).
No more than five matching tests
A single multiplier is not allowed to match more than five tests, to prevent change authors from accidentally multiplying a huge number of tests and overwhelming the testing infrastructure.
If you get a tryjob failure as a result of a Multiply statement that matches too many tests, simply edit your commit message locally or in the Gerrit UI to make your test selector more specific. Then retry CQ.
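The five-test limit amounts to a guard like the one sketched below. The names and the error message are hypothetical, and how the infrastructure actually reports the failure is not shown here; only the limit of five comes from the text above.

```python
from typing import Sequence

MAX_MATCHES_PER_MULTIPLIER = 5  # Limit stated in the text above.


def check_match_limit(selector: str, matched_tests: Sequence[str]) -> None:
    """Reject a selector that matches too many tests (hypothetical helper)."""
    if len(matched_tests) > MAX_MATCHES_PER_MULTIPLIER:
        raise ValueError(
            f"selector {selector!r} matches {len(matched_tests)} tests; "
            f"at most {MAX_MATCHES_PER_MULTIPLIER} are allowed"
        )


# A broad selector such as "tests" could match many suites and be rejected,
# while a full package URL or host path usually matches exactly one.
check_match_limit("foo_tests", ["fuchsia-pkg://fuchsia.com/foo_tests#meta/foo_tests.cm"])
```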
Changing Multiply after a CQ dry run passes
If all tryjobs have already passed a CQ dry run and you add or edit a Multiply clause without making any code changes, subsequent CQ+1 or CQ+2 attempts within 24 hours of the dry run will not re-run the builders and the updated Multiply clause will not be respected.
This is because the CQ service treats commit message updates as "trivial" and does not invalidate past CQ attempts on the patchset.
To work around this, you can either:
- Manually retry a subset of tryjobs using the Choose Tryjobs menu and wait for them to pass before submitting.
- Or retry all tryjobs by making a non-functional code change (e.g. adding a comment to some code) and uploading a new patchset to invalidate the old tryjob results. Then retry CQ with the Multiply footer present.
Timeouts
The default run count for a multiplied test is based on the historical duration of the test. If your change increases the duration of a multiplied test, the default run count may be too high and cause the task running the test to time out and not report any results.
In this case, you should override the default run count by manually specifying a lower run count, e.g.:
Multiply: foo_tests: 30
No test case multipliers
Multiply only supports multiplication of top-level suites (Fuchsia test packages and host test executables). All test cases within a multiplied test suite will be multiplied.
There is no way to multiply a single test case within a test suite.