RFC-0076: FIDL API Summaries

Status	Accepted
Areas	FIDL
Description	Presents a format for a human-readable FIDL API surface.
Issues	68955
Gerrit change	480497
Authors	fmil@google.com
Reviewers	pascallouis@google.com mkember@google.com jeremymanson@google.com gridman@google.com
Date submitted (year-month-day)	2021-03-16
Date reviewed (year-month-day)	2021-03-16

Summary

Presents a summarization approach for describing the API surface of a FIDL library, with a human-readable format as the first output, and proposes to use this summarization to identify API changes to FIDL libraries in the Fuchsia Source Tree.

Amendment (Aug 2022). This RFC describes a human-readable text format with each API element on a single line. During implementation, a JSON format containing the same information was added (https://fxrev.dev/480357). Unlike the text format, the JSON format can be parsed back into Go data structures, which is particularly useful for fidl_api_diff. Since only the JSON format is used today, we have removed the text format to ease maintenance.

Motivation

At the time of this writing, multiple efforts have been started on the Fuchsia project with a common goal of tracking changes to the platform's API surface. Once complete, the collective result will enable us to use versioning to decouple the platform development from the library versions used by the SDK consumers.

Specifically in the domain of FIDL, there is a need for a human-readable representation of the API surface of a FIDL library. This representation, from here on called a "summary", can be used in multiple ways:

As a human-friendly inventory of the API offered by a FIDL library.

Similar inventories are kept by other software projects that produce API surfaces, such as Go. This allows attributing a version to a specific API summary in contexts where such versioning matters.
As a basis for detecting backwards-incompatible changes in FIDL APIs.

API summarization can be used to compute a difference between two API surfaces, yielding an automated way of checking whether one API surface can be evolved into another. This is more precise than the currently used method, which generates a stable ("normalized") form of the library sources by concatenating all source files in a predictable order and removing comments and irrelevant spacing.
As a building block in other efforts such as the Compatibility Testing Suite (CTS, see RFC-0015) used to detect the tests that need to run after an API change.

CTS in particular needs to trim down the battery of tests that are run on a platform change. Knowing what changed in an API surface may allow the software to run only the tests that are affected by the change, saving execution time and computing resources.

Introductory example

Consider the following FIDL library definition, taken from fuchsia.accessibility.gesture. The comments have been pared down, but the library is otherwise complete.

library fuchsia.accessibility.gesture;

/// Maximum size of a returned utterance.
const uint64 MAX_UTTERANCE_SIZE = 16384;

/// Gesture types that accessibility offers to a UI component for listening.
enum Type {
    THREE_FINGER_SWIPE_UP = 1;
    THREE_FINGER_SWIPE_DOWN = 2;
    THREE_FINGER_SWIPE_RIGHT = 3;
    THREE_FINGER_SWIPE_LEFT = 4;
};

/// An interface to listen for accessibility gestures.
protocol Listener {
    /// When accessibility services detect a gesture, the listener is informed
    /// of which gesture was performed.
    OnGesture(Type gesture_type) -> (bool handled, string:MAX_UTTERANCE_SIZE? utterance);
};

/// An interface for registering a listener of accessibility gestures.
[Discoverable]
protocol ListenerRegistry {
    /// A UI registers itself to start listening for accessibility gestures
    /// through `listener`.
    Register(Listener listener) -> ();
};

An API summary of the above library looks like this:

protocol/member fuchsia.accessibility.gesture/Listener.OnGesture(fuchsia.accessibility.gesture/Type gesture_type) -> (bool handled,string:16384? utterance)
protocol fuchsia.accessibility.gesture/Listener
protocol/member fuchsia.accessibility.gesture/ListenerRegistry.Register(fuchsia.accessibility.gesture/Listener listener) -> ()
protocol fuchsia.accessibility.gesture/ListenerRegistry
const fuchsia.accessibility.gesture/MAX_UTTERANCE_SIZE uint64 16384
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_DOWN 2
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_LEFT 4
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_RIGHT 3
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_UP 1
strict enum fuchsia.accessibility.gesture/Type uint32
library fuchsia.accessibility.gesture

A few points are of note:

Each API element is a single line of text.
Each API element is referred to by its fully qualified name.
The order in which the API elements appear in the summary is fixed. If the order of declarations in the FIDL file were to be changed this would have no effect on the shape of the API summary.
It is easy to use text tools like grep to extract parts of the summary. For example, assuming that the API summary is in the file named fidl.api_summary, the following command line extracts only the API surface of the ListenerRegistry protocol:
```
cat fidl.api_summary | grep "fuchsia.accessibility.gesture/ListenerRegistry"
```
It is similarly easy to extract methods only:
```
cat fidl.api_summary \
  | grep "fuchsia.accessibility.gesture/ListenerRegistry" \
  | grep "protocol/member"
```
A rudimentary API surface diff can be generated by:
```
diff -u fidl.old.api_summary fidl.new.api_summary
```
(assuming that fidl.{old,new}.api_summary contain the original and modified API surfaces respectively)
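
Because each API element occupies exactly one line, two summaries can also be compared as sets of lines. The following is a minimal sketch in Python, not part of this RFC or of any Fuchsia tool; the function name `summary_diff` and the sample inputs are illustrative assumptions.

```python
def summary_diff(old_text: str, new_text: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) API elements between two summaries.

    Each non-empty line is treated as one API element.
    """
    old = {line for line in old_text.splitlines() if line.strip()}
    new = {line for line in new_text.splitlines() if line.strip()}
    return sorted(old - new), sorted(new - old)


# Hypothetical "before" and "after" summaries of the same library.
old_summary = """\
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_UP 1
strict enum fuchsia.accessibility.gesture/Type uint32
library fuchsia.accessibility.gesture
"""
new_summary = """\
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_DOWN 2
enum/member fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_UP 1
strict enum fuchsia.accessibility.gesture/Type uint32
library fuchsia.accessibility.gesture
"""

removed, added = summary_diff(old_summary, new_summary)
for line in removed:
    print("-", line)
for line in added:
    print("+", line)
```

In this sketch a modified element (for example, a constant whose value changed) appears as one removal plus one addition. Deciding whether a given change is backwards-compatible requires rules that go beyond this illustration.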

Requirements

The API summary SHOULD be human-readable and amenable to processing with simple tools, like grep and diff.
The API summary produced MUST list all and only the elements of the FIDL library which have API surface impact.

Design

The API summary format contains the information about the library which has API impact. This information is defined in the section Definitions: Source Compatibility and Transitionability of RFC-0024, and captured in the FIDL bindings spec. It is really just a subset of the information that is already present in the FIDL IR, but presented in a manner that is easier for humans to read and text utilities to process. Refer to the summary of rules for a comprehensive list.

Every FIDL language construct is covered by the summarization rules.

Each FIDL declaration is named using fully qualified names. So, for example, in the shortened snippet taken from the example above:

library fuchsia.accessibility.gesture;
enum Type { THREE_FINGER_SWIPE_UP = 1; };
protocol Listener {
  OnGesture(Type gesture_type);
};

the identifier OnGesture is always referred to as fuchsia.accessibility.gesture/Listener.OnGesture.
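
To illustrate how such a name decomposes, here is a small Python sketch that splits a fully qualified name into library, declaration, and member. The `library/Declaration.member` shape is inferred from the examples in this document; the names `QualifiedName` and `split_fqn` are hypothetical and do not come from the FIDL toolchain.

```python
from typing import NamedTuple, Optional


class QualifiedName(NamedTuple):
    library: str
    declaration: Optional[str]
    member: Optional[str]


def split_fqn(fqn: str) -> QualifiedName:
    """Split 'library/Declaration.member' into its parts."""
    library, slash, rest = fqn.partition("/")
    if not slash:
        # A bare library name, e.g. 'fuchsia.accessibility.gesture'.
        return QualifiedName(library, None, None)
    # Library names contain dots, so only the part after '/' is split on '.'.
    declaration, dot, member = rest.partition(".")
    return QualifiedName(library, declaration, member if dot else None)


print(split_fqn("fuchsia.accessibility.gesture/Listener.OnGesture"))
```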

The file format is deliberately kept flat for ease of reading and processing. This means that FIDL members (appearing in scopes such as struct or protocol) are listed on separate lines of text. This gives us some leeway to extend the format in the future if needed. For example, it would become possible to include future versioning attributes once they are available.

A single API summary file lists all declarations that appear in the entire FIDL library, regardless of how many files the declarations are specified in.

The order in which the declarations appear in the API summary is declaration-order-independent and stable. Related declarations are deliberately kept close for easier post-processing, though this is not a correctness requirement: any declaration-order-independent and stable ordering would have been sufficient.

API Summary ordering

The ordering of declarations is derived from the FIDL AST: declarations are written out consistent with a post-order traversal of the AST, picking identifiers with alphanumerically smaller fully qualified names first when selecting among siblings.

We call this ordering the API Summary ordering.

This approach suggests the following ordering of the declarations in the API summary file:

fuchsia.accessibility.gesture/Listener.OnGesture
fuchsia.accessibility.gesture/Listener
fuchsia.accessibility.gesture/ListenerRegistry.Register
fuchsia.accessibility.gesture/ListenerRegistry
fuchsia.accessibility.gesture/MAX_UTTERANCE_SIZE
fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_DOWN
fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_LEFT
fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_RIGHT
fuchsia.accessibility.gesture/Type.THREE_FINGER_SWIPE_UP
fuchsia.accessibility.gesture/Type
fuchsia.accessibility.gesture

for the example FIDL library given above.

The ordering of declarations in the API summary would have been the same regardless of how the declarations were actually ordered in the .fidl files, including if they were split across multiple files.
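
The ordering rule can be sketched in a few lines of Python. This is an illustration only, under two assumptions: the library is modeled as a two-level tree (declarations with optional members), and "alphanumerically smaller" is approximated by Python's default string comparison. The function name `api_summary_order` is hypothetical.

```python
def api_summary_order(library: str, declarations: dict[str, list[str]]) -> list[str]:
    """List fully qualified names in post-order.

    Members come before their declaration, and the library comes last.
    Siblings are visited in sorted order of their names.
    """
    ordered = []
    for decl in sorted(declarations):
        decl_fqn = f"{library}/{decl}"
        for member in sorted(declarations[decl]):
            ordered.append(f"{decl_fqn}.{member}")
        ordered.append(decl_fqn)
    ordered.append(library)
    return ordered


# Declarations are deliberately given in a scrambled order.
decls = {
    "Type": [
        "THREE_FINGER_SWIPE_UP",
        "THREE_FINGER_SWIPE_DOWN",
        "THREE_FINGER_SWIPE_RIGHT",
        "THREE_FINGER_SWIPE_LEFT",
    ],
    "ListenerRegistry": ["Register"],
    "MAX_UTTERANCE_SIZE": [],
    "Listener": ["OnGesture"],
}
for name in api_summary_order("fuchsia.accessibility.gesture", decls):
    print(name)
```

Since the input is sorted before traversal, reordering the entries of `decls` does not change the output, which mirrors the declaration-order independence described above.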

API summary declaration schema

A simplified BNF of the API summary file is specified below for reference.

summary          ::= declaration_list

declaration_list ::= declaration
                   | declaration "\n" declaration_list
declaration      ::= library
                   | const
                   | bits
                   | bits_member
                   | enum
                   | enum_member
                   | struct
                   | struct_member
                   | union
                   | union_member
                   | protocol
                   | protocol_member
                   | alias

alias           ::= "alias" fqn
bits            ::= strictness "bits" fqn fp
bits_member     ::= "bits/member" fqn
const           ::= "const" fqn ft fv
enum            ::= strictness "enum" fqn fp
enum_member     ::= "enum/member" fqn fv
library         ::= "library" fqn
protocol        ::= "protocol" fqn
protocol_member ::= "protocol/member" fqn d
struct          ::= resourceness "struct" fqn
struct_member   ::= "struct/member" fqn ft [ fv ]
union           ::= strictness "union" fqn
union_member    ::= "union/member" fqn

resourceness    ::= "" | "resource"
strictness      ::= "flexible" | "strict"

d   ::= <FIDL protocol member type signature>
fp  ::= <FIDL primitive type>
fqn ::= <FIDL identifier>
ft  ::= <FIDL type>
fv  ::= <FIDL value>
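
As a rough illustration of how a tool might consume this grammar, the Python sketch below splits one summary line into an optional modifier, a declaration kind, a fully qualified name, and the remaining text. It is not the parser used by any Fuchsia tool; `parse_declaration` and the returned dictionary keys are hypothetical, and the remainder is left unparsed.

```python
import re

# Declaration kinds and modifiers taken from the grammar above.
KINDS = {
    "alias", "bits", "bits/member", "const", "enum", "enum/member",
    "library", "protocol", "protocol/member", "struct", "struct/member",
    "union", "union/member",
}
MODIFIERS = {"flexible", "strict", "resource"}


def parse_declaration(line: str) -> dict:
    """Split one summary line into modifier, kind, name, and the remainder."""
    modifier = None
    head, _, rest = line.partition(" ")
    if head in MODIFIERS:
        modifier = head
        head, _, rest = rest.partition(" ")
    if head not in KINDS:
        raise ValueError(f"unknown declaration kind: {head!r}")
    # The name ends at the first space, or at the '(' that opens a
    # protocol member signature (no space separates them in the examples).
    match = re.match(r"[^\s(]+", rest)
    name = match.group(0) if match else ""
    return {
        "modifier": modifier,
        "kind": head,
        "name": name,
        "rest": rest[len(name):].strip(),
    }


print(parse_declaration("strict enum fuchsia.accessibility.gesture/Type uint32"))
```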

Implementation

The API summarization is implemented by the program fidl_api_summarize. The program takes as input a FIDL IR and outputs the FIDL API summary, with both file names specified as flags. Invokers SHOULD use the extension .api_summary for the output of this program as a matter of convention, although this is by no means a hard requirement.

Performance

fidl_api_summarize is a simple transformation of the FIDL IR file. Spot-checking shows that the program completes its run in ~0.1s when run on a reasonably large library. This means that the program is likely fast enough to run on every FIDL library as part of the regular build process.

Security considerations

The current implementation of fidl_api_summarize makes no attempt to validate the FIDL IR, and assumes that its input is always valid output of fidlc. This may make the program susceptible to being confused by malformed input, although it is hard to tell whether that could be used as an attack vector against the Fuchsia build process.

Privacy considerations

The information that fidl_api_summarize processes has so far been part of the code repositories that are publicly viewable. It is reasonable to assume that whatever privacy rules apply to its input would also apply to its output.

This means that, if it is ever used to summarize non-public FIDL library code, its output should be held to the same privacy standard as the library code it was used on.

Testing

The program is tested using an extensive library of sample inputs which are processed and compared to local outputs. This ensures consistent results over the lifetime of the Fuchsia code base.

Documentation

The use of fidl_api_summarize should be documented with the FIDL help pages on https://fuchsia.dev.

Prior art and references

The Go language project regularly produces API surface summaries.