RFC-0113: Efficient envelopes

Status: Accepted
Areas
  • FIDL
Description

This RFC proposes a more compact encoding for envelopes.

Gerrit change
Authors
Reviewers
Date submitted (year-month-day): 2021-06-21
Date reviewed (year-month-day): 2021-07-21

"Turning Envelopes into Postcards"

Summary

This RFC proposes a more compact encoding for FIDL envelopes [1].

Motivation

Envelopes are the foundation for extensible, evolvable data structures (tables and extensible unions). A more compact and efficient wire format for envelopes enables those extensible structures to be used in more contexts where performance and wire size matter.

Design

The proposed envelope format can be described as the following C-struct:

struct Envelope {
    uint32_t byte_size;
    uint32_t handle_count;
};

Compared with the existing envelope format:

  • The byte size field remains the same (32 bits).
    • The size includes the size of any sub-objects that may be recursively encoded.
    • For example, the size of a vector<string> includes the size of the outer vector's inner string sub-objects.
    • This matches the existing behavior for the current envelope implementation's size field.
  • The handle count field remains the same (32 bits).
    • The handle_count includes the handle count for all recursive sub-objects.
  • The presence/absence field is dropped.
    • Presence is represented by a non-zero value in either the size or handle_count field.
    • Absence is represented by the size & handle count fields both being zero.
  • Validation of byte size field
    • The byte size field MUST be validated to be a multiple of 8.
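
The presence and validation rules above can be sketched in C. This is a minimal illustration, not the bindings' implementation: the names envelope_t, envelope_is_present, and envelope_is_valid are hypothetical and do not appear in this RFC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Wire layout proposed in this RFC (the type name is hypothetical).
typedef struct {
  uint32_t byte_size;
  uint32_t handle_count;
} envelope_t;

// An envelope is absent only when both fields are zero; a non-zero
// value in either field means it is present.
static bool envelope_is_present(const envelope_t* env) {
  assert(env != NULL);
  return env->byte_size != 0 || env->handle_count != 0;
}

// The byte size must be a multiple of 8. An absent envelope
// (byte_size == 0) passes this check trivially.
static bool envelope_is_valid(const envelope_t* env) {
  assert(env != NULL);
  return (env->byte_size % 8u) == 0;
}
```

Note that an envelope carrying only handles (byte_size of zero, handle_count non-zero) counts as present under this rule.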

Decoders MAY overwrite the envelope with a pointer to the envelope data, assuming they know the static type (schema) of the envelope's contents. See the Unknown Data section for recommendations on how to process an envelope if the content's type is unknown.

C/C++ Struct for Encoded/Decoded Form

The encoded or decoded form of an envelope can be described as a C-union:

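#include <assert.h>  // static_assert (C11)
#include <stdint.h>  // uint32_t

typedef union {
  // Wire (encoded) form: size and handle count of the payload.
  struct {
    uint32_t byte_size;
    uint32_t handle_count;
  } encoded;
  // Decoded form: pointer to the envelope's contents.
  void* data;
} fidl_envelope_t;

// Holds on targets with 64-bit pointers.
static_assert(sizeof(fidl_envelope_t) == sizeof(void*),
              "envelope must be pointer-sized");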
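
As a hedged sketch of how a decoder might use this union, the following C code copies the header fields aside before overwriting the envelope with a pointer. The saved_header_t type and the decode_in_place function are hypothetical; this RFC does not prescribe any such mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union {
  struct {
    uint32_t byte_size;
    uint32_t handle_count;
  } encoded;
  void* data;
} fidl_envelope_t;

// Hypothetical side record for the header fields that are lost once
// the envelope is overwritten with a pointer.
typedef struct {
  uint32_t byte_size;
  uint32_t handle_count;
} saved_header_t;

// Reads the encoded header, then overwrites the envelope in place with
// a pointer to its payload. Returns the header values read beforehand.
static saved_header_t decode_in_place(fidl_envelope_t* env, void* payload) {
  assert(env != NULL);
  saved_header_t saved = {env->encoded.byte_size, env->encoded.handle_count};
  env->data = payload;  // the encoded fields are no longer readable
  return saved;
}
```

The header must be read before the pointer is stored, because both views share the same bytes.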

Unknown data

Receivers — validators & decoders — may not know the type of an envelope when they're used in an evolvable data structure. If a receiver doesn't know the type, an envelope can be minimally parsed and skipped.

  • The envelope's size determines the amount of out-of-line data to skip.
  • If the envelope's handle count is non-zero, a validator MUST either store or close each of the handles.
  • A decoder MAY overwrite the unknown envelope with a pointer to the envelope's contents, if it wishes to decode in-place.
    • If a decoder does overwrite the envelope with a pointer, it will lose the size & handle count information in the envelope. Bindings MAY offer a mechanism for a decoder to save the size & handle count information before overwriting the envelope; this RFC does not express an opinion on how such a mechanism could work.
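
The skipping procedure above can be sketched as follows. This is an illustration under stated assumptions: cursor_t, close_handle_stub, and skip_unknown_envelope are hypothetical names, the handle close is a counting stand-in for a real platform call, and the envelope's handles are assumed to be the next handle_count entries in the message's handle table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint32_t byte_size;
  uint32_t handle_count;
} envelope_t;

// Hypothetical cursor over a message's out-of-line bytes and handles.
typedef struct {
  size_t byte_offset;     // next unread out-of-line byte
  size_t byte_len;        // total bytes in the message
  size_t handle_index;    // next unconsumed handle
  size_t handle_len;      // total handles in the message
  size_t handles_closed;  // how many handles the stub has "closed"
} cursor_t;

// Stand-in for the platform's handle-close call; it only counts.
static void close_handle_stub(cursor_t* cur) { cur->handles_closed++; }

// Skips an envelope of unknown type: validates the header, advances
// past its out-of-line bytes, and closes each of its handles.
// Returns false, leaving the cursor untouched, if the header is invalid.
static bool skip_unknown_envelope(const envelope_t* env, cursor_t* cur) {
  assert(env != NULL && cur != NULL);
  if (env->byte_size % 8u != 0) return false;
  if (env->byte_size > cur->byte_len - cur->byte_offset) return false;
  if (env->handle_count > cur->handle_len - cur->handle_index) return false;
  cur->byte_offset += env->byte_size;
  for (uint32_t i = 0; i < env->handle_count; i++) {
    close_handle_stub(cur);
    cur->handle_index++;
  }
  return true;
}
```

A receiver that prefers to store unknown handles rather than close them could replace the stub with code that copies them into its own storage.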

Implementation strategy

This RFC is a breaking wire format change.

A complex wire format migration will be undertaken to switch to efficient envelopes. This wire format change will be combined with other migrations to reduce the per-feature migration cost.

Backwards compatibility

The proposed wire format change is API (source) compatible. Any hand-rolled FIDL code would need to be updated to handle the new wire format.

The wire format change is ABI-incompatible.

Performance

A performance evaluation was run in a CL that prototypes an efficient envelope implementation. For this test, the input was a table with all fields set. Other inputs produced similar results.

The following times are in nanoseconds. The time without efficient envelopes is before the arrow and the time with efficient envelopes is after the arrow.

# Fields    Encode (ns)      Decode (ns)
16          64 -> 40         176 -> 146
64          165 -> 121       321 -> 221
256         567 -> 368       923 -> 527
1024        2139 -> 1429     3284 -> 1636

Depending on the input, using efficient envelopes appears to be 1.1-2x faster.
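
For reference, the speedup for each measurement is the time before divided by the time after. The following C sketch computes the smallest and largest ratios over the rows listed above; the names row_t, kRows, speedup, min_speedup, and max_speedup are hypothetical, and the data covers only this one input.

```c
#include <stddef.h>

// One row of the measurements above (times in nanoseconds).
typedef struct {
  int fields;
  double enc_before, enc_after;
  double dec_before, dec_after;
} row_t;

static const row_t kRows[] = {
    {16, 64, 40, 176, 146},
    {64, 165, 121, 321, 221},
    {256, 567, 368, 923, 527},
    {1024, 2139, 1429, 3284, 1636},
};
static const size_t kNumRows = sizeof(kRows) / sizeof(kRows[0]);

// Speedup factor: how many times faster the new format is.
static double speedup(double before, double after) { return before / after; }

// Smallest speedup across all encode and decode measurements.
static double min_speedup(void) {
  double best = speedup(kRows[0].enc_before, kRows[0].enc_after);
  for (size_t i = 0; i < kNumRows; i++) {
    double e = speedup(kRows[i].enc_before, kRows[i].enc_after);
    double d = speedup(kRows[i].dec_before, kRows[i].dec_after);
    if (e < best) best = e;
    if (d < best) best = d;
  }
  return best;
}

// Largest speedup across all encode and decode measurements.
static double max_speedup(void) {
  double best = speedup(kRows[0].enc_before, kRows[0].enc_after);
  for (size_t i = 0; i < kNumRows; i++) {
    double e = speedup(kRows[i].enc_before, kRows[i].enc_after);
    double d = speedup(kRows[i].dec_before, kRows[i].dec_after);
    if (e > best) best = e;
    if (d > best) best = d;
  }
  return best;
}
```

For these rows the ratios run from about 1.2x (decode, 16 fields) to about 2.0x (decode, 1024 fields).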

Ergonomics

  • More efficient extensible data structures enable them to be used in more contexts where efficiency matters, so users need to worry less about their performance, and can gain the benefits of extensibility where they would previously need to use non-extensible structures.
  • We may even wish to recommend that tables should be used by default for FIDL data structures, and structs should be reserved for high-performance contexts.
    • Extensible unions (RFC-0061) are already attempting to remove static unions.

Documentation

  • The wire format documentation needs to be updated.
  • When updating the documentation, envelopes should be explained as a first-class concept: this enables better cognitive chunking once readers encounter the wire format for optionality and extensible data structures.
  • We should update the FIDL style guide to make recommendations for when extensible types should be used.

Security

There should not be any security implications from this RFC.

One minor security advantage is that this RFC removes information that is otherwise duplicated in the size and pointer in the old format. Previously, an envelope could be received with non-zero size/handles and FIDL_ALLOC_ABSENT, or zero size/handles and FIDL_ALLOC_PRESENT. This required extra validation checks, which will no longer be needed.
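
To illustrate the kind of cross-check that becomes unnecessary, here is a hedged C sketch of validating the old format. It assumes the old envelope carried a 64-bit presence word next to the two counts; old_envelope_t, ALLOC_PRESENT, ALLOC_ABSENT, and old_envelope_is_consistent are stand-in names for illustration, not the actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Assumed stand-ins for the presence markers named in the text.
#define ALLOC_PRESENT UINT64_MAX
#define ALLOC_ABSENT ((uint64_t)0)

// Assumed old layout: two counts plus a separate presence word.
typedef struct {
  uint32_t num_bytes;
  uint32_t num_handles;
  uint64_t presence;
} old_envelope_t;

// The presence word had to agree with the counts: "present" required
// non-zero size or handles, and "absent" required both to be zero.
static bool old_envelope_is_consistent(const old_envelope_t* env) {
  assert(env != NULL);
  bool has_content = env->num_bytes != 0 || env->num_handles != 0;
  if (env->presence == ALLOC_PRESENT) return has_content;
  if (env->presence == ALLOC_ABSENT) return !has_content;
  return false;  // any other presence value is malformed
}
```

With the proposed format there is no separate presence word, so the two contradictory states cannot be expressed on the wire.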

It is not possible to determine from the data alone whether an envelope is in wire form or decoded form. This is not a problem because, in practice, bindings always keep separate bookkeeping that tracks whether the message is in wire form or decoded form.

Privacy

There should not be any privacy implications from this RFC.

Testing

  • Since this RFC is changing the wire format for envelopes, we feel that the existing FIDL test suite — particularly compatibility tests — will adequately test all scenarios where envelopes are used.
  • If we agree to land the wire format change as a soft transition (see the Implementation Strategy section), we will add tests for peers to negotiate and possibly switch to the new wire format.

Drawbacks, alternatives, and unknowns

We can keep the existing wire format if we believe the efficiency gains in this proposal are not worth the implementation cost.

Previous RFC rejection and argument for approving now

This RFC was previously rejected with the following rationale (copied here verbatim) before being resubmitted for review:

In February 21, 2019, this RFC was initially accepted. The FIDL team worked to stabilize the wire format for most of 2019, culminating in an all-hands-on-deck effort which spanned Q3 and Q4. The migration completed on Dec 1st, 2019.

The stabilization effort spanned multiple changes:

However, as the work unfolded, and the Dec 1st deadline loomed, the FIDL team decided to punt on implementing the efficient envelopes change, preferring to push this work to 2020. Unlike the other changes which were part of the stabilization effort, efficient envelopes was simply an in-memory size saving, which was very small, especially when compared to other aspects of the FIDL wire format (e.g. tables' dense format). Deferring was a project risk reduction calculation, by reducing the scope, the odds of completing all the work on time were improved. So was the FIDL team's work schedule.

We're now close to 18 months after the deferral, and efficient envelopes are long forgotten. Significant performance work in 2020 demonstrated that this change would have no material impact.

It's time to face the truth, this ain't going to happen. Rejected.

Why re-approve now?

The FIDL team is currently planning to batch together several wire format changes and undergo a migration with all of them at once. This means there is opportunity to add support for efficient envelopes in a lower cost way (in that the cost is shared with other migrations).

Additionally, there are now concrete numbers for the performance gains due to efficient envelopes and the gains are significant.

Because of these factors, this is an opportune time to resurrect this RFC and implement it.

Prior art and references

This RFC is a slimmed-down version of RFC-0026, which was rejected because there wasn't enough consensus around the whole RFC.


  1. This RFC is based on RFC-0026, but with only the out-of-line envelope proposal. Inlining, envelopes everywhere, and moving the string/vector count out-of-line have all been removed.