| RFC-0138: Handling unknown interactions | |
|---|---|
| Status | Accepted | 
| Areas | |
| Description | We expand the FIDL semantics to allow peers to handle unknown interactions. | 
| Gerrit change | |
| Authors | |
| Reviewers | |
| Date submitted (year-month-day) | 2021-05-25 | 
| Date reviewed (year-month-day) | 2021-10-27 | 
Summary
We expand the FIDL semantics to allow peers to handle unknown interactions, i.e. receiving an unknown event, or receiving an unknown method call. To that end:
- We introduce flexible interactions and strict interactions to the FIDL language. A flexible interaction, even when unknown, can be gracefully handled by a peer. A strict interaction, when unknown, leads to abrupt termination.
- We introduce three modes of operation for protocols. A closed protocol is one which never allows unknown interactions. Conversely, an open protocol is one which allows any kind of unknown interaction. Lastly, an ajar protocol is one which supports only one way unknown interactions. 
A big picture view of FIDL's support for evolution
Before diving into the specifics of this proposal, it is useful to understand how FIDL aims to answer evolutionary concerns.
The problem has two facets: source-compatibility (API), and binary-compatibility (ABI).
API compatibility aims to provide guarantees that user code written against
generated code before a change can still compile against generated code after a
change. As an example, one can reasonably expect that adding a new declaration
to a FIDL library (say defining a new type MyNewTable = table {};) will not
cause existing code using this library to fail to compile.
There is a three pronged approach to solving source-compatibility problems:
- Make as many changes source compatible as possible (e.g. RFC-0057: Default no handles);
- Provide clear guarantees (e.g. RFC-0024: Mandatory source compatibility);
- Provide versioning (e.g. RFC-0083: FIDL versioning).
Separately, ABI compatibility aims to provide interoperability of programs built against different versions of a library. As an example, two programs can have a different understanding of a table's schema and yet be able to successfully communicate.
Achieving ABI compatibility can be broken down into three parts:
- At rest compatibility is concerned with achieving interoperability at a data level, i.e. when can two peers with different schema of the same table interoperate?
- Dynamic compatibility assumes that all data types are compatible, and focuses on achieving interoperability when peers have different versions of a protocol (e.g. different methods);
- Lastly, there are some cases where having divergent protocols is not an option, and where the solution is instead to learn about the capabilities of each peer (negotiation), and then adapt the communication (which protocol is spoken) based on that.
Dynamic compatibility is particularly appropriate when "local flexibility" is sought, such as small additions to an otherwise mostly unchanged model of operation. In other cases, say fuchsia.io1 relative to fuchsia.io2, a domain model shift is required. There "global flexibility" is needed, and solutions sought fall in the protocol negotiation category.
The mechanism we specifically discuss in this RFC (strict and flexible interactions) improves the status quo of dynamic compatibility, the second of the three parts listed above.
Terminology
A reminder about the compositional model of protocols.
Communication between two peers is an interaction. An interaction starts with a request, and may optionally require a response.
Both requests and responses are transactional messages, which are represented as a header ("the transactional header"), optionally followed by a payload.[^1]
An interaction is directed, and we name the two peers client and server respectively. A client to server interaction starts by a request from the client to the server, with the response if there is one in the reverse direction. Similarly, we speak about a server to client interaction.
We often use the term fire and forget or one way for responseless interactions initiated by the client, and the term call or two way for interactions requiring responses (always client initiated in the current model). When the server is the initiating peer of a responseless interaction, it is often called an event.[^2]
A protocol is a set of interactions. We define a session as a particular instance of a communication between a client and a server using a protocol, i.e. a sequence of interactions between a client and a server.
An application error is one which follows the error syntax. A transport error is either an error occurring due to a kernel error (e.g. writing to a channel that was closed), or an error occurring in FIDL.
Motivation
A core principle of Fuchsia is to be updatable: packages are designed to be updated independently of each other. Even drivers are meant to be binary-stable, so that devices can update to a newer version of Fuchsia seamlessly while keeping their existing drivers. FIDL plays a central role in achieving this updatability, and is primarily designed to define the Application Binary Interface (ABI), thus providing a strong foundation for forward and backward compatibility.
Specifically, we want to allow two peers with a slightly different understanding of the communication protocol between them to safely interoperate. Better yet, we want the assurance of a strong static guarantee that two peers are 'compatible'.
A lot of work has gone into providing flexibility and guarantees for encoding
and decoding FIDL types, which we call at rest compatibility. We introduced
the table layout, the union
layout, chose explicit union
ordinals, introduced the strict and
flexible layout modifiers,
introduced protocol ordinal hashing,
reduced collision probability of protocol ordinal
hashing, and evolved the transactional
message header format to future proof it.
We now turn to dynamic flexibility and guarantees, which we call dynamic compatibility. Assuming two peers are at rest compatible, i.e. all the types they use to interact are at rest compatible, dynamic compatibility is the ability for these two peers to interoperate successfully, with neither peer aborting the communication due to an unexpected interaction.
Stakeholders
- Facilitator: jamesr@google.com.
- Reviewers:
  - abarth@google.com (FEC)
  - bprosnitz@google.com (FIDL)
  - ianloic@google.com (FIDL)
  - yifeit@google.com (FIDL)
- Consulted:
  - jamesr@google.com
  - jeremymanson@google.com
  - jsankey@google.com
  - tombergan@google.com
- Socialization: RFC draft was shared with the FIDL team, and discussed with various members of the Fuchsia team. It was shared broadly on the Eng Council Discuss mailing list (eng-council-discuss@fuchsia.dev).
Design
We introduce the concept of flexible interactions and strict interactions. Succinctly, even if unknown, a flexible interaction can be gracefully handled by a peer. Conversely, if unknown to the receiving peer, a strict interaction is one which causes that peer to abruptly terminate the session. We use the strictness of an interaction to refer to whether it is a flexible or strict interaction. See semantics of flexible and strict interactions.
Without guardrails, flexible interactions could be inadvertently used in ways that jeopardize privacy:
- Consider for instance a rendering engine which is designed to evolve. A new version adds a flexible SetAlphaBlending(...); one way interaction with the intent that newer clients targeting older renderers will simply have their setting ignored (but most of the rendering will still work). Now, if instead that new method was about a special PII rendering mode StartPIIRendering(); it would be crucial for an older renderer to stop processing, rather than ignore this, and hence the use of a strict interaction would be appropriate.
- Another example would be a malicious peer trying to reflectively discover the exposed surface by sending various messages to see which one(s) are understood. Typically, reflective functionality comes with extra performance cost, and opens the door to privacy issues (you may expose more than you realize). By principle, FIDL chooses to forbid reflection, or require an explicit opt-in.
As a result, we additionally introduce three modes in which protocols can operate:
- A closed protocol is one where no flexible interaction is allowed or expected; receipt of a flexible interaction is abnormal.
- An open protocol is one where any flexible interaction is allowed (be it one way or two way). Such protocols offer the most flexibility.
- An ajar protocol is one where flexible one way interactions are allowed (fire-and-forget calls and events), but flexible two way interactions are not allowed (cannot make a method call if the peer does not know about this method).
For further details, see semantics of protocols.
Semantics of strict and flexible interactions
The semantics of a strict interaction are quite simple: when receiving an unknown request, i.e. one whose ordinal is not known to the recipient, the peer abruptly terminates the session (by closing the channel).
The goal of flexible interaction is to allow recipients to gracefully handle unknown interactions. This has a few implications which guide the design.
The sender of a flexible interaction must know that its request may be ignored (because it is not understood) by the recipient.
The recipient must be able to tell that this request is flexible (as opposed to strict), and act accordingly.
Since a two way interaction requires the recipient to respond to the sender, it is imperative for the recipient of an unknown request to be able to construct a response absent any additional details. The recipient must convey to the sender that the request was not understood. To satisfy this requirement, the response of a flexible two way interaction is a result union (see details).
It follows from the semantics that in the case of a one way interaction, the sender cannot tell whether its request was known or unknown by the recipient. When using flexible one way interactions, FIDL authors should be careful about the semantics of their overall protocols.
It is worth noting that one-way interactions are somewhat "best effort", in
the sense that the sender cannot tell whether the peer received the interaction.
However, channels provide ordering guarantees such that the sequencing of
interactions is deterministic and known. Strict one-way interactions make it
possible to ensure that some interactions occur if and only if a preceding
interaction was understood. As an example, a logging protocol might have
StartPii() and StopPii() strict interactions to ensure that no peer ever
ignores these.
For further discussion of the tradeoffs to consider when choosing between a strict and flexible interaction, see also:
Semantics of open, closed, and ajar protocols
The semantics of a closed protocol are restrictive: only strict interactions
are allowed, and no flexible interactions. It is a compile-time error for a
closed protocol to have any flexible interactions.
The semantics of an ajar protocol allow strict interactions, and one way
flexible interactions. It is a compile-time error for an ajar protocol to
have any flexible two way interactions.
An open protocol has no restriction: both strict and flexible, one way and
two way interactions are allowed.
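The three sets of rules above amount to a small truth table. The following Python sketch is for illustration only; the names `Mode` and `interaction_allowed` are hypothetical and are not part of the FIDL toolchain.

```python
from enum import Enum


class Mode(Enum):
    """Protocol modes described in this RFC."""
    CLOSED = "closed"
    AJAR = "ajar"
    OPEN = "open"


def interaction_allowed(mode: Mode, flexible: bool, two_way: bool) -> bool:
    """Return True if a protocol in `mode` may declare such an interaction.

    Hypothetical helper mirroring the compile-time rules stated above.
    """
    if not flexible:
        # Strict interactions are allowed in every mode.
        return True
    if mode is Mode.CLOSED:
        # Closed protocols reject every flexible interaction.
        return False
    if mode is Mode.AJAR:
        # Ajar protocols accept only one way flexible interactions.
        return not two_way
    # Open protocols have no restriction.
    return True
```

For instance, `interaction_allowed(Mode.AJAR, flexible=True, two_way=True)` is false, which corresponds to the compile-time error for a flexible two way interaction in an ajar protocol.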
For further discussion of the tradeoffs to consider when choosing between a closed, ajar, or open protocol, see also:
Changes to the language
We introduce the modifiers strict and flexible to mark interactions as
strict or flexible:
protocol Example {
    strict Shutdown();
    flexible Update(value int32) -> () error UpdateError;
    flexible -> OnShutdown(...);
};
By default, interactions are flexible.
As a matter of style, it is recommended to always indicate the strictness of an interaction explicitly, i.e. it should be set for every interaction.[^3]
We introduce the modifiers closed, ajar, and open to mark protocols
as closed, ajar (partially open), or open:
closed protocol OnlyStrictInteractions { ...
ajar protocol StrictAndOneWayFlexibleInteractions { ...
open protocol AnyInteractions { ...
In a closed protocol, there can be no flexible interaction defined. A closed protocol may only compose other closed protocols.
In an ajar protocol, there can be no two way flexible interaction defined. An ajar protocol may only compose closed or ajar protocols.
(There are no restrictions on open protocols.)
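One way to read the composition rules is that a protocol may only compose protocols that are no more open than itself. A minimal Python sketch of that reading follows; `OPENNESS_RANK` and `may_compose` are hypothetical names chosen for this illustration.

```python
# Rank the modes from least to most open (illustrative ordering).
OPENNESS_RANK = {"closed": 0, "ajar": 1, "open": 2}


def may_compose(composer: str, composed: str) -> bool:
    """Return True if a `composer` protocol may compose a `composed` one.

    closed may compose only closed; ajar may compose closed or ajar;
    open may compose anything.
    """
    return OPENNESS_RANK[composed] <= OPENNESS_RANK[composer]
```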
By default, protocols are open.
A previous version of this proposal specified ajar as the default. However, this led to a conflict where the default value of the openness modifier, ajar, conflicted with the default value of the strictness modifier, flexible, in the case of a two-way method declared without explicit modifiers. This meant that a protocol containing a two way method could not be compiled without a modifier on at least either the protocol or the method. See below: the default value of openness is shown in bold and the default value of strictness is shown in italics.

To resolve this, we changed the default of openness from ajar to open, which allows protocols to compile two way methods without modifiers on either the protocol or the method.
As a matter of style, it is recommended to always indicate the mode of a protocol explicitly, i.e. it should be set for every protocol.[^default-debate]
Changes to the wire format: transactional message header flags
We modify the transactional message header to be:
- Transaction ID (uint32)
- At rest flags (array<uint8>:2, i.e. 2 bytes)
- Dynamic flags (uint8)
- Magic Number (uint8)
- Ordinal (uint64)
That is, the flags bytes are split into two portions: at rest flags (two bytes) and dynamic flags (one byte).
The dynamic flags byte is structured as follows:
- Bit 7, first MSB "strictness bit": strict method 0, flexible method 1.
- Bits 6 through 0, unused, set to 0.
Some further details about the use of "dynamic flags":
- We added flags in the third version of the transactional message header. These flags were intended to "be temporarily used for soft migrations". As an example, one bit was used during the strict to extensible union migration. However, there are no plans that would require using that many flags at once, and we can therefore change the intent of these flags from solely being used on a temporary basis to being used as part of the wire format.
- The strictness bit is required for the sender to indicate to the receiver a strict interaction in the case where the receiver is unaware of that interaction. The semantics expected in this case is for the communication to abruptly terminate. Without this strictness bit, such skew between the sender and receiver could go unnoticed. Consider for instance an ajar (or open) protocol with a newly added strict StopSomethingImportant(); one way interaction. Without a strictness bit, the receiver would have to guess whether the unknown interaction is strict or flexible, opting for flexible given the intended evolvability improvements sought in this RFC. As a result, FIDL authors would be forced to rely on two way strict interactions when expanding protocols.
See also placing strictness bit in transactional identifier for a discussion of an alternative representation, and interaction mode bit for an alternative representation future needs may call for.
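To make the header layout concrete, here is a minimal Python sketch that packs and unpacks the five fields in the order listed above. It assumes little-endian encoding with no padding between fields, and the magic number value used is an assumption for illustration. The helper names (`encode_header`, `decode_header`) are hypothetical.

```python
import struct

# txid (uint32), at rest flags (2 bytes), dynamic flags (uint8),
# magic number (uint8), ordinal (uint64): 16 bytes in total.
# Assumption: little-endian, no padding.
HEADER = struct.Struct("<I2sBBQ")

STRICTNESS_BIT = 0x80  # bit 7 (MSB) of the dynamic flags byte
ASSUMED_MAGIC = 0x01   # assumption for illustration only


def encode_header(txid, ordinal, flexible, at_rest_flags=b"\x00\x00"):
    """Build a transactional header; bits 6 through 0 stay at 0."""
    dynamic_flags = STRICTNESS_BIT if flexible else 0
    return HEADER.pack(txid, at_rest_flags, dynamic_flags, ASSUMED_MAGIC, ordinal)


def decode_header(data):
    """Split a header into its fields and expose the strictness bit."""
    txid, at_rest, dynamic, magic, ordinal = HEADER.unpack(data[:HEADER.size])
    return {
        "txid": txid,
        "at_rest_flags": at_rest,
        "flexible": bool(dynamic & STRICTNESS_BIT),
        "magic": magic,
        "ordinal": ordinal,
    }
```

A receiver that does not recognize the ordinal can still read the `flexible` field, which is what lets it choose between terminating and handling the unknown interaction.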
Changes to the wire format: result union
The result union, which today has two variants (ordinal 1 for success
response, ordinal 2 for error response) is expanded to have a third variant,
ordinal 3, which will carry a new enum fidl.TransportErr indicating
"transport level" errors.
As an example, the interaction:
open protocol AreYouHere {
    flexible Ping() -> (struct { pong Pong; }) error uint32;
};
has the response payload:
type result = union {
    1: response struct { pong Pong; };
    2: err uint32;
    3: transport_err fidl.TransportErr;
};
Specifically, if a flexible method uses the error syntax the success type and
error type are set accordingly (ordinal 1 and 2 respectively). Otherwise, if a
flexible method does not use the error syntax, the error variant of the result
union (ordinal 2) is marked reserved.[^4]
Some clarifications:[^5]
- We are choosing the name transport_err since from an application standpoint, where that error came from should be indistinguishable. There are application errors, and then "transport errors" which is a mixed bag of errors due to FIDL encoding/decoding, FIDL protocol errors, kernel errors, etc. Essentially, "transport errors" is the set of all the kinds of errors which can occur in the framework (which includes many layers of software).
- We define the type fidl.TransportErr to be a strict int32 enum with a single variant, UNKNOWN_METHOD. The value for this variant is the same as ZX_ERR_NOT_SUPPORTED; that is -2:

      type TransportErr = strict enum : int32 {
          UNKNOWN_METHOD = -2;
      };

  When presenting transport errors to the client, if the binding provides a way to get a zx.status for an unknown interaction transport_err, the binding is required to use ZX_ERR_NOT_SUPPORTED. However, bindings are not required to map unknown interaction transport_err to zx.status if that does not fit how they surface errors to the client.

  An alternative approach would be to just use zx.status, and always use ZX_ERR_NOT_SUPPORTED as the value to indicate an unknown method, but that has two significant downsides:
  - It requires a dependency on library zx, which may not be directly used by many libraries. This makes it difficult to define the result union in the IR, as we either need to auto-insert a dependency on zx or downgrade the type to int32 in the IR but have generated bindings treat it as zx.status.
  - It does not define how bindings should handle transport_err values which are not ZX_ERR_NOT_SUPPORTED. By specifying that the type is a strict enum, we clearly define the semantics for bindings which receive a transport_err value which is not recognized; it is then treated as a decode error.
- We refer to "the result union" singular for simplicity when in fact we describe a class of union types which share a common structure, i.e. three ordinals: the first variant is unconstrained (the success type can be anything), the second variant must be int32, uint32, or an enum thereof, and the third variant must be a fidl.TransportErr.
Changes to the JSON IR
We expose the strictness for interactions in the JSON IR. In practice, we update
the #/definitions/interface-method type, and add a strict boolean as a
sibling of ordinal, name, is_composed, etc.
We expose the mode of a protocol in the JSON IR. In practice, we update the
#/definitions/interface type, and add a mode enum with members closed,
ajar and open as a sibling of composed_protocols, methods, etc.
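As an illustration of how a backend might consume these two additions, the Python sketch below reads a heavily abridged protocol entry. Only the field names `mode`, `strict`, `ordinal`, `name`, `is_composed`, `composed_protocols` and `methods` come from the text above; the protocol name, the ordinal value and the helper `flexible_methods` are hypothetical.

```python
import json

# Hypothetical, abridged IR entry for a single protocol. Real IR entries
# carry many more fields, and real ordinals are 64-bit hashes.
PROTOCOL_IR = json.loads("""
{
  "name": "example/AreYouHere",
  "mode": "open",
  "composed_protocols": [],
  "methods": [
    {"ordinal": 1, "name": "Ping", "is_composed": false, "strict": false},
    {"ordinal": 2, "name": "Shutdown", "is_composed": false, "strict": true}
  ]
}
""")


def flexible_methods(protocol):
    """Names of the methods whose `strict` boolean is false."""
    return [m["name"] for m in protocol["methods"] if not m["strict"]]
```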
Changes to the bindings
We want the automatic handling of requests to be visible in the bindings. For instance, while the bindings may be able to automatically construct a response indicating that the request was unknown, it is important to raise to the application both the fact that an unknown request was received (possibly with some metadata about the request), and the choice to respond with "request unknown" or abruptly terminate the communication.
At rest concerns.
- In the case of flexible interactions, the bindings should present the transport_err variant of the result union to the client through the same mechanism that they use to present other transport-level errors such as errors from zx_channel_write or errors during decoding. The err and response variants of the result union should be presented to the client the same way that the bindings would present those types if the method was declared as strict.

  For example, in the Rust bindings, Result<T, fidl::Error> is used to present other transport-level errors from calls, so transport_err should be folded into fidl::Error. Similarly, in the low-level C++ bindings, fit::result<fidl::Error> is used to convey transport-level errors, so transport_err should be merged into fidl::Error. The response and err variants would be conveyed the same way as for a strict method. In Rust that would mean Result<Result<T, ApplicationError>, fidl::Error> for a method with error syntax, or Result<T, fidl::Error> for a method without error syntax, with the response value being T and the err value being ApplicationError.
- For bindings which fold errors into a zx.status, the transport_err value UNKNOWN_METHOD must be converted to ZX_ERR_NOT_SUPPORTED.
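The folding described above can be sketched in Python, using exceptions in place of the nested Result types that the Rust bindings would use. This is an illustration under stated assumptions, not how any real binding is implemented: the class and function names are hypothetical, and the treatment of ordinals other than 1, 2 and 3 is an assumption that the text above does not specify.

```python
ZX_ERR_NOT_SUPPORTED = -2
UNKNOWN_METHOD = -2  # sole variant of fidl.TransportErr


class ApplicationError(Exception):
    """Carries the `err` variant (ordinal 2) of the result union."""

    def __init__(self, err):
        super().__init__(err)
        self.err = err


class TransportError(Exception):
    """Stands in for a binding's existing transport-level error type."""

    def __init__(self, reason, status=None):
        super().__init__(reason)
        self.status = status


def present_result(ordinal, value):
    """Surface a decoded result union to the caller."""
    if ordinal == 1:
        return value  # success response
    if ordinal == 2:
        raise ApplicationError(value)
    if ordinal == 3:
        if value == UNKNOWN_METHOD:
            # Folded into the transport error, mapped to the zx.status value.
            raise TransportError("unknown method", status=ZX_ERR_NOT_SUPPORTED)
        # Strict enum: an unrecognized value is treated as a decode error.
        raise TransportError("decode error: unrecognized transport_err")
    # Assumption: any other ordinal is also reported as a decode error.
    raise TransportError("decode error: unexpected result ordinal")
```

With this shape, a caller handles an unknown method through the same `TransportError` path it already uses for channel or decoding failures, while application errors stay separate.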
 
Dynamic concerns.
- When sending a request using zx_channel_write, zx_channel_call or their siblings, the dynamic flags must be set as follows:
  - Strictness bit (bit 7) must be set to 0 for strict interactions, and must be set to 1 for flexible interactions.
  - The remaining seven bits (bits 6 through 0) must be set to 0.
- When receiving a known interaction:
  - No change from how bindings work today.
  - Specifically, bindings should not verify the strictness, to ease the migration from strict to flexible interactions (or vice versa).
- When receiving an unknown interaction (i.e. unknown ordinal):
  - If interaction is strict (as indicated by the received strictness flag):
    - Bindings must close the communication (i.e. close the channel).
  - If interaction is flexible (as indicated by the received strictness flag):
    - For closed protocols, bindings must close the channel.
    - If the interaction is one way (transaction id is zero):
      - Bindings must raise this unknown interaction to the application (details below).
    - If the interaction is two way (transaction id is non-zero):
      - For ajar protocols, bindings must close the channel.
      - For open protocols, bindings must raise this unknown interaction to the application (details below).
- Details about raising an unknown interaction:
  - If the interaction is two way, bindings must respond to the request by sending a result union with the third variant selected, and a fidl.TransportErr of UNKNOWN_METHOD. This must happen before the unknown interaction is raised to user code.
  - Bindings should raise the unknown interaction to the application, possibly by invoking a previously registered handler (or similar).
  - It is recommended for bindings to require the registration of an unknown interaction handler to avoid building in "default behavior" that could be misunderstood. Bindings can offer a "no-op handler" or similar, but it is recommended for its use to be explicit.
  - Bindings MAY choose to offer the option to the application to close the channel when handling unknown interactions.
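The decision tree for an unknown ordinal can be summarized as a single function. The Python sketch below is illustrative only: `on_unknown_interaction` and the returned action strings are hypothetical names, and the mode is passed as a plain string.

```python
STRICTNESS_BIT = 0x80  # bit 7 of the dynamic flags byte


def on_unknown_interaction(mode, dynamic_flags, txid):
    """Decide what a binding does with a message whose ordinal is unknown.

    mode: "closed", "ajar" or "open" (mode of the receiving protocol).
    dynamic_flags: dynamic flags byte from the received header.
    txid: transaction id from the received header (zero means one way).
    """
    flexible = bool(dynamic_flags & STRICTNESS_BIT)
    two_way = txid != 0

    if not flexible:
        return "close_channel"  # strict: always fatal
    if mode == "closed":
        return "close_channel"  # closed protocols accept nothing flexible
    if not two_way:
        return "raise"          # ajar or open, one way
    if mode == "ajar":
        return "close_channel"  # ajar protocols reject flexible two way
    return "reply_then_raise"   # open, two way: reply UNKNOWN_METHOD first
```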
 
When an unknown message contains handles, the server must close the handles in the incoming message. The server must close all handles in the incoming message before:
- closing the channel, in the case of a strict method, a flexible method on a closed protocol, or a flexible two-way method on an ajar protocol
- replying to the message, in the case of a flexible two-way method on an open protocol
- notifying user code of the unknown method call, in the case of a flexible one-way method on an open or ajar protocol.
Likewise, when a client receives an unknown event which contains handles, the client must close the handles in the incoming message. The client must close all handles in the incoming message before:
- closing the channel, in the case of a strict event or a flexible event on a closed protocol.
- notifying user code of the unknown event, in the case of a flexible event on an open or ajar protocol.
In general, when an unknown interaction is handled, the order of operations is as follows.
- Close handles in the incoming message.
- If applicable, close the channel or send the UNKNOWN_METHOD reply.
- Raise the unknown interaction to the unknown interaction handler or report an error.
In asynchronous environments where multiple threads may be simultaneously attempting to send/receive messages on the channel, it may not be possible or practical to guarantee the channel is closed before reporting the unknown method error. Therefore it is not required to close the channel before reporting an error for an unknown method or event when that interaction is fatal. However, for recoverable unknown interactions as specified in this RFC, it is required to close handles and reply (if applicable) before dispatching the unknown interaction handler.
Previous versions of this RFC did not specify ordering between closing handles in incoming messages, responding to unknown two-way methods, and raising unknown interactions to the user.
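The ordering requirement can be illustrated with a small Python sketch that returns the steps a binding would perform for one unknown message. The function name, the action strings and the step labels are hypothetical; a real binding would perform the operations instead of listing them.

```python
def steps_for_unknown(action, handles):
    """Return the ordered steps for handling one unknown message.

    action: "close_channel", "reply_then_raise" or "raise".
    handles: handles carried by the incoming message.
    """
    # 1. Handles in the incoming message are always closed first.
    steps = [f"close_handle:{h}" for h in handles]

    if action == "close_channel":
        # 2. and 3. Fatal case. Closing before reporting is the general
        # order, though the text above does not require it here.
        steps += ["close_channel", "report_error"]
    elif action == "reply_then_raise":
        # 2. The UNKNOWN_METHOD reply must precede the handler call.
        steps += ["send_unknown_method_reply", "raise_to_handler"]
    elif action == "raise":
        # 3. One way flexible interaction: nothing to send back.
        steps.append("raise_to_handler")
    else:
        raise ValueError(f"unexpected action: {action}")
    return steps
```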
Compatibility implications
ABI compatibility
Changing an interaction from strict to flexible, or flexible to strict
is not ABI compatible.
Changing a protocol mode (e.g. from closed to ajar) is not ABI
compatible. While it might seem like changing from a more restrictive mode to a
less restrictive mode could be ABI compatible, it actually is not due to
protocols defining both the sender and receiver side, at once (fire-and-forget
and events).
All changes can be soft transitioned. Modifiers can be versioned if need be.
Source compatibility
Changing an interaction from strict to flexible, or flexible to strict
may be source compatible. Bindings are encouraged to offer the same API
regardless of the strictness of interactions, by folding unknown interaction
errors into existing transport error APIs.
Changing a protocol mode (e.g. from closed to ajar) is not
source compatible. Bindings are encouraged to specialize the API they offer
depending on the protocol mode. As an example, a closed protocol does not need
to offer an "unknown method" handler, and is encouraged not to provide such
a handler which will go unused.
Relation to platform versioning
As detailed in the evolution section of RFC-0002, we "change the ABI revision whenever the platform makes a backwards-incompatible change to the semantics of the Fuchsia System Interface".
One metric of how well we achieve our updatable goal is the pace at which we mint new ABI revisions. Since adding or removing flexible interactions can be made in a backwards compatible way, this feature will help with improving Fuchsia's updatability.
Implementation
- We can imagine a world where bindings only implement the strict part of the spec. This would be safe in that communication would stop early, as if the peer had encountered some other error or bug.
- Given the importance of evolvability to FIDL, the #1 goal, this is not a desirable future, and we therefore require bindings to adhere to this specification.
- In order to comply with the bindings specification, bindings MUST implement strict and flexible interaction semantics, as well as the three modes for protocols.
- With that in mind, we detail changes to the bindings specification. This is ABI breaking, and is a major evolution of the wire format (which covers both "at rest" and "dynamic" concerns).
A previous version of this RFC called for gating the rollout of unknown
interactions behind a new magic number. However, as specified, unknown
interactions is backwards compatible with existing protocols, since the header
bit used to indicate strictness was previously unused/reserved and the wire
format only changes for flexible two way methods, which can only exist in open
protocols. Instead of changing the magic number, we will use a two stage rollout
where we enable unknown interactions support but have the default modifiers set
to closed and strict, then add those modifiers explicitly to existing FIDL
files, then change the defaults to open and flexible.
Performance considerations
No impact to closed protocols. It is not necessary for closed protocols to
check the strictness bit, as noted in the changes to the
bindings section.
Small impact for ajar and open protocols:
- Processing an unknown interaction is similar to handling a known interaction: a pre-registered handler is invoked, and application code is run.
- Furthermore, in the case of a two way unknown interaction (only open protocols), a response will be constructed and sent by the bindings.
It is our expectation that performance considerations rarely matter, and that the choice of protocol mode will be mostly guided by security considerations.
Ergonomics
This makes FIDL more complex to understand, but addresses a very important need around evolvability which has been a sharp edge until now.
Backwards Compatibility
This feature is not backwards compatible, and will require a soft migration of all FIDL clients and servers.
Security considerations
Adding the ability to send unknown requests to peers (i.e. in the case of flexible interactions) opens the door to security concerns.
For particularly sensitive protocols, evolution concerns may need to be
preempted by the need for very rigid interactions, and therefore favor the
use of closed protocols. It is expected that most of the inner bowels of
Fuchsia rely on closed protocols (e.g. fuchsia.ldsvc).
When considering ajar or open protocols, there are two concerns
that FIDL authors need to consider:
- Malicious peer sending unknown requests with large payloads. (This is similar to the concern which exists when using flexible types which can carry large unknown payloads as well.) As noted in size is ABI-impacting, further features are required to provide control to FIDL authors, and will be addressed in future work.
- Opening the door to protocol sniffing, where a peer attempts to discover which methods are implemented without a priori knowledge, then works to craft a message to exploit discovered methods. This can be problematic if an implementation exposes more methods than intended. For instance, intending to expose a parent protocol but instead binding a child protocol composing the parent. Note that the attack vector is not changed by flexible interactions, but it may be more easily exploitable due to the ability for a peer to attempt multiple ordinals one after the other, without having to reconnect (which could be prohibitively expensive in some cases).
- When balancing between opting for an ajar versus an open protocol, consider that a peer is unable to tell whether a one way interaction was processed or ignored, whereas in the case of a two way unknown interaction (as an open protocol allows), the processing peer discloses its inability to understand an interaction, and in so doing, may reveal valuable information to a malicious peer.
Privacy considerations
Opening the door to protocol sniffing could lead to privacy concerns. As noted in the security considerations section, this threat model is not changed by this RFC but it could be exploited more easily.
Testing
The key to developing the new set of functionality described in this RFC is ensuring that all bindings follow the same specification, and all behave similarly. To that end, one needs to be able to express the specification in tests, e.g. "send this request, respond with correct transaction id, but wrong ordinal, expect sender channel to close". It is our experience that additional focus on fluently expressing the specification results in increased testing, and as a result, increased compliance by all bindings to the spec, along with increased regression protection.
We will follow the same approach taken with encoding and decoding, which culminated in the development of GIDL: start by writing tests by hand, exercise as many bindings as possible, and little by little generalize the parts that can be generalized, with an eye towards a declarative testing approach. While it is our hope that we can build a tool similar to GIDL for dynamic concerns, which is what we will strive towards, we are not anchoring this as an end result and may instead prefer fluently expressed tests written by hand.
Documentation
There will be extensive documentation for this feature. On the specification side:
Additional entries in the FIDL API Rubric will be added covering protocol evolution.
On the concrete use of this feature in a given target language, we expect every single binding to update its documentation, and provide working examples.
Drawbacks, alternatives, and unknowns
Drawback: maximum size of message is ABI-impacting
An issue with dealing with unknowns, be it unknown payloads as can be experienced
with flexible types or unknown interactions as introduced here, is that the
maximum size of a message expected to be read by a peer is ABI-impacting,
without this limit ever being explicitly described, nor statically verified.
Currently, there is no vectorized read of a channel, nor is there the ability to do a partial read. As a result, a message can be sent to a peer which satisfies all requirements (e.g. a flexible interaction sent to a peer that accepts unknown interactions) and yet results in failed communication, thus breaking ABI. If the message in question is too big for the peer to read, because that peer expects messages of, say, less than 1 KiB, then a new message over that limit will never be read. Instead, the channel will be closed, and the communication between the two peers aborted.
The introduction of flexible interactions makes this problem more likely to
occur. The problem itself is already present due to flexible types.
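The failure mode described above can be sketched with a toy model. This is not the Zircon API: `ToyChannel`, `read_with_buffer`, and the 1 KiB buffer size are hypothetical names and values chosen for illustration. The sketch models only the behavior stated above, namely that a read takes the whole message or fails, and that the reader closes the channel on failure.

```python
class ToyChannel:
    """Toy model of a channel with all-or-nothing reads (no partial or vectorized read)."""

    def __init__(self):
        self.queue = []
        self.closed = False

    def write(self, message: bytes) -> None:
        if self.closed:
            raise RuntimeError("channel closed")
        self.queue.append(message)

    def read_with_buffer(self, capacity: int):
        """Return the next message, or close the channel if it does not fit."""
        message = self.queue[0]
        if len(message) > capacity:
            # The reader cannot take only the header and skip the rest,
            # so it gives up and communication is aborted.
            self.closed = True
            self.queue.clear()
            return None
        return self.queue.pop(0)


PEER_BUFFER_SIZE = 1024  # hypothetical: this peer expects messages under 1 KiB

channel = ToyChannel()
channel.write(b"\x00" * 512)   # fits in the peer's buffer
small = channel.read_with_buffer(PEER_BUFFER_SIZE)

channel.write(b"\x00" * 4096)  # a newer, larger, otherwise valid message
large = channel.read_with_buffer(PEER_BUFFER_SIZE)
```

In this model the second message is well-formed from the sender's point of view, yet the only outcome available to the reader is to drop the connection.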
Some ideas for future direction might be:
- A vectorized channel read, making it possible for a recipient to, for instance, read only the header of a message, then decide whether to read the rest of the payload or discard that message (this would also require a new syscall).
- Making the maximum size of a message an explicit property of a protocol,
possibly with pre-defined size categories such as small, medium, large, or unbounded.
Alternative: comparison to the command pattern
The command pattern is useful to allow clients to batch many requests to be processed by a server. It is also possible to use the command pattern to achieve the kind of evolvability described in this RFC.
Consider for instance:
open protocol AnOpenProtocol {
    flexible FirstMethod(FirstMethodRequest) -> (FirstMethodResponse);
    flexible SecondMethod(SecondMethodRequest) -> (SecondMethodResponse);
};
This can be approximated with the closed protocol which follows, i.e. this is what one would have to resort to with the FIDL feature set today to achieve the same level of evolvability:
closed protocol SimulateAnOpenProtocol {
    strict Call(Request) -> (Response);
};
type Request = flexible union {
    1: first FirstMethodRequest;
    2: second SecondMethodRequest;
    ...
};
type Response = flexible union {
    1: first FirstMethodResponse;
    2: second SecondMethodResponse;
    ...
    n: transport_err zx.status;
};
Unsurprisingly, the command pattern approach is unsatisfactory.
Since we have to match each request to a response in the union, we lose syntactic enforcement of "matching pairs" which in turn also causes a loss of syntactic locality.
Since an unruly server could respond with SecondMethodResponse to a
FirstMethodRequest, we also lose type safety. One could argue that smart
bindings could notice this pattern, maybe with the help of an @command
attribute, and provide the same ergonomics we do today for methods.
At a wire level, the command pattern forces "two method discriminators" of
sorts. We have the ordinal in the transactional message header (identifying
Call as the interaction), and we have the union ordinal (identifying which
variant of the union is selected, i.e. 1 for FirstMethodRequest, 2 for
SecondMethodRequest).
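The "two discriminators" point, and the loss of type safety mentioned earlier, can be sketched as follows. The ordinal value and function names are hypothetical and exist only for illustration; real FIDL method ordinals are 64-bit values, and real messages are encoded bytes rather than Python integers.

```python
CALL_ORDINAL = 0x1234  # hypothetical ordinal for SimulateAnOpenProtocol.Call

# Union ordinals, mirroring the Request and Response unions above.
FIRST, SECOND = 1, 2


def dispatch(header_ordinal: int, union_ordinal: int) -> str:
    """Server-side dispatch needs two lookups: the method, then the variant."""
    if header_ordinal != CALL_ORDINAL:
        return "unknown method"
    if union_ordinal == FIRST:
        return "first"
    if union_ordinal == SECOND:
        return "second"
    # The union is flexible, so an unknown variant is tolerated here.
    return "unknown variant"


def response_matches(request_union_ordinal: int, response_union_ordinal: int) -> bool:
    """Pairing check that the client must perform by hand: nothing in the
    protocol declaration ties a request variant to a response variant."""
    return request_union_ordinal == response_union_ordinal
```

With method declarations, the second function is unnecessary because the response type is fixed by the method itself.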
Here again, one could argue that if all methods followed the command pattern, i.e. all methods' requests and responses were unions, we would not need the ordinal in the transactional message header. Essentially, the open protocol described above would "compile down to" the closed protocol using the command pattern. However, the wire format of a union requires counting the bytes and handles of the variant, and requires these counts to be validated by a compliant decoder. This is problematic on two fronts:
- The rigidity which the transactional message header allows (no description of the payload, decode if you can) is unmatched by the union wire format (by design, actually). This rigidity and simplicity is particularly well suited for low level uses, which FIDL over-rotates towards.
- The compositional model does not have any sense of "a protocol grouping". This is very powerful since we can (and do) multiplex multiple protocols over the same channel. We use structured composition when possible (i.e. the compose stanza), and also resort to dynamic composition (e.g. service discovery). If we took the view that "all compiles down to a union" we would impose a rigid grouping.
Lastly, there has been a desire from certain FIDL authors to have "automatic
batching of requests". For instance, the
fuchsia.ui.scenic library is famous for its
use of the command pattern in the fuchsia.ui.scenic/Session.Enqueue method.
However, providing "automatic batching of requests" is a dangerous feature to
consider since the semantics of how to process multiple commands in one unit
tend to differ widely from one application to another. How should we deal with
unknown commands? How should we deal with commands that fail? Should commands be
ignored, stop execution, cause an abort and rollback? Even RDBMSs, which
are designed around the notion of 'a batched unit of work' (a transaction), tend
to offer many batching modes ([isolation
levels](https://en.wikipedia.org/wiki/Isolation_(database_systems))). Suffice it
to say that FIDL has no plans to support "automatic batching of requests".
All in all, while on the surface it might look like the semantics of strict and flexible interactions are the same as the command pattern, they are sufficiently different that special semantics are warranted.
Alternative: protocol negotiation
What is protocol negotiation
Protocol negotiation is a broad term describing the set of techniques for peers interacting with each other to progressively build up context about each other, thus allowing them to have correct, faster, more efficient communication.
For instance, imagine calling a phone number at random. Maybe the peer will start with "So and so, yes?". You went from no context about the peer to some identification. We can continue with "Oh, so and so. Did I get this right?". Given the prevalence of marketing calls, it's likely that you will now be faced with a "What is this call about? Who are you?". And so on and so forth, with both peers little by little discovering who the other is, and what capabilities they have.
Peers may negotiate over several kinds of questions:
- Which data elements are understood? For example, indicating to the peer which fields of a table are desired, to avoid the peer generating lots of complicated data only for it to be ignored upon receipt.
- What methods does the peer support? In a rendering engine, you can imagine asking whether alpha blending is available as a feature, and if not, adapting the interactions with the renderer (possibly by sending different content).
- What performance characteristics should be used? It is common to negotiate the size of buffers, or the frequency of calls one is allowed to make (think quota).
Each kind tends to require slightly different solutions, though all are essentially turning an abstract description of an interaction model (e.g. "the set of methods a peer understands") into data which can be exchanged.
To solve protocol negotiation well, the first step is to provide a way to
describe these concepts ("a protocol", "the response type of method foo"). And
because the peers are starting with a low context world, i.e. they do not know
about each other, and must assume that they have a different definition of the
world, the description of the concepts tends to rely on structural properties.
For instance, saying "response type is MyCoolType" is meaningless and up to
interpretation, but saying "response type is struct { bool; }" stands on its
own and can be interpreted context-free.
How protocol negotiation relates to strict and flexible interactions
What is proposed in this RFC, strict and flexible interactions, provides some wiggle room when it comes to evolving protocols. It is now possible to add or remove a method, maybe even a few. But abuse these evolution powers, and you end up with a protocol that becomes amorphous, and whose domain is hard to understand from its shape. This is similar to tables which over time accumulate a myriad of fields because they now represent a sort of "aggregate struct" combining multiple sets of requirements which changed over time.
In contrast, protocol negotiation makes it possible -- when used well -- to
isolate the versioning burden, and after some dynamic choice (the negotiation),
land on a much cleaner and more rigid protocol (possibly a closed protocol).
Both approaches to evolution have their place, and both are needed in the evolution toolbox.
Alternative: placing strictness bit in transactional identifier
Using transactional identifiers to convey the bits required for strict and
flexible interactions has one important drawback. Some transactional identifiers
are generated by the kernel, i.e. zx_channel_call treats the first four
bytes of a message as a transaction identifier of type zx_txid_t. Packing more
information into the transactional identifiers forces a stronger coupling
between the kernel and FIDL, which is not desirable. By using transactional
header flags instead, FIDL code using zx_channel_call can continue to
structure everything in the header except for the identifier.
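A minimal sketch of why header flags keep the kernel and FIDL decoupled. It assumes a 16-byte transactional header laid out as a 4-byte transaction identifier, two at-rest flag bytes, one dynamic flag byte, a 1-byte magic number, and an 8-byte ordinal, and it assumes the strictness bit is the most significant bit of the dynamic flags byte. The helper names are hypothetical, and `assign_txid` is only a stand-in for the kernel overwriting the first four bytes during zx_channel_call.

```python
import struct

# Assumed layout: txid (4 bytes), at-rest flags (2 bytes), dynamic flags
# (1 byte), magic number (1 byte), ordinal (8 bytes), little-endian.
HEADER = struct.Struct("<IBBBBQ")
FLEXIBLE_BIT = 0x80  # assumed position of the strictness bit in the dynamic flags


def encode_header(txid: int, ordinal: int, flexible: bool) -> bytes:
    dynamic_flags = FLEXIBLE_BIT if flexible else 0
    return HEADER.pack(txid, 0, 0, dynamic_flags, 1, ordinal)


def assign_txid(message: bytes, txid: int) -> bytes:
    """Stand-in for the kernel: overwrite the first four bytes and
    touch nothing else in the message."""
    return struct.pack("<I", txid) + message[4:]


def is_flexible(message: bytes) -> bool:
    _, _, _, dynamic_flags, _, _ = HEADER.unpack_from(message)
    return bool(dynamic_flags & FLEXIBLE_BIT)
```

Because the strictness information lives outside the first four bytes, rewriting the identifier cannot disturb it, and the kernel needs no knowledge of the bit.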
Alternative: interaction mode bit
An earlier version of this RFC called for adding an "interaction mode" bit to delineate one way interactions from two way interactions (and expected to expand to more complex interactions such as terminal interactions).
The main drawback is that the interaction mode bit is redundant with the information provided in a transaction identifier: one way interactions have a zero transaction identifier, two way interactions have a non-zero transaction identifier. Due to information redundancy, this opens the door to different implementations (e.g. bindings) using different subsets of the redundant bits to decide how to process the message. This in turn opens the door to maliciously crafting a message which is interpreted differently by different parts of the system.
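The redundancy hazard can be illustrated with a small sketch. The `Header` type and the two classification functions are hypothetical; they stand for two implementations that each consult a different one of the redundant pieces of information.

```python
from dataclasses import dataclass


@dataclass
class Header:
    txid: int
    two_way_bit: bool  # the hypothetical, redundant "interaction mode" bit


def is_two_way_by_txid(h: Header) -> bool:
    """Hypothetical implementation A: trusts the transaction identifier."""
    return h.txid != 0


def is_two_way_by_mode_bit(h: Header) -> bool:
    """Hypothetical implementation B: trusts the interaction mode bit."""
    return h.two_way_bit


def implementations_agree(h: Header) -> bool:
    return is_two_way_by_txid(h) == is_two_way_by_mode_bit(h)


well_formed = Header(txid=7, two_way_bit=True)
crafted = Header(txid=0, two_way_bit=True)  # deliberately inconsistent
```

For the crafted header, one implementation would treat the message as one way while the other would wait to send a response. With a single source of truth, this disagreement cannot arise.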
While we have the ambition both to assign transaction identifiers to all interactions and to expand interaction modes, changes that would necessitate extra bits such as the interaction mode bit discussed here, we prefer to table this design discussion until those features are designed.
Alternative: on naming
As this RFC iterated, there was a lot of discussion about how to properly name the new concepts introduced. We summarize here some of that discussion.
To delineate interactions which can be "unknown" versus those which need to be "known":
- open and closed: original names chosen.
- (none) and required: in the sense that your peer must implement the method, else the protocol is terminated.
- Finalist: flexible and strict, borrowing from RFC-0033: Handling of unknown fields and strictness.
To delineate protocols which can never receive unknown interactions, from protocols which can receive one way unknown interactions, from protocols which can receive both one way and two way unknown interactions:
- static, standard, dynamic: original names chosen. A slight drawback of "static" and "dynamic" is that we have been using the terms "at-rest" and "dynamic" to refer to the wire format and messaging aspects of FIDL. For example, parts of this RFC refer to "dynamic concerns", where "dynamic" has a different meaning than in "dynamic protocols".
- strict, (none), flexible: again borrowing from RFC-0033.
- In lieu of static, using sealed to highlight that the protocol cannot expand easily.
- In lieu of standard, using hybrid or mixed.
- Finalist: closed, ajar, and open. Since open and closed are not used for interactions, we can put them to use for protocol modifiers. The definition of ajar is literally "partially opened" which is exactly the concept we mean to describe. Yes, all concerned felt it had a bit of a spooky twist to it.
Prior art and references
(As mentioned in the text.)
- Confusingly, a message (as opposed to a transactional message) refers to the encoded form of a FIDL value.
- For fidlc and JSON IR aficionados, note that the internals of the compiler represent an event as having a maybe_request_payload equal to nullptr and a maybe_response_payload that is present. From a model standpoint however, we call this payload a request but with a server-to-client direction. We should align to the compositional model, and change fidlc and the JSON IR. This is out of scope of this RFC, but noted for completeness.
- We prefer having a liberal grammar, along with a style guide enforced by linting. This design choice is motivated by wanting both a language that is more approachable to newcomers and very explicit (and in turn verbose) standards for the Fuchsia platform.
- It is worth noting that adding an error to a flexible interaction can be made as a soft ABI compatible change.
- We later renamed transport_err and TransportErr to framework_err and FrameworkErr, respectively. See https://fxbug.dev/42061151 for details.