Identifier Uniqueness — SnowFlake vs SNOW_FLAKE
The FIDL specification and front-end compiler currently considers two identifiers to be distinct based on simple string comparison. This proposes a new algorithm that takes into account the transformations that bindings generators make.
Language binding generators transform identifiers to comply with target language constraints and style that map several FIDL identifiers to a single target language identifier. This could cause unexpected conflicts that aren't visible until particular languages are targeted.
This proposes introducing a constraint on FIDL identifiers that no existing libraries violate. It doesn't change the FIDL language, IR (yet ), bindings, style guide or rubric.
In practice, identifiers consist of a series of words that are joined together.
The common approaches for joining words are
CamelCase, where a transition
from lower to upper case is a word boundary, and
snake_case, where one or
many underscores (
_) are used to separate words.
Identifiers should be transformed to a canonical form for comparison.
This will be a
lower_snake_case form, preserving the word separation in the
Words are broken on transitions from lower-case or digit to upper-case and
where there are underscores.
In FIDL, identifiers must be used in their original form.
So if a type is named
FooBar, attempting to refer to it as
foo_bar is an error.
There is a simple algorithm to carry out this transformation, here in Python:
def canonical(identifier): last = '_' out = '' for c in identifier: if c == '_': if last != '_': out = out + '_' elif (last.islower() or last.isdigit()) and c.isupper(): out = out + '_' + c.lower() else: out = out + c.lower() last = c return out
Some examples, with their possible translation in various target languages:
|FIDL identifier||Canonical form||C++||Rust||Go||Dart|
The front-end compiler will be updated to check that each new identifier's canonical form does not conflict with any other identifier's canonical form.
The next version of the FIDL IR should be organized around canonical names rather than original names, but the original name will be available as a field on declarations. If we can eliminate the use of unmodified names in generated bindings then the original names can be dropped from the IR.
This codifies constraints on the FIDL language that exist in practice.
Documentation and examples
The FIDL language documentation would be updated to describe this constraint. It would be expanded to include much of what's in the Design section above.
Because this proposal simply encodes existing practice, examples and tutorials won't be affected.
Any existing FIDL libraries that would fall afoul of this change violate our style guides and won't work with many language bindings. This does not change the form of identifier that is used to calculate ordinals.
This imposes a negligible cost to the front-end compiler.
There will be extensive tests for the canonicalization algorithm
There will also be
fidlc tests to ensure that errors are caught when
conflicting identifiers are declared and to make sure that the original names
must be used to refer to declarations.
Drawbacks, alternatives, and unknowns
One option is to do nothing.
Generally we catch these issues as build failures in non-C++ generated bindings.
As Rust is used more in
fuchsia.git, the chance of conflicts slipping
through to other petals is lessened.
And these issues are already pretty rare.
The canonicalization algorithm is simple but has one unfortunate failure case
— mixed alphanumeric words in UPPER_SNAKE_CASE identifiers might be
This is because the algorithm treats digits as lower-case letters.
We have to break on digit-to-letter transitions because
Identifiers with no lower-case letters could be special cased — only
breaking on underscores — but that adds complexity to the algorithm and
perhaps to the mental model.
The canonical form could be expressed as a list of words rather than a lower_camel_case string. They're equivalent and in practice it's simpler to manage them as a string.
We could use identifiers' canonical form when generating ordinals. That would make this a breaking change for no obvious benefit. If there is an ordinal-breaking flag day in the future then we could consider that change then.
Initial rejection, and second review
During its first review on 4/18/2019, this FTP was rejected with the following rationale.
- Two opposing views on solving this class of problems.
- Work to model target languages' constraints to maintain as much
flexibility in FIDL as possible, even if that is different than the
That's the approach taken by this FTP.
- Pros: Keeps flexibility for eventual uses of FIDL beyond Fuchsia, more pure from a programming language standpoint.
- Cons: Scoping rules are more complex, style is not enforced, but encouraged (through linting for instance). Could lead to APIs built by partners that do not conform to the Fuchsia style guide we want (since they are not required to run, or adhere to linting).
- Enforce style constraints directly in the language, which eliminates the
class of problem.
- Pros: Style is enforced, developers are told how things ought to be, or it doesn't compile.
- Cons: ingrains stylistic choices in the language definition, higher hill to climb for novice developers using FIDL.
- → We rejected the proposal, and instead prefer an approach that directly enforces style in the language.
- → Next step here is a formal proposal to make this happen, and clarifies
all aspects of this (e.g., should
It was decided to overturn this decision based on the following observations:
The kernel API is now described FIDL. This pushed us to re-opened the identifier uniqueness issue last October, and we essentially overturned the decision, allowing both a 'C-style' and 'FIDL-style' to co-exist. This is checked by the fidl linter today.
We have seen other use cases pushing FIDL to generalize the transport, and with it possibly also local style rules to better support these domains.
Identifier clashes continue to be an issue, and modeling target language constraints is a well identified gap in the FIDL toolchain.
Prior art and references
In proto3 similar rules are applied to generate a
for JSON encoding.
until a new version of the IR schema which would likely carry names with additional structure, rather than the fully-qualified name as it exists today.