RFC-0181: Lockless Discardable VMO
Allow reads and writes to discardable VMOs that are in an unlocked state.
Date submitted (year-month-day): 2022-05-18
Date reviewed (year-month-day): 2022-08-03
Lockless discardable VMOs are a sub-type of discardable VMOs that may be accessed even when not locked. The VMO may still be discarded at any time, and users must be tolerant of sudden complete data loss. To improve usage in the motivating use case, the VMO hinting operations are extended to also cover discardable VMOs.
Tracing is a workload that needs minimal performance overhead and generates large bursts of data. Current discardable VMOs are not suitable for this due to the overhead of locking and unlocking. Regular anonymous VMOs have the downside that the bursts of data tracing produces may cause the system to OOM.
The tracing engine is able to tolerate arbitrary data loss. That is, the tracing VMO can be arbitrarily decommitted, even mid read or write, without causing problems beyond the loss of the trace data. Further, it is considered preferable for some trace data to be lost than for the whole system to OOM.
Allowing discardable VMOs to be used when unlocked solves all these problems since:
- System OOM will be avoided by tracing VMO contents being discarded.
- Writing to the trace VMOs incurs no additional overhead.
- Tracing is already tolerant of data loss and will perceive VMO discard as data loss.
Although the tracing system is tolerant of data loss, it is desirable to discard trace data only if needed to avoid OOM. There is currently no mechanism to control when the kernel will perform reclamation on discardable VMOs, or how it balances discarding VMOs against other forms of reclamation. To improve the reliability of trace generation without compromising OOM prevention, we would therefore like a way to tell the kernel to reclaim from these VMOs only if needed.
eieio@ and rashaeqbal@
Lockless discardable VMOs are a sub-type of regular discardable VMOs and shall
be created by passing the option ZX_VMO_DISCARDABLE_LOCKLESS to
zx_vmo_create, instead of the regular ZX_VMO_DISCARDABLE option. The created
discardable VMO still starts in the unlocked state. The only difference is that
whilst in the unlocked state, reads and writes to the VMO will succeed instead of
generating an error.
The VMO hinting range operations
have their API definitions extended to cover discardable VMOs, and not just
pager-backed VMOs.
Although a lockless discardable VMO can be used without being locked, it can
still be locked through the
ZX_VMO_OP_LOCK and related operations.
Locking behaves just as it does for a regular discardable VMO, and while locked
the contents cannot be discarded.
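The semantics described above can be sketched as a small user-visible model. This is only an illustration: ModelVmo, Kind, and every method name are hypothetical and are not Zircon APIs. The model also assumes that discarded contents read back as zeros, which is one way of representing the data loss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the user-visible behaviour; not a Zircon API.
enum class Kind { kDiscardable, kLocklessDiscardable };

class ModelVmo {
 public:
  ModelVmo(Kind kind, size_t size) : kind_(kind), data_(size, 0) {}

  // Locking works the same way for both kinds.
  void Lock() { ++lock_count_; }
  void Unlock() {
    assert(lock_count_ > 0);
    --lock_count_;
  }

  // Both accessors return false where the model treats the access as an error.
  bool Write(size_t offset, uint8_t value) {
    if (!AccessAllowed() || offset >= data_.size()) return false;
    data_[offset] = value;
    return true;
  }
  bool Read(size_t offset, uint8_t* out) const {
    if (!AccessAllowed() || offset >= data_.size()) return false;
    *out = data_[offset];
    return true;
  }

  // Models kernel reclamation: refused while locked, otherwise all data is lost.
  // Assumption: lost data is represented as zeros.
  bool Discard() {
    if (lock_count_ > 0) return false;
    data_.assign(data_.size(), 0);
    return true;
  }

 private:
  // A regular discardable VMO needs a lock; a lockless one never does.
  bool AccessAllowed() const {
    return lock_count_ > 0 || kind_ == Kind::kLocklessDiscardable;
  }

  Kind kind_;
  std::vector<uint8_t> data_;
  int lock_count_ = 0;
};
```

In this model an unlocked write to a kLocklessDiscardable VMO succeeds where the same write to a kDiscardable VMO fails, and Discard() is refused for either kind while a lock is held.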
The ZX_VMO_DISCARDABLE_LOCKLESS syscall flag needs to be passed down
through VMO creation and become a flag on the internal VMO.
The existing discardable VMO implementation explicitly causes faults when the VMO is unlocked: the couple of relevant code paths check the discardable state and generate an error if it is unlocked. These checks just need to be extended so that they do not generate the fault for the LOCKLESS variant.
Movement of discardable VMOs on the internal discardable lists also needs to change slightly. Since a lockless VMO might always have pages to discard, and not just if it has been locked since it was last discarded, it needs to be left on the discardable list after being discarded.
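The list handling described above could look roughly like the following sketch. ListedVmo and DiscardableList are hypothetical names chosen for illustration and do not correspond to the kernel's internal types. The model assumes that only unlocked VMOs are ever placed on the list.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical kernel-side model; names are illustrative, not Zircon internals.
struct ListedVmo {
  bool lockless = false;
  int lock_count = 0;
  bool has_pages = true;
};

class DiscardableList {
 public:
  void Add(ListedVmo* vmo) {
    if (!Contains(vmo)) vmos_.push_back(vmo);
  }

  bool Contains(const ListedVmo* vmo) const {
    return std::find(vmos_.begin(), vmos_.end(), vmo) != vmos_.end();
  }

  // Discards the pages of every listed VMO and returns how many VMOs
  // actually had pages to discard.
  int DiscardAll() {
    int discarded = 0;
    std::vector<ListedVmo*> remaining;
    for (ListedVmo* vmo : vmos_) {
      assert(vmo->lock_count == 0);  // Assumption: only unlocked VMOs are listed.
      if (vmo->has_pages) {
        vmo->has_pages = false;
        ++discarded;
      }
      // A regular VMO cannot gain pages again until it is locked, so it
      // leaves the list. A lockless VMO can be written at any time, so it
      // stays and is considered again on the next pass.
      if (vmo->lockless) remaining.push_back(vmo);
    }
    vmos_.swap(remaining);
    return discarded;
  }

 private:
  std::vector<ListedVmo*> vmos_;
};
```

After one DiscardAll() pass a regular VMO is gone from the list, while a lockless VMO remains, so pages it gains from a later unlocked write are still reclaimable.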
For reclamation changes, the range for
ZX_VMO_OP_ALWAYS_NEED will be ignored
and always promoted to the full VMO. A separate discardable list will be created, with
hinted VMOs being placed on it. This list will be evicted from under ALMOST_OOM
- Immediately discard the VMO if it's unlocked
- Move the VMO to the regular discard list if it wasn't already.
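A rough sketch of the separate hinted list follows, under stated assumptions. HintVmo, Reclaimer, and the almost_oom parameter are hypothetical names. The sketch also assumes that the regular list is evicted on every reclamation pass and that the hinted list is considered only under ALMOST_OOM. It models only the ALWAYS_NEED promotion and the two lists, not the bullet steps above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the hinted list; names are illustrative, not Zircon internals.
struct HintVmo {
  bool hinted = false;
  bool has_pages = true;
};

struct Reclaimer {
  std::vector<HintVmo*> regular_list;
  std::vector<HintVmo*> hinted_list;

  void Track(HintVmo* vmo) { regular_list.push_back(vmo); }

  // Models an ALWAYS_NEED hint: the offset and length are accepted but
  // ignored, so the hint is promoted to the whole VMO.
  void HintAlwaysNeed(HintVmo* vmo, uint64_t /*offset*/, uint64_t /*length*/) {
    assert(vmo != nullptr);
    if (vmo->hinted) return;
    vmo->hinted = true;
    regular_list.erase(
        std::remove(regular_list.begin(), regular_list.end(), vmo),
        regular_list.end());
    hinted_list.push_back(vmo);
  }

  // Returns the number of VMOs whose pages were discarded.
  // Assumption: the hinted list is only considered when almost_oom is true.
  int Reclaim(bool almost_oom) {
    int discarded = DiscardFrom(regular_list);
    if (almost_oom) discarded += DiscardFrom(hinted_list);
    return discarded;
  }

 private:
  static int DiscardFrom(const std::vector<HintVmo*>& list) {
    int discarded = 0;
    for (HintVmo* vmo : list) {
      if (vmo->has_pages) {
        vmo->has_pages = false;
        ++discarded;
      }
    }
    return discarded;
  }
};
```

Keeping hinted VMOs on their own list lets the model express "reclaim from these only if needed" as a simple ordering between the two lists, rather than as per-page state.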
Lockless discardable VMOs will not require the ZX_VM_ALLOW_FAULTS flag when mapped.
This could all be done as a single small CL to fuchsia.git with no other dependencies.
No performance impact on VMO operations is expected, as only a single additional boolean check is added to what are already slow-path scenarios.
Tests will be added to the Zircon core test suite.
Relevant syscall docs need to be updated:
- zx_vmo_op_range - Update statements about VMO type supported for
- zx_vmar_map - Update requirements of ZX_VM_ALLOW_FAULTS for lockless discardable VMOs.
Drawbacks, alternatives, and unknowns
The original discardable RFC describes an optimization using atomics that could provide the majority of the desired performance increase. There are two drawbacks to this approach:
- An atomic API between kernel and userspace is untrodden ground and would be a complicated space to explore, which is why it was not done in the original proposal.
- Locking pages, even if it's efficient, is still not needed and actually goes against our goal that these VMOs be discardable at absolutely any time.
Prior art and references