RFC-0270: Zircon Virtual Interrupt Signals | |
---|---|
Status | Accepted |
Areas |
|
Description | Proposes a new signal for virtual interrupt objects. |
Issues | |
Gerrit change | |
Authors | |
Reviewers | |
Date submitted (year-month-day) | 2025-04-02 |
Date reviewed (year-month-day) | 2025-04-16 |
Problem Statement
We need a better way to demultiplex shared interrupts and manage the acknowledged/unacknowledged state of virtual interrupts. See Extended Problem Statement section below for more detail.
Summary
This RFC proposes adding a new Zircon signal to virtual interrupt objects that remains asserted when the virtual interrupt object is in the "untriggered" state.
Background
Physical vs. Virtual Interrupts
Zircon interrupt objects come in two flavors, Physical interrupts, and Virtual interrupts.
Physical interrupt objects logically represent an interrupt signal delivered
through the system's interrupt controller, sometimes via a PCIe bus, and other
times directly to the controller via platform or SoC specific routing. In
contrast, virtual interrupt objects do not directly represent signals which
originate from non-CPU hardware. Instead, virtual interrupts are created by
programs in order to look like a physical interrupt, however they can only
become signaled when software calls zx_interrupt_trigger
on the object.
Custom waiting interfaces
Interrupt objects do not currently use the standard Zircon signaling system in
order to notify a client that an interrupt has been received. While they can be
bound to a port and deliver a port packet when an IRQ is received, and they can
be used to block a thread until an IRQ is received, they do not use the standard
zx_object_wait_one
/zx_object_wait_many
or zx_object_wait_async
syscalls
to achieve this behavior. Instead, users must use zx_interrupt_bind
to bind
an interrupt object to a port object if they wish to receive port packets, and
must use zx_interrupt_wait
if they want to block a thread. Only one logical
wait operation can be pending on an interrupt object at a time, arbitrary
numbers of wait operations are not supported. Once a thread has unblocked or a
port packet has been delivered, an interrupt needs to be "acknowledged" before
it can become signaled again.
When users are waiting for interrupts via zx_interrupt_wait
, an interrupt
becomes acknowledged the next time the user's thread blocks on the interrupt by
calling zx_interrupt_wait
again. When users bind their interrupt to a port
via zx_interrupt_bind
, they can acknowledge the interrupt by calling
zx_interrupt_ack
on the interrupt object, allowing a new port packet to be
delivered.
While this non-standard signaling interface is not the best thing in the world, and there is some desire to design a new interface which is compatible with the standard Zircon signaling machinery (in addition to addressing other pain points), a complete redesign of the Zircon interrupt APIs and transition to a new system would be a complicated undertaking, likely to take a significant amount of time and effort and is outside the limited scope of this proposal.
The Purpose of Virtual Interrupts
Virtual Interrupts serve two primary purposes in Fuchsia, both stemming from the same motivation. These are testing, and demultiplexing.
Because drivers need to use the special waiting interfaces used by interrupt objects, it is difficult to easily substitute a different kernel object (perhaps an Event) for an Interrupt object and have the driver code "just work". This presents an issue when writing driver tests where the test environment wants to mock hardware. There is no physical interrupt that can be provided to the driver, and even if there was, controlling such a thing for any given test is likely to be difficult and very target specific.
Instead, tests can use a Virtual Interrupt object. From a driver consumer's point of view, they use the same API and are basically indistinguishable allowing the code to be tested without needing any special case code which is aware of the difference between a testing environment and a real environment.
A similar situation can arise when an individual interrupt needs to be demultiplexed. Consider a driver for an external chip which needs to raise an interrupt. We'd like to have a single driver for the external chip, independent of the specific SoC used in a particular product which makes use of the external chip. The chip may have an IRQ signal which is connected to the SoC. This IRQ signal in some SoC designs might be routed directly to the SoC's interrupt controller, meaning that a dedicated physical interrupt can be instantiated and passed to the driver during initialization. Provided that managing the interrupt only requires interaction with the system's interrupt controller, the kernel can directly manage the IRQ via the physical interrupt object.
On the other hand, other designs might connect this interrupt to a GPIO as part of a block of interrupts (typically in blocks of 16 or 32). When any of the GPIOs in the block configured as an interrupt becomes signaled, the physical interrupt for the GPIO hardware is raised. The GPIO driver can then wake and figure out which of the multiple GPIO interrupts has fired, and mask the interrupt until it can be serviced. It still needs to signal the driver whose interrupt this is, however. We don't want the irq-consumer-driver to need to know the difference between these situations, nor do we want multiple drivers for different external chips to need to exist in the same process simply because the board design routed multiple interrupts to the same GPIO block. So, in situations like this, the GPIO driver can create virtual interrupt objects for each of the GPIOs configured as interrupts in the shared hardware, effectively demultiplexing multiple interrupts which share the same physical interrupt, and without requiring the driver to have code to handle the two situations differently.
Extended Problem Statement
As of today, however, there is a practical problem with this approach. Consider
the GPIO demultiplexing scenario. There are multiple GPIO interrupts sharing
the same block of HW, and the driver has created and distributed one virtual
interrupt for each of them. When one of the interrupts fires, the GPIO is
signaled via the shared physical interrupt for the block. The GPIO driver can
then mask the specific bit using the SoC specific GPIO hardware, and
zx_interrupt_trigger
the associated virtual interrupt. Now that all of the
pending GPIO interrupts have been signaled, it can acknowledge its own physical
interrupt, and go back to waiting for a new interrupt to be asserted.
Over on the driver/consumer side of things, the interrupt handler is woken, the interrupt is serviced, and finally acknowledged. At this point, the GPIO driver needs to take action. Specifically, it needs to wake up, and unmask/re-arm the specific bit in the GPIO block before any new interrupt can be signaled.
Currently, however, there is no good way to do this. No signal is sent via the interrupt object from the client driver to the GPIO driver, however without such a thing, the GPIO driver has no good way to know when the interrupt can be safely unmasked and re-armed.
Additional objects could be introduced, or protocols established, to provide such a signal, but this would then require the consumer of the interrupt to know that this is a virtual interrupt instead of a physical interrupt, undermining the goal of keeping the two scenarios effectively indistinguishable.
Stakeholders
Facilitator: cpu@google.com
Reviewers: + 'maniscalco@google.com' + 'bradenkell@google.com' + 'rudymathu@google.com' + 'cpu@google.com'
Consulted:
Zircon driver framework team.
Socialization:
This proposal was socialized (in Google Doc form) with the Zircon team.
Requirements
"Downstream" drivers that make use of an interrupt object must be agnostic of whether it's a virtual one or not.
Design
Interrupt objects and standard Zircon signals
As noted before, Zircon interrupt objects do not currently use the standard
Zircon signaling system for the purpose of signaling an interrupt request from
hardware. This does not mean that they don't participate in the standard
signaling system at all. Users are free to post async waits to interrupt
objects, or to block threads on them using
zx_object_wait_one
/zx_object_wait_many
. This said, the only way that any of
these wait operations will ever be satisfied is if it was waiting for one of the
8 "user signals" present on practically every Zircon object, and software
asserted that signal. There are no currently defined "system signals" for
interrupt objects for the kernel to signal.
ZX_VIRTUAL_INTERRUPT_UNTRIGGERED
So, let's introduce a new signal (call it ZX_VIRTUAL_INTERRUPT_UNTRIGGERED
),
and use it only for virtual interrupts. Here are the basic properties of the
signal:
- Virtual interrupt objects are not initially triggered after creation, so this signal would be asserted on any newly created virtual interrupt object.
- When a user calls
zx_interrupt_trigger
on a virtual interrupt object, the object enters the triggered state and the signal is de-asserted. - When a user acknowledges a triggered virtual interrupt object, the object enters the untriggered state, and the signal is asserted.
A practical example of usage
Consider a GPIO driver using a physical interrupt I
, and exposing three
virtual interrupts, A
, B
, and C
, all in the same GPIO block.
- At startup, the driver creates and distributes
A
,B
, andC
, then bindsI
to a portP
. - It also creates a dedicated IRQ thread (
T
), configures an appropriate profile, and then blocksT
inP
waiting for a packet. I
fires delivering a packet toP
, unblockingT
in the process.T
examines the GPIO state and discovers that the GPIOs associated withA
andC
are asserted.T
callszx_interrupt_trigger
onA
andC
.A
andC
enter the triggered state, clearingZX_VIRTUAL_INTERRUPT_UNTRIGGERED
in the process.T
callszx_object_async_wait
onA
andC
, associating them with its port and waiting forZX_VIRTUAL_INTERRUPT_UNTRIGGERED
to become asserted on the virtual interrupt objects.T
masks the GPIOs associated withA
andC
.T
callszx_interrupt_ack
onI
, then blocks on the port again, waiting for a new packet.- At some point in time, the driver who consumes
C
wakes up, services the interrupt, and then callszx_interrupt_ack
on its handle toC
, assertingZX_VIRTUAL_INTERRUPT_UNTRIGGERED
as a side effect. - The async wait operation waiting for
C
's untriggered signal is satisfied, so a port packet is generated and delivered toP
, unblockingT
. T
wakes, and notices thatC
has become "untriggered", so it unmasks and re-arms the GPIO interrupt associated withC
using its hardware specific registers.T
simply returns to waiting on its port. We will receive a port packet when eitherI
becomes asserted again, or whenA
'sZX_VIRTUAL_INTERRUPT_UNTRIGGERED
signal becomes asserted. No further action is required right now.
Acknowledging a pending interrupt
For some interrupt objects, it is possible for an interrupt to be triggered, and also have a second IRQ already pending at the time of acknowledgement. While this is most typical with MSI interrupts, it is possible to generate this situation with virtual interrupt objects by simply calling trigger twice in a row before a client manages to service the interrupt.
When a user acknowledges an interrupt which has another IRQ pending, the result
is that the interrupt object effectively fires immediately. A thread ack'ing
via zx_interrupt_wait
will immediately unblock. A thread ack'ing via
zx_interrupt_ack
will cause a new port packet to be immediately generated.
What happens with the UNTRIGGERED
signal? Since the interrupt logically
became un-triggered, then immediately re-triggered, the behavior of the signal
should be a strobing behavior. This is to say that any threads currently
waiting for the UNTRIGGERED
signal should unblock, and any currently posted
async wait operations should be satisfied and produce a port packet. The signal
state immediately following the acknowledgment operation, however, will be
de-asserted.
Implementation
Implementation of this approach is expected to be straightforward and low risk.
Kernel code currently has two dispatchers for interrupt objects,
InterruptDispatcher
and VirtualInterruptDispatcher
, the second being a
subclass of the first. While they share a lot of code, the paths for becoming
signaled are separate. InterruptDispatcher
s become signaled at hard IRQ time
via the InterruptHandler
method, while virtual interrupts are signaled at
zx_interrupt_trigger
ed time via the Trigger
method. It is not legal to call
Trigger
on an InterruptDispatcher
, doing so will result in a BAD_STATE
error. So, triggering is an operation only performed on virtual interrupt
objects, and always during a syscall context where it is legal to obtain the
locks required to manipulate object signal state.
Acknowledging the interrupt happens either when the next call to
zx_interrupt_wait
is made (for non-port bound interrupts), or when
zx_interrupt_ack
is called (for port bound interrupts). Again, as with
zx_interrupt_trigger
, the call is made in a syscall context allowing for
signal manipulation.
Finally, interrupts become destroyed either when the last handle to them is
closed, or when a thread calls zx_interrupt_destroy
. Again, all of this
happens in a context where kernel mutexes may be obtained, and user signals
asserted, making it simple to assert the untriggered signal (unblocking any
waiters) during explicit destruction of the object.
So, implementation of the feature should consist of simply overloading the
VirtualInterruptDispatcher
's behavior during these four operations: trigger,
ack, wait, and destroy. The overloaded versions may obtain the dispatcher lock
before performing the operation itself, updating the signal state as needed at
the end of the operation.
That's it. All of the complicated heavy lifting of unblocking threads and
delivering port packets is already taken care of by UpdateState
and the
existing standard Zircon signaling machinery. Physical interrupt objects will
not expose the signal nor use any of the virtual dispatcher code, sidestepping
the need to synchronize any signal state with operations performed at hard IRQ
time.
Ergonomics
The ergonomic of this are expected to be simple and acceptable. No changes need to be made to existing interrupt consumers, the ack-signaling behavior is completely automatic. Virtual interrupt producers simply have to make use of a signal using existing well established signal handling patterns.
Security considerations
The addition of this feature effectively introduces a new communications channel from the interrupt consumer to the interrupt producer, allowing the producer to know when the consumer has acknowledged a virtual interrupt. This is part of the requirements for a system of multiplexed interrupt to function at all, and is not considered to be a separate or undesirable side channel, and therefore not a security risk.
Privacy considerations
No user data or otherwise potentially sensitive information is being handled by this feature, there are no known privacy consideration which need to be dealt with.
Testing
Testing should be a straightforward extension of existing unit tests. We basically need to add tests to ensure that:
- When a virtual interrupt object is created, the
ZX_VIRTUAL_INTERRUPT_UNTRIGGERED
signal is initially asserted. - The signal is cleared in response to a call to
zx_interrupt_trigger
. - The signal is set in response to a call to
zx_interrupt_ack
. - The signal properly implements "strobe" behavior when it is acknowledged while there exists another pending IRQ.
Documentation
Documentation for virtual interrupt objects will be updated to describe the new signal, state that it is only used with virtual interrupt objects, and summarize the behavior rules stated above (initially asserted, cleared on trigger, set on ack, strobe on pending ack).
Drawbacks, alternatives, and unknowns
There are no know significant drawbacks to this approach. User requirements are satisfied. Implementation and testing should be simple, and low risk.
A number of alternative proposals have been discussed by the Zircon driver team, but all of them possess the same drawback. Driver consumers of Virtual Interrupts must become explicitly aware of the fact that they are using a Virtual interrupt, and then explicitly participate in some user-level defined protocol to signal acknowledgment to the interrupt producer.
While all of these approaches are possible, they seem undesirable. They add a significant extra burden on driver authors to participate in whatever protocol has been defined when a virtual interrupt is being used, and to behave differently in the case that a physical interrupt is being used. Additionally, there is already a pre-existing population of drivers which would need to be updated to use the new protocol, something which could be skipped entirely if the kernel handled the ack signal in a transparent fashion.
No matter what protocol gets implemented, it would be implemented entirely in user-mode, introducing the need to carefully consider trust issues stemming from potentially malicious actors. Making the signaling mechanism mediated by the Zircon kernel helps to simplify the analysis here.
Finally, users of Zircon interrupts already have to acknowledge their interrupts (either implicitly or explicitly). It is difficult imagine how any extra out-of-band ack protocol could end up being as efficient as simply manipulating the signal bits of a kernel object at the time that the existing ack'ing takes place.