Hi,
On Wed Sep 17, 2025 at 13:57:43 +0800, yy18513676366 wrote:
When running a virtual machine, I encounter an assertion failure after the VM has been up for some time. The kernel crashes in src/kern/arm/thread-arm-hyp.cpp, specifically in the function vcpu_vgic_upcall(unsigned virq):
vcpu_vgic_upcall(unsigned virq)
{
  ......
  assert(state() & Thread_vcpu_user);
  ......
}
Based on source code inspection and preliminary debugging, the problem seems to be related to the management of the Thread_vcpu_user state.
1 Under normal circumstances, the vcpu_resume path (transitioning from the kernel back to the guest OS) updates the vCPU state to include Thread_vcpu_user. However, if an interrupt is delivered during this transition while the receiving side is not yet ready, the vCPU frequently returns to the kernel (via vcpu_return_to_kernel) and subsequently processes the interrupt through guest_irq in vcpu_entries. In this situation, the expected update of Thread_vcpu_user may not yet have taken place, which seems to result in the assertion being triggered when a VGIC interrupt is involved.
2 A similar condition seems to occur in the vcpu_async_ipc path. At the end of IPC handling, this function explicitly clears the Thread_vcpu_user flag. If a VGIC interrupt is delivered during this phase, the absence of the expected Thread_vcpu_user state seems to lead to the same assertion failure.
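To make the two suspected orderings concrete, here is a minimal toy model in C++. It is not Fiasco code: the struct Toy_vcpu_thread, its helper functions, and the bit value chosen for Thread_vcpu_user are all hypothetical, and whether the kernel really follows these orderings is exactly the open question. The sketch only shows that a check of the form state() & Thread_vcpu_user fails whenever the upcall runs before the flag is set or after it has been cleared.

```cpp
#include <cstdint>

// Toy model only. Names mirror the discussion, but nothing here is kernel code.
// The bit position is an assumption made for illustration.
constexpr std::uint32_t Thread_vcpu_user = 1u << 0;

struct Toy_vcpu_thread
{
  std::uint32_t flags = 0;

  std::uint32_t state() const { return flags; }

  // Models the point in vcpu_resume where the flag is set
  // (assumed to happen at the end of the transition to the guest).
  void finish_resume() { flags |= Thread_vcpu_user; }

  // Models the end of vcpu_async_ipc, which is described as clearing the flag.
  void finish_async_ipc() { flags &= ~Thread_vcpu_user; }

  // Models the condition asserted in vcpu_vgic_upcall.
  // Returns true if the assertion would hold, false if it would fire.
  bool vgic_upcall_precondition_holds() const
  { return (state() & Thread_vcpu_user) != 0; }
};

// Point 1: the interrupt is handled before the resume path has set the flag.
inline bool scenario_irq_during_resume()
{
  Toy_vcpu_thread t;
  bool ok = t.vgic_upcall_precondition_holds(); // upcall runs first
  t.finish_resume();                            // flag is set too late
  return ok;
}

// Point 2: the interrupt is handled after async IPC handling cleared the flag.
inline bool scenario_irq_after_async_ipc()
{
  Toy_vcpu_thread t;
  t.finish_resume();
  t.finish_async_ipc();
  return t.vgic_upcall_precondition_holds();
}

// Expected ordering: the interrupt arrives while the guest is running.
inline bool scenario_irq_while_guest_runs()
{
  Toy_vcpu_thread t;
  t.finish_resume();
  return t.vgic_upcall_precondition_holds();
}
```

In this model the first two scenarios return false (the assertion would fire) and the third returns true. If tracing the real flag transitions shows one of the first two orderings, that would support the corresponding point above.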
I would like to confirm if the two points above are correct, and what steps I should take next to further debug this issue.
Thanks for reporting. At least the description sounds reasonable to me.
Do you have a good way of reliably reproducing this situation?
In addition, I have some assumptions I would like to confirm:
First, for IPC between non-vcpu threads, the L4 microkernel handles message delivery and scheduling (wake/schedule) directly, without requiring any forwarding through uvmm. Similarly, interrupts bound via the interrupt controller (ICU) to a non-vcpu thread or handler are also managed by the kernel and scheduler, and therefore do not necessarily involve uvmm.
IPCs between threads are handled by the microkernel. Whether the receiver is a vcpu thread or a non-vcpu thread only makes a difference in how the IPC is delivered to it. A non-vcpu thread has to wait in IPC to receive it; in vcpu mode, the IPC is received by causing a vcpu event and bringing the vcpu to its entry. This also works without virtualization (note that vcpus also work without hw-virtualization). For interrupts it is the same: non-vcpu threads have to block in IPC to get an interrupt, while vcpu threads are brought to their entry.
Second, passthrough interrupts, when not delivered in direct-injection mode, are routed to uvmm for handling if they are bound to a vCPU. Likewise, services provided by uvmm (such as virq) are also bound to a vCPU and therefore require forwarding through uvmm.
Yes. Direct injection will only happen when the vcpu is running.
There seems to have been a similar question in the past, but it does not seem to have been resolved.
Re: Assertion failure error in kernel vgic interrupt processing - l4-hackers - OS Site
I wonder if my questions are related to that post, and if any solutions exist.
Thanks, we need to work on it. Reproducing this situation on our side would be very valuable.
Thanks, Adam