Hi l4-hackers,


When running a virtual machine, I encounter an assertion failure after the VM has been up for some time. The kernel crashes in src/kern/arm/thread-arm-hyp.cpp, specifically in the function vcpu_vgic_upcall(unsigned virq):

 

vcpu_vgic_upcall(unsigned virq)
{
   ......
   assert(state() & Thread_vcpu_user);
   ......
}

 

Based on source code inspection and preliminary debugging, the problem seems to be related to the management of the Thread_vcpu_user state.

 

(1) Under normal circumstances, the vcpu_resume path (transitioning from the kernel back to the guest OS) updates the vCPU state to include Thread_vcpu_user. However, if an interrupt is delivered during this transition while the receiving side is not yet ready, the vCPU frequently returns to the kernel (via vcpu_return_to_kernel) and subsequently processes the interrupt through guest_irq in vcpu_entries. In this situation, the expected update of Thread_vcpu_user may not yet have taken place, which seems to result in the assertion being triggered when a VGIC interrupt is involved.

 

(2) A similar condition seems to occur in the vcpu_async_ipc path. At the end of IPC handling, this function explicitly clears the Thread_vcpu_user flag. If a VGIC interrupt is delivered during this phase, Thread_vcpu_user is not set when vcpu_vgic_upcall runs, which seems to lead to the same assertion failure.
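
To make the two suspected orderings in (1) and (2) concrete, here is a minimal user-space model of the flag handling as I understand it. It is only a sketch and not kernel code: Model_thread, its member functions, the three case functions, and the bit value chosen for Thread_vcpu_user are all hypothetical names and values made up for illustration, and the model assumes the flag is set only at the end of the resume path and cleared at the end of async IPC handling.

```cpp
// Hypothetical stand-in for the kernel's thread state bit. The value is
// invented for this model and is not the kernel's actual constant.
enum : unsigned { Thread_vcpu_user = 1u << 0 };

// Toy model of a vCPU thread that tracks only the one flag of interest.
struct Model_thread
{
  unsigned state = 0;

  // Models the end of the vcpu_resume path: the flag is set when the
  // thread enters the guest.
  void resume_to_guest() { state |= Thread_vcpu_user; }

  // Models the end of IPC handling as described in (2): the flag is cleared.
  void async_ipc_done() { state &= ~Thread_vcpu_user; }

  // Models the condition asserted in vcpu_vgic_upcall. It returns whether
  // the assertion would hold instead of aborting.
  bool vgic_upcall_precondition_holds() const
  { return (state & Thread_vcpu_user) != 0; }
};

// Expected ordering: resume completes first, then the VGIC upcall runs.
bool normal_case()
{
  Model_thread t;
  t.resume_to_guest();
  return t.vgic_upcall_precondition_holds();
}

// Case (1): the upcall runs before the resume path has set the flag.
bool case1_irq_before_resume()
{
  Model_thread t;
  return t.vgic_upcall_precondition_holds();
}

// Case (2): the upcall runs after async IPC handling cleared the flag.
bool case2_irq_after_async_ipc()
{
  Model_thread t;
  t.resume_to_guest();
  t.async_ipc_done();
  return t.vgic_upcall_precondition_holds();
}
```

In this model, only normal_case() satisfies the asserted condition. The other two orderings leave the flag clear, which matches the failure I observe, although the real kernel paths may of course differ from this simplification.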

 

I would like to confirm whether the two points above are correct, and to ask what steps I should take next to debug this issue further.

 

In addition, I have some assumptions I would like to confirm:

First, for IPC between non-vcpu threads, the L4 microkernel handles message delivery and scheduling (wake/schedule) directly, without requiring any forwarding through uvmm. Similarly, interrupts bound via the interrupt controller (ICU) to a non-vcpu thread or handler are also managed by the kernel and scheduler, and therefore do not necessarily involve uvmm.

 

Second, passthrough interrupts, when not delivered in direct-injection mode, are routed to uvmm for handling if they are bound to a vCPU. Likewise, services provided by uvmm (such as virq) are also bound to a vCPU and therefore require forwarding through uvmm.

 

There seems to have been a similar question in the past, but it does not seem to have been resolved.

Re: Assertion failure error in kernel vgic interrupt processing - l4-hackers - OS Site

I wonder if my questions are related to that post, and if any solutions exist.

 

Thanks!