Hi Adam,

I truly appreciate your reply.
I actually encountered this issue on real hardware rather than in QEMU.
Could this problem be related to the hardware itself? I'm not quite sure I fully understand.
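
In the meantime, to gather more information the next time this happens on the hardware, I am thinking of temporarily turning the failing check into a diagnostic print in my local tree. This is only a rough debugging sketch based on the snippet quoted below; the exact placement and the printf format are my own guesses, not upstream code:

// Local debugging aid only, not a proposed fix: report the missing
// Thread_vcpu_user bit instead of asserting, so the failing case can be
// inspected on hardware. Placement and message format are guesses.
vcpu_vgic_upcall(unsigned virq)
{
  ......
  if (!(state() & Thread_vcpu_user))
    printf("vgic upcall without Thread_vcpu_user: virq=%u state=%lx\n",
           virq, (unsigned long)state());
  // assert(state() & Thread_vcpu_user);
  ......
}

The idea is just to see which state bits are actually set when the condition occurs, without stopping the machine.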

Best regards,
Stephen.yang




On 2025-09-30 15:50:22, "Adam Lackorzynski" <adam@l4re.org> wrote:
>Hi Stephen,
>
>ok, thanks, that's tricky indeed.
>
>In case you are doing this with QEMU, could you please make sure you
>have the following change in your QEMU?: https://lists.gnu.org/archive/html/qemu-devel/2024-09/msg02207.html
>
>Or do you see this on hardware?
>
>
>Thanks,
>Adam
>
>On Tue Sep 30, 2025 at 11:14:57 +0800, yy18513676366 wrote:
>> Hi Adam,
>>
>>
>> Thank you very much for your reply — it really gave me some hope.
>>
>>
>> This issue is indeed difficult to reproduce reliably, which has been one of the main challenges during my debugging.
>> So far, I have found that increasing the vtimer interrupt frequency, while keeping the traditional handling mode (i.e., without direct injection),
>> makes the problem significantly easier to reproduce.
>>
>>
>> The relevant changes are as follows.
>> 1. In this setup, the vtimer is adjusted from roughly one trigger per millisecond to approximately one trigger per microsecond,
>> and the system remains stable and functional:
>>
>>
>> diff --git a/src/kern/arm/timer-arm-generic.cpp b/src/kern/arm/timer-arm-generic.cpp
>> index a040cf46..b4cbbceb 100644
>> --- a/src/kern/arm/timer-arm-generic.cpp
>> +++ b/src/kern/arm/timer-arm-generic.cpp
>> @@ -64,7 +64,8 @@ void Timer::init(Cpu_number cpu)
>> if (cpu == Cpu_number::boot_cpu())
>> {
>> _freq0 = frequency();
>> - _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000;
>> + //_interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000;
>> + _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000000;
>> printf("ARM generic timer: freq=%ld interval=%ld cnt=%lld\n",
>> _freq0, _interval, Gtimer::counter());
>> assert(_freq0);
>>
>>
>> 2. In addition, I selected the mode where interrupts are not directly injected:
>> diff --git a/src/Kconfig b/src/Kconfig
>> index 4391c996..55deeb1c 100644
>> --- a/src/Kconfig
>> +++ b/src/Kconfig
>> @@ -367,7 +367,7 @@ config IOMMU
>> config IRQ_DIRECT_INJECT
>> bool "Support direct interrupt forwarding to guests"
>> depends on CPU_VIRT && HAS_IRQ_DIRECT_INJECT_OPTION
>> - default y
>> + default n
>> help
>> Adds support in the kernel to allow the VMM to let Fiasco directly
>> forward hardware interrupts to a guest. This enables just the
>>
>> At the moment, this is the only way I have found that can noticeably increase the reproduction rate.
>> Once again, thank you for your valuable time and feedback!
>>
>> Best regards,
>> Stephen.yang
>>
>>
>>
>>
>> At 2025-09-29 00:11:41, "Adam Lackorzynski" <adam@l4re.org> wrote:
>> >Hi,
>> >
>> >On Wed Sep 17, 2025 at 13:57:43 +0800, yy18513676366 wrote:
>> >> When running a virtual machine, I encounter an assertion failure after the VM
>> >> has been up for some time. The kernel crashes in src/kern/arm/
>> >> thread-arm-hyp.cpp, specifically in the function vcpu_vgic_upcall(unsigned
>> >> virq):
>> >>
>> >> vcpu_vgic_upcall(unsigned virq)
>> >> {
>> >> ......
>> >> assert(state() & Thread_vcpu_user);
>> >> ......
>> >> }
>> >>
>> >> Based on source code inspection and preliminary debugging, the problem seems to
>> >> be related to the management of the Thread_vcpu_user state.
>> >>
>> >> 1. Under normal circumstances, the vcpu_resume path (transitioning from the
>> >> kernel back to the guest OS) updates the vCPU state to include
>> >> Thread_vcpu_user. However, if an interrupt is delivered during this transition
>> >> while the receiving side is not yet ready, the vCPU frequently returns to the
>> >> kernel (via vcpu_return_to_kernel) and subsequently processes the interrupt
>> >> through guest_irq in vcpu_entries. In this situation, the expected update of
>> >> Thread_vcpu_user may not yet have taken place, which seems to result in the assert
>> >> being triggered when a VGIC interrupt is involved.
>> >>
>> >> 2. A similar condition seems to occur in the vcpu_async_ipc path. At the end
>> >> of IPC handling, this function explicitly clears the Thread_vcpu_user flag. If
>> >> a VGIC interrupt is delivered during this phase, the absence of the expected
>> >> Thread_vcpu_user state seems to lead to the same assertion failure.
>> >>
>> >> I would like to confirm if the two points above are correct, and what steps I
>> >> should take next to further debug this issue.
>> >
>> >Thanks for reporting. At least the description sounds reasonable to me.
>> >
>> >Do you have a good way of reliably reproducing this situation?
>> >
>> >> In addition, I have some assumptions I would like to confirm:
>> >>
>> >> First, for IPC between non-vcpu threads, the L4 microkernel handles message
>> >> delivery and scheduling (wake/schedule) directly, without requiring any
>> >> forwarding through uvmm. Similarly, interrupts bound via the interrupt
>> >> controller (ICU) to a non-vcpu thread or handler are also managed by the kernel
>> >> and scheduler, and therefore do not necessarily involve uvmm.
>> >
>> >IPCs between threads are handled by the microkernel. vcpu-thread vs.
>> >non-vcpu-thread is just making the difference regarding how it is
>> >delivered to the thread. For a non-vcpu thread the receiver has to wait
>> >in IPC to get it, in vcpu mode the IPC is received by causing a vcpu
>> >event and bringing the vcpu to its entry. This also works without
>> >virtualization (note that vcpus also work without hw-virtualization).
>> >For interrupts it is the same. For non-vcpu threads they have to block
>> >in IPC to get an interrupt, or for vcpu threads, they will be brought to
>> >their entry.
>> >
>> >> Second, passthrough interrupts, when not delivered in direct-injection mode,
>> >> are routed to uvmm for handling if they are bound to a vCPU. Likewise, services
>> >> provided by uvmm (such as virq) are also bound to a vCPU and therefore require
>> >> forwarding through uvmm.
>> >
>> >Yes. Direct injection will only happen when the vcpu is running.
>> >
>> >> There seems to have been a similar question in the past, but it does not seem
>> >> to have been resolved.
>> >>
>> >> Re: Assertion failure error in kernel vgic interrupt processing - l4-hackers -
>> >> OS Site
>> >>
>> >> I wonder if my questions are related to that post, and if any solutions exist.
>> >
>> >Thanks, we need to work on it. Reproducing this situation on our side
>> >would be very valuable.
>> >
>> >
>> >Thanks, Adam
>_______________________________________________
>l4-hackers mailing list -- l4-hackers@os.inf.tu-dresden.de
>To unsubscribe send an email to l4-hackers-leave@os.inf.tu-dresden.de