Hi Adam,
We’re using the Radxa SiRider S1 development board, which is based on the Rockchip RK3588 SoC. You can find more details here: https://arace.tech/products/radxa-sirider-s1. The CPU consists of 4 Cortex-A76 cores and 2 Cortex-A55 cores.
Best, Stephen.yang
At 2025-10-06 06:00:44, "Adam Lackorzynski" adam@l4re.org wrote:
Hi Stephen,
I doubt this is hardware, it seldom is. Would you be able to share which Arm core it is? I'll try to reproduce here; your indication of increasing the timer frequency is a good hint, and knowing which core, or at least which category of core, can be helpful.
BR, Adam
On Thu Oct 02, 2025 at 13:02:50 +0800, yy18513676366 wrote:
Hi Adam,
I truly appreciate your reply. I actually encountered this issue on real hardware rather than QEMU. May I ask if this problem could be related to the hardware itself? I’m not quite sure I fully understand.
Best regards, Stephen.yang
On 2025-09-30 15:50:22, "Adam Lackorzynski" adam@l4re.org wrote:
Hi Stephen,
ok, thanks, that's tricky indeed.
In case you are doing this with QEMU, could you please make sure you have the following change in your QEMU?: https://lists.gnu.org/archive/html/qemu-devel/2024-09/msg02207.html
Or do you see this on hardware?
Thanks, Adam
On Tue Sep 30, 2025 at 11:14:57 +0800, yy18513676366 wrote:
Hi Adam,
Thank you very much for your reply — it really gave me some hope.
This issue is indeed difficult to reproduce reliably, which has been one of the main challenges during my debugging. So far, I have found that increasing the vtimer interrupt frequency, while keeping the traditional handling mode (i.e., without direct injection), makes the problem significantly easier to reproduce.
The relevant changes are as follows.

1. In this setup, the vtimer is adjusted from roughly one trigger per millisecond to approximately one trigger per microsecond, and the system remains stable and functional:

diff --git a/src/kern/arm/timer-arm-generic.cpp b/src/kern/arm/timer-arm-generic.cpp
index a040cf46..b4cbbceb 100644
--- a/src/kern/arm/timer-arm-generic.cpp
+++ b/src/kern/arm/timer-arm-generic.cpp
@@ -64,7 +64,8 @@ void Timer::init(Cpu_number cpu)
   if (cpu == Cpu_number::boot_cpu())
     {
       _freq0 = frequency();
-      _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000;
+      //_interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000;
+      _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / 1000000000;
       printf("ARM generic timer: freq=%ld interval=%ld cnt=%lld\n", _freq0, _interval, Gtimer::counter());
       assert(_freq0);

2. In addition, I selected the mode where interrupts are not directly injected:

diff --git a/src/Kconfig b/src/Kconfig
index 4391c996..55deeb1c 100644
--- a/src/Kconfig
+++ b/src/Kconfig
@@ -367,7 +367,7 @@ config IOMMU
 config IRQ_DIRECT_INJECT
 	bool "Support direct interrupt forwarding to guests"
 	depends on CPU_VIRT && HAS_IRQ_DIRECT_INJECT_OPTION
-	default y
+	default n
 	help
 	  Adds support in the kernel to allow the VMM to let Fiasco directly
 	  forward hardware interrupts to a guest. This enables just the [...]

At the moment, this is the only way I have found that can noticeably increase the reproduction rate. Once again, thank you for your valuable time and feedback!
Best regards, Stephen.yang
At 2025-09-29 00:11:41, "Adam Lackorzynski" adam@l4re.org wrote:
Hi,
On Wed Sep 17, 2025 at 13:57:43 +0800, yy18513676366 wrote:
When running a virtual machine, I encounter an assertion failure after the VM has been up for some time. The kernel crashes in src/kern/arm/thread-arm-hyp.cpp, specifically in the function vcpu_vgic_upcall(unsigned virq):

vcpu_vgic_upcall(unsigned virq)
{
  ......
  assert(state() & Thread_vcpu_user);
  ......
}
Based on source code inspection and preliminary debugging, the problem seems to be related to the management of the Thread_vcpu_user state.
1. Under normal circumstances, the vcpu_resume path (transitioning from the kernel back to the guest OS) updates the vCPU state to include Thread_vcpu_user. However, if an interrupt is delivered during this transition while the receiving side is not yet ready, the vCPU frequently returns to the kernel (via vcpu_return_to_kernel) and subsequently processes the interrupt through guest_irq in vcpu_entries. In this situation, the expected update of Thread_vcpu_user may not yet have taken place, which seems to result in the assert being triggered when a VGIC interrupt is involved.
2. A similar condition seems to occur in the vcpu_async_ipc path. At the end of IPC handling, this function explicitly clears the Thread_vcpu_user flag. If a VGIC interrupt is delivered during this phase, the absence of the expected Thread_vcpu_user state seems to lead to the same assertion failure.
I would like to confirm if the two points above are correct, and what steps I should take next to further debug this issue.
Thanks for reporting. At least the description sounds reasonable to me.
Do you have a good way of reliably reproducing this situation?
In addition, I have some assumptions I would like to confirm:
First, for IPC between non-vcpu threads, the L4 microkernel handles message delivery and scheduling (wake/schedule) directly, without requiring any forwarding through uvmm. Similarly, interrupts bound via the interrupt controller (ICU) to a non-vcpu thread or handler are also managed by the kernel and scheduler, and therefore do not necessarily involve uvmm.
IPCs between threads are handled by the microkernel; whether the receiver is a vcpu thread or a non-vcpu thread only makes a difference in how the message is delivered. A non-vcpu thread has to wait in IPC to receive it; in vcpu mode, the IPC is received by causing a vcpu event and bringing the vcpu to its entry. This also works without virtualization (note that vcpus also work without hw-virtualization). For interrupts it is the same: non-vcpu threads have to block in IPC to get an interrupt, while vcpu threads are brought to their entry.
Second, passthrough interrupts, when not delivered in direct-injection mode, are routed to uvmm for handling if they are bound to a vCPU. Likewise, services provided by uvmm (such as virq) are also bound to a vCPU and therefore require forwarding through uvmm.
Yes. Direct injection will only happen when the vcpu is running.
There seems to have been a similar question in the past, but it does not seem to have been resolved.
Re: Assertion failure error in kernel vgic interrupt processing - l4-hackers - OS Site
I wonder if my questions are related to that post, and if any solutions exist.
Thanks, we need to work on it. Reproducing this situation on our side would be very valuable.
Thanks, Adam
l4-hackers mailing list -- l4-hackers@os.inf.tu-dresden.de To unsubscribe send an email to l4-hackers-leave@os.inf.tu-dresden.de