Hello,
After I ported DDE Linux26 to the Hurd, I tested it with a NIC driver, pcnet32, and saw a problem: the device interrupt of the NIC is sometimes masked and then never unmasked.
Whenever the pcnet32 driver receives an interrupt, it masks device interrupts, calls __netif_rx_schedule(), and lets the softirq handle the interrupt. __netif_rx_schedule() should set NET_RX_SOFTIRQ, but it can only do that while local IRQs are disabled (by calling the local_irq_save macro). Linux disables IRQs with the cli instruction. Obviously DDE cannot do that, but the implementation of local_irq_save in DDE is quite strange. It seems that it eventually calls raw_local_irq_disable(), which is implemented in linux26/lib/src/arch/l4/cli_sti.c. How can increasing _refcnt have anything to do with disabling IRQs?
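To make the question concrete, here is a minimal sketch of what a counter-only emulation could look like. All names ending in _sim and the irq_disable_depth variable are hypothetical, and this is not the code in cli_sti.c; it only illustrates that such a counter tracks nesting of save/restore pairs but does nothing by itself to keep another thread from running in between.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical nesting counter; NOT the actual contents of cli_sti.c. */
static int irq_disable_depth;

/* Remember the current depth as "flags" and go one level deeper. */
unsigned long local_irq_save_sim(void)
{
    unsigned long flags = (unsigned long)irq_disable_depth;
    irq_disable_depth++;
    return flags;
}

/* Return to the depth that was current when flags was saved. */
void local_irq_restore_sim(unsigned long flags)
{
    assert(flags <= (unsigned long)irq_disable_depth);
    irq_disable_depth = (int)flags;
}

/* "Disabled" only means the counter is non-zero; no thread is blocked. */
bool irqs_disabled_sim(void)
{
    return irq_disable_depth > 0;
}
```

Nested save/restore pairs behave as expected in this sketch, but a second thread is free to run at any point, so the counter alone gives no mutual exclusion.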
Without disabling IRQs, there is a race condition between the interrupt handler and the softirq handler. When I run pcnet32 with my ported DDE Linux26 for a long time, I sometimes see that the softirq fails to be scheduled after the driver receives a hard IRQ.
I don't know how the Linux drivers can work with DDE Linux on L4. raw_local_irq_disable() apparently has problems, if I am reading the right code.
Best regards, Zheng Da
Hi,
The DDE models an SMP-like setup, in which each ddekit_thread is supposed to run on a dedicated CPU. For each IRQ, there is a dedicated ddekit_thread. As far as I understand it, disabling hard IRQs in any ddekit_thread other than the IRQ handler threads has no effect, because those threads won't receive IRQs anyway. For an IRQ handler thread, it also has no effect, because it only runs when it is handling an interrupt, and won't receive any further IRQs while handling one.
Best regards,
Dirk.
Hi,
On 10-2-26 9:17 PM, Dirk Vogt wrote:
> Hi,
> The DDE models an SMP-like setup, in which each ddekit_thread is supposed to run on a dedicated CPU. For each IRQ, there is a dedicated ddekit_thread. As far as I understand it, disabling hard IRQs in any ddekit_thread other than the IRQ handler threads has no effect, because those threads won't receive IRQs anyway. For an IRQ handler thread, it also has no effect, because it only runs when it is handling an interrupt, and won't receive any further IRQs while handling one.
I was talking about synchronization between hard IRQ handler and softirq handler.
When the driver receives a hard IRQ, it tries to raise softirq.

    local_irq_save(flags);
    list_add_tail(&n->poll_list, &__get_cpu_var(softnet_data).poll_list);
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    local_irq_restore(flags);

The softirq thread, on the other hand, does

    local_irq_save(flags);
    if (local_softirq_pending())
        __do_softirq();
    local_irq_restore(flags);

In __do_softirq, it does

    unsigned long pending = local_softirq_pending();
    /* reset softirq count */
    set_softirq_pending(0);
In Linux, local_irq_save() disables IRQs on the local processor, so if the hard IRQ handler tries to raise a softirq, it is guaranteed that the softirq thread will not be scheduled to run, and vice versa. Since DDE doesn't disable IRQs on the local processor, it is possible that __do_softirq reads the pending softirqs, then the CPU switches to the hard IRQ handler, which raises a softirq, and then the CPU switches back to the softirq thread, which runs set_softirq_pending(0). The newly raised softirq is simply lost.
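The interleaving described above can be replayed deterministically in a small single-threaded sketch. Everything here is a stand-in made up for illustration (pending_mask, raise_softirq_sim, net_rx_lost, and the bit numbers); it is not DDE or Linux code. It assumes some other softirq is already pending when the softirq thread takes its snapshot.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit numbers; only their distinctness matters here. */
enum { TIMER_SOFTIRQ_SIM = 1, NET_RX_SOFTIRQ_SIM = 3 };

/* Stand-in for the per-CPU pending mask (hypothetical, not DDE code). */
static unsigned long pending_mask;

static void raise_softirq_sim(unsigned int nr)
{
    assert(nr < 32);
    pending_mask |= 1UL << nr;
}

/*
 * Replays one interleaving and reports whether the NET_RX softirq was lost.
 * With exclusive == true, the hard IRQ handler's raise is deferred until the
 * softirq thread has finished its read-and-reset pair, which is the ordering
 * that local_irq_save() is relied upon to provide.
 */
bool net_rx_lost(bool exclusive)
{
    unsigned long snapshot;

    pending_mask = 0;
    raise_softirq_sim(TIMER_SOFTIRQ_SIM);       /* something is already pending */

    snapshot = pending_mask;                    /* pending = local_softirq_pending() */
    if (!exclusive)
        raise_softirq_sim(NET_RX_SOFTIRQ_SIM);  /* hard IRQ thread runs in between */
    pending_mask = 0;                           /* set_softirq_pending(0) */
    if (exclusive)
        raise_softirq_sim(NET_RX_SOFTIRQ_SIM);  /* hard IRQ thread runs afterwards */

    /* Lost: neither in the snapshot being handled nor still pending. */
    return !((snapshot | pending_mask) & (1UL << NET_RX_SOFTIRQ_SIM));
}
```

In this sketch net_rx_lost(false) reports a lost softirq, while net_rx_lost(true) does not, because the raise can no longer fall between the read and the reset.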
Unless I am mistaken, I think it is what causes my problem.
Best regards, Zheng Da
On Fri, 2010-02-26 at 22:08 +0800, Da Zheng wrote:
> In Linux, local_irq_save() disables IRQs on the local processor, so if the hard IRQ handler tries to raise a softirq, it is guaranteed that the softirq thread will not be scheduled to run, and vice versa.
How would that work on an SMP machine?
Correct me if I am wrong, but I think even on native Linux the hard IRQ handler and the softirq handler could run at the same time (on two different processors), as only *local* interrupts are disabled.
[...] 2.3.43 introduced softirqs, and re-implemented the (now deprecated) BHs underneath them. Softirqs are fully-SMP versions of BHs: they can run on as many CPUs at once as required. This means they need to deal with any races in shared data using their own locks. [...]
[0] http://people.netfilter.org/rusty/unreliable-guides/kernel-hacking/basics-so...
Dirk Vogt wrote:
> On Fri, 2010-02-26 at 22:08 +0800, Da Zheng wrote:
>> In Linux, local_irq_save() disables IRQs on the local processor, so if the hard IRQ handler tries to raise a softirq, it is guaranteed that the softirq thread will not be scheduled to run, and vice versa.
> How would that work on an SMP machine?
> Correct me if I am wrong, but I think even on native Linux the hard IRQ handler and the softirq handler could run at the same time (on two different processors), as only *local* interrupts are disabled.
Christian just pointed out that the raised softirq is required to run on the CPU the Hard-IRQ was raised on, although I can't find a resource on that right now.
So, if this is the case, Da Zheng might have found a lingering bug. Thanks, we'll check that.
Bjoern
On 10-2-26 11:50 PM, Dirk Vogt wrote:
> On Fri, 2010-02-26 at 22:08 +0800, Da Zheng wrote:
>> In Linux, local_irq_save() disables IRQs on the local processor, so if the hard IRQ handler tries to raise a softirq, it is guaranteed that the softirq thread will not be scheduled to run, and vice versa.
> How would that work on an SMP machine?
> Correct me if I am wrong, but I think even on native Linux the hard IRQ handler and the softirq handler could run at the same time (on two different processors), as only *local* interrupts are disabled.
> [...] 2.3.43 introduced softirqs, and re-implemented the (now deprecated) BHs underneath them. Softirqs are fully-SMP versions of BHs: they can run on as many CPUs at once as required. This means they need to deal with any races in shared data using their own locks. [...]
> [0] http://people.netfilter.org/rusty/unreliable-guides/kernel-hacking/basics-so...
It seems that I missed a letter.
The same type of softirq can run on more than one processor at the same time, but when a softirq is activated, it should be executed on the CPU that activated it. That's why each CPU has a 32-bit mask describing the softirqs pending on that CPU. Please check section 4.7 of the book Understanding the Linux Kernel, third edition.
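As a rough illustration of the per-CPU mask, the following sketch keeps one 32-bit word per simulated CPU. The names (NR_CPUS_SIM, softirq_pending_sim, raise_softirq_on, take_pending_on) are made up for this example and are not the kernel's own; the point is only that a softirq raised on one CPU is not visible in another CPU's mask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SIM 2   /* hypothetical CPU count for the example */

/* One 32-bit pending mask per simulated CPU. */
static uint32_t softirq_pending_sim[NR_CPUS_SIM];

/* Mark softirq number nr as pending on the given CPU only. */
void raise_softirq_on(unsigned int cpu, unsigned int nr)
{
    assert(cpu < NR_CPUS_SIM && nr < 32);
    softirq_pending_sim[cpu] |= UINT32_C(1) << nr;
}

/* Return and clear one CPU's mask, as that CPU's softirq run would. */
uint32_t take_pending_on(unsigned int cpu)
{
    uint32_t pending;

    assert(cpu < NR_CPUS_SIM);
    pending = softirq_pending_sim[cpu];
    softirq_pending_sim[cpu] = 0;
    return pending;
}
```

If a softirq is raised on CPU 0, take_pending_on(1) still returns 0 in this sketch, so only CPU 0 ends up running the handler.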
Best regards, Zheng Da
l4-hackers@os.inf.tu-dresden.de