Hi,
On 10-2-28 8:57 PM, Christian Helmuth wrote:
You're right. There is indeed a problem, which we haven't stumbled over yet. That is because the irq handler runs at a higher priority in our ddekit implementation. However, on a multiprocessor machine this
I don't understand. How can using a higher priority in the irq handler solve or avoid the problem, unless you mean running the softirq thread at a higher priority? Otherwise, the softirq thread can still be preempted and lose a raised softirq, as I mentioned in the previous letter.
You're right, priorities would not help on a multi-processor platform. But these scenarios are currently not supported anyway, so let's stay with one CPU. In this case DDE Linux works like the good old BSD4.4, in which the kernel execution contexts were assigned to priority levels, with the interrupts on the higher levels and the user contexts on the lower ones. This scheme guarantees that interrupt contexts always finish their execution before other lower-priority contexts are executed. Linux adopted this in a way.
I don't think it can work on one CPU either. The problem arises when the softirq thread is preempted by the interrupt thread. If you give the interrupt thread a higher priority, it is even more likely that the softirq thread gets preempted. Softirq queue manipulation must be atomic.
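To illustrate what I mean by atomic manipulation, here is a minimal sketch. It assumes the pending softirqs are kept as a bitmask, which may not match how DDE actually stores them, and the names pending_softirqs, raise_softirq_sketch() and take_pending_softirqs() are made up for this example. Raising a softirq is a single atomic OR, and the softirq thread fetches and clears all pending bits in a single atomic exchange, so a preemption between the two cannot lose a raised bit.

```c
#include <stdatomic.h>

/* Hypothetical pending mask: bit n set means softirq n has been raised. */
static atomic_uint pending_softirqs;

/* Called from the interrupt thread: mark softirq nr as raised. */
static void raise_softirq_sketch(unsigned nr)
{
    atomic_fetch_or(&pending_softirqs, 1u << nr);
}

/* Called from the softirq thread: fetch and clear all pending bits
 * in one indivisible step and return the bits that were set. */
static unsigned take_pending_softirqs(void)
{
    return atomic_exchange(&pending_softirqs, 0u);
}
```

With a plain read followed by a separate write of zero, a softirq raised between the two steps would be overwritten and lost, which is the race I described.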
I am thinking of using a lock to simulate cli/sti. When local_irq_disable() or local_irq_save() is called, we take the lock; when local_irq_enable() or local_irq_restore() is called, we release it. We can even provide nested locking support if local_irq_disable() or local_irq_save() is allowed to be called multiple times in the same thread.
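A rough sketch of this idea, assuming POSIX threads and C11 thread-local storage. The disable side takes the lock, as cli would, and the enable side releases it. All names here (irq_lock, irq_depth, sketch_irq_*) are hypothetical and not part of DDE or Linux. Treating enable as "leave one nesting level" is also an assumption of this sketch; the real local_irq_enable() enables unconditionally.

```c
#include <pthread.h>

/* Hypothetical global lock standing in for the CPU interrupt flag. */
static pthread_mutex_t irq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread nesting depth: 0 means "interrupts enabled" for this thread. */
static _Thread_local unsigned irq_depth;

/* Emulates cli: only the outermost call actually takes the lock. */
static void sketch_irq_disable(void)
{
    if (irq_depth++ == 0)
        pthread_mutex_lock(&irq_lock);
}

/* Emulates sti: the lock is released when the outermost level is left. */
static void sketch_irq_enable(void)
{
    if (irq_depth > 0 && --irq_depth == 0)
        pthread_mutex_unlock(&irq_lock);
}

/* Emulates local_irq_save(): remember the current depth, then disable. */
static unsigned long sketch_irq_save(void)
{
    unsigned long flags = irq_depth;
    sketch_irq_disable();
    return flags;
}

/* Emulates local_irq_restore(): unwind back to the saved depth. */
static void sketch_irq_restore(unsigned long flags)
{
    while (irq_depth > flags)
        sketch_irq_enable();
}
```

The interrupt thread would have to take the same lock around each handler invocation for this to have any effect.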
I think this addresses our issue in a way: Prevent handling of the just registered softirq until the interrupt handler returns.
What if the softirq thread is running when an interrupt comes? Since we cannot disable the hard irq, it is very likely to happen.
Afterwards, the registered softirq can be handled from the standard softirq thread as long as the queue manipulation is atomic. The issue is now near the top of our agenda for DDE.
If you only consider handling the softirq queue, I think it's easy: simply use a lock to protect it explicitly. But that cannot solve the problem completely, as there are probably many race conditions in other places. I implemented raw_local_irq_disable() and raw_local_irq_enable() with a lock, but there is still a serious problem. Because of spin_lock_irqsave(), there can be deadlocks. One example is in pcnet32_watchdog() and pcnet32_interrupt(). So I think we can just let spin_lock_irqsave() become spin_lock(). After all, the interrupt handling doesn't run in the real interrupt context any more, so I think there is no reason to do spin_lock_irqsave() or spin_lock_irq().
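What I have in mind for the last point looks roughly like the sketch below. It assumes the spinlock is backed by a pthread mutex in the user-level environment; the sketch_ names are hypothetical and only stand in for the real Linux macros. The irqsave variant takes the lock and nothing else, and flags becomes a dummy value because there is no interrupt state to save.

```c
#include <pthread.h>

/* Hypothetical user-level stand-in for spinlock_t. */
typedef struct { pthread_mutex_t m; } sketch_spinlock_t;

#define SKETCH_SPINLOCK_INIT { PTHREAD_MUTEX_INITIALIZER }

static void sketch_spin_lock(sketch_spinlock_t *l)
{
    pthread_mutex_lock(&l->m);
}

static void sketch_spin_unlock(sketch_spinlock_t *l)
{
    pthread_mutex_unlock(&l->m);
}

/* irqsave variant: no interrupt state is touched, flags is a dummy. */
#define sketch_spin_lock_irqsave(l, flags) \
    do { (flags) = 0; sketch_spin_lock(l); } while (0)

/* irqrestore variant: ignores flags and just drops the lock. */
#define sketch_spin_unlock_irqrestore(l, flags) \
    do { (void)(flags); sketch_spin_unlock(l); } while (0)
```

With this mapping a driver's timer path and its interrupt handler contend only on the driver's own lock, and never on the lock that emulates cli/sti.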
Best regards, Zheng Da