edmundo@rano.demon.co.uk writes:
It dies after saying "irq still active". (To get that message I'm using the patch that Michael Hohmuth posted to the list on July 25.) I think I may have found a bug in the SANITY code in linux22/arch/l4-i386/kernel/irq.c, by the way. See below.
I remember seeing the same message, followed by a crash. Thanks for the patch.
Unfortunately I still don't understand why the irq is "still active".
The problem is the following: Normally the PIC forwards only one interrupt to the CPU and signals the next one after the current interrupt has been acknowledged. L4 uses a special PIC mode ("special fully nested mode" or something similar) that allows the PIC to signal more than one interrupt to the CPU. It uses priorities to decide whether a newly arrived interrupt should be forwarded to the CPU. So it is possible that more than one interrupt is delivered to the CPU before the first interrupt handler has been able to acknowledge its own interrupt. If we use "unspecific end of interrupt" (non-specific EOI) to acknowledge the interrupt, we acknowledge whichever in-service interrupt has the highest PIC priority, which is not necessarily our own.
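To make the effect concrete, here is a small model of a single 8259A in-service register. It is only a sketch: the function names are made up for illustration, it assumes the default priority order (IRQ 0 highest, IRQ 7 lowest), and it does not touch real hardware.

```c
#include <stdint.h>

/* Hypothetical model of one 8259A in-service register (ISR).
 * Bit n set means IRQ n has been delivered to the CPU but not yet
 * acknowledged. Assumes default priorities: IRQ 0 highest, IRQ 7 lowest. */

/* Non-specific EOI: the PIC clears the highest-priority in-service bit,
 * which in this model is the lowest-numbered set bit. */
static uint8_t model_nonspecific_eoi(uint8_t isr)
{
        return (uint8_t)(isr & (isr - 1));
}

/* Specific EOI: the PIC clears exactly the bit for the named level. */
static uint8_t model_specific_eoi(uint8_t isr, unsigned level)
{
        return (uint8_t)(isr & ~(1u << (level & 7u)));
}
```

With IRQ 3 and IRQ 5 both in service (ISR = 0x28), a non-specific EOI issued by the IRQ 5 handler clears bit 3 and leaves bit 5 set, so a check like the one in the debugging patch would still see IRQ 5 as active.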
Here's the corresponding bit of code in irq.c. (The lines which aren't indented as much as they should be came from Michael's debugging patch.)
        mask = 1 << (irq & 7);
        if (irq < 8) {
                outb(inb(0x21) | mask, 0x21); /* block the irq */
                outb(0x20, 0x20);             /* acknowledge the irq */
We use unspecific end of interrupt to acknowledge the irq.
outb(0x0B, 0x20);
if (inb(0x20) & mask)
        enter_kdebug("irq still active");
Now we read the "in-service register" of the PIC and find that our interrupt is still active. Because Fiasco is preemptive and our interrupt threads had been assigned the wrong priorities, a thread with a higher L4 priority that was handling an IRQ with a lower PIC priority accidentally acknowledged the wrong interrupt.
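One way to avoid acknowledging somebody else's interrupt is a specific EOI, which names the level being acknowledged. The helper below is only a sketch: its name is made up for illustration, and nothing here says this is what Linux or Fiasco will end up doing. The command byte itself is the 8259A's OCW2 with SL=1 and EOI=1 (0x60) plus the level in the low three bits.

```c
#include <stdint.h>

/* 8259A OCW2 "specific EOI": SL=1, EOI=1, interrupt level in bits 0-2. */
#define PIC_OCW2_SPECIFIC_EOI 0x60u

/* Hypothetical helper: build the specific-EOI command byte for an IRQ.
 * Only the level within one PIC (irq & 7) is encoded. */
static uint8_t pic_specific_eoi_cmd(unsigned irq)
{
        return (uint8_t)(PIC_OCW2_SPECIFIC_EOI | (irq & 7u));
}

/* In the irq < 8 branch quoted above, the acknowledge would then read
 *
 *         outb(pic_specific_eoi_cmd(irq), 0x20);
 *
 * instead of outb(0x20, 0x20). */
```

For IRQs on the second PIC, the cascade input on the first PIC has to be acknowledged as well; that part is not shown here.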
If you look at the bottom of kernel/fiasco/src/irq.h you'll find an unused function irq_ack() which does roughly the same thing, because acknowledging the IRQ is something that ought to be done by L4/Fiasco, but is at present done by Linux.
So why do we get "irq still active" during heavy use of the network card? Is there a PIC expert in the house?
I hope the explanation above was clear enough :)
Is it possible that the problem is caused by the interrupt not being acknowledged quickly enough? If so, maybe I should move the ack from Linux into Fiasco ...
There was a long discussion about that, and it looks like IRQ acknowledgement will move into the microkernel.
Jean