Interrupt problems causing crash of network driver and Kernel panic?

Andreas Speier speier at internet-sicherheit.de
Thu Jan 28 14:33:43 CET 2010


Hi L4-Hackers,

I have a problem running L4Linux 2.6.29 on Fiasco with a time-critical
application (Asterisk).

The main hardware components are an embedded board with a 1 GHz VIA C7
processor, 1 GB of RAM, and two network interface cards (VIA-Rhine, 3Com).
The drivers are compiled into the L4Linux kernel.

The application runs fine until a more interrupt-intensive process is
started. Then the VoIP communication (SIP) breaks down and I get the
following output in /var/log/syslog:

#################
Jan 28 08:24:58 TESTPC kernel: ------------[ cut here ]------------
Jan 28 08:24:58 TESTPC kernel: kernel BUG at
/home/[...]/l4linux-2.6.29/net/core/dev.c:2625!
Jan 28 08:24:58 TESTPC kernel: Trap: 6: 0000 [#1]
Jan 28 08:24:58 TESTPC kernel: last sysfs file:
Jan 28 08:24:58 TESTPC kernel: Modules linked in:
Jan 28 08:24:58 TESTPC kernel:
Jan 28 08:24:58 TESTPC kernel: Pid: 872, comm: find Not tainted
(2.6.29-l4 #3)
Jan 28 08:24:58 TESTPC kernel: EIP: ff04:[<0063c5bf>] EFLAGS: 00010246
CPU: 0
Jan 28 08:24:58 TESTPC kernel: EIP is at __napi_complete+0x2f/0x40
Jan 28 08:24:58 TESTPC kernel: EAX: 0092d45c EBX: 0092d45c ECX: 0092d45c
EDX: 008aa40c
Jan 28 08:24:58 TESTPC kernel: ESI: 00000001 EDI: 0092d45c EBP: b0afff14
ESP: b0afff0c
Jan 28 08:24:58 TESTPC kernel: DS: 4000 ES: 7032 FS: 0023 GS: 0043 SS: 0023
Jan 28 08:24:58 TESTPC kernel: Process find (pid: 872, ti=b0afe000
task=0a5391b0 task.ti=0a564000)
Jan 28 08:24:58 TESTPC kernel: Stack:
Jan 28 08:24:58 TESTPC kernel: b0afff14 eacff011 b0afff24 0063dd25
0092d444 00000001 b0afff40 00640473
Jan 28 08:24:58 TESTPC kernel: 00000040 000107b2 00000040 00000000
0092d45c b0afff5c 006404ca 000107b4
Jan 28 08:24:58 TESTPC kernel: 0000012c 00000001 0000000c 00000100
b0afff74 0041afe7 0000000a 00000000
Jan 28 08:24:58 TESTPC kernel: Call Trace:
Jan 28 08:24:58 TESTPC kernel: [<0063dd25>] ? napi_complete+0x25/0x40
Jan 28 08:24:58 TESTPC kernel: [<00640473>] ? process_backlog+0x93/0xa0
Jan 28 08:24:58 TESTPC kernel: [<006404ca>] ? net_rx_action+0x4a/0x100
Jan 28 08:24:58 TESTPC kernel: [<0041afe7>] ? __do_softirq+0x67/0x100
Jan 28 08:24:58 TESTPC kernel: [<0040bb05>] ? do_softirq+0x55/0x60
Jan 28 08:24:59 TESTPC kernel: [<0041af15>] ? irq_exit+0x35/0x40
Jan 28 08:24:59 TESTPC kernel: [<0040ba75>] ? do_IRQ+0x35/0x70
Jan 28 08:24:59 TESTPC kernel: [<00523ad5>] ? irq_dev_thread+0xf5/0x190
Jan 28 08:24:59 TESTPC kernel: Code: e5 f6 40 08 01 74 24 8b 40 20 85 c0
75 21 8b 11 8b 41 04 89 42 04 89 10 c7 41 04 00 02 20 00 c7 01 00 01 10
00 80 61 08 fe 5d c3 <0f> 0b eb fe 0f 0b eb fe 89 f6 8d bc 27 00 00 00
00 55 ba d0 00
Jan 28 08:24:59 TESTPC kernel: EIP: [<0063c5bf>]
__napi_complete+0x2f/0x40 SS:ESP 0023:b0afff0c
Jan 28 08:24:59 TESTPC kernel: ---[ end trace 5cee9576d4b98201 ]---
Jan 28 08:24:59 TESTPC kernel: Kernel panic - not syncing: Fatal
exception in interrupt
Jan 28 08:24:59 TESTPC kernel: panic: going to sleep forever, bye
Jan 28 08:25:07 TESTPC kernel: ------------[ cut here ]------------
Jan 28 08:25:07 TESTPC kernel: WARNING: at
/home/[...]/l4linux-2.6.29/net/sched/sch_generic.c:226
dev_watchdog+0x18e/0x1a0()
Jan 28 08:25:07 TESTPC kernel: NETDEV WATCHDOG: eth0 (3c59x): transmit
timed out
Jan 28 08:25:07 TESTPC kernel: Modules linked in:
Jan 28 08:25:07 TESTPC kernel: Pid: 0, comm: swapper Tainted: G     
D    2.6.29-l4 #3
Jan 28 08:25:07 TESTPC kernel: Call Trace:
Jan 28 08:25:07 TESTPC kernel: [<00416bc6>] warn_slowpath+0x76/0x90
Jan 28 08:25:07 TESTPC kernel: [<006a0030>] ?
xfrm4_mode_tunnel_output+0x50/0xf0
Jan 28 08:25:07 TESTPC kernel: [<004114fe>] ? __enqueue_entity+0x8e/0xb0
Jan 28 08:25:07 TESTPC kernel: [<0041156e>] ? enqueue_entity+0x4e/0x90
Jan 28 08:25:07 TESTPC kernel: [<00411500>] ? __enqueue_entity+0x90/0xb0
Jan 28 08:25:07 TESTPC kernel: [<0040242c>] ?
l4x_global_restore_flags+0xc/0x40
Jan 28 08:25:07 TESTPC kernel: [<0041167f>] ? try_to_wake_up+0x8f/0xc0
Jan 28 08:25:07 TESTPC kernel: [<004116bb>] ? default_wake_function+0xb/0x10
Jan 28 08:25:07 TESTPC kernel: [<004273d1>] ?
autoremove_wake_function+0x11/0x40
Jan 28 08:25:07 TESTPC kernel: [<0051f537>] ? strlcpy+0x17/0x50
Jan 28 08:25:07 TESTPC kernel: [<0064c1be>] dev_watchdog+0x18e/0x1a0
Jan 28 08:25:07 TESTPC kernel: [<00410078>] ? requeue_task_rt+0x18/0x90
Jan 28 08:25:07 TESTPC kernel: [<00402455>] ?
l4x_global_restore_flags+0x35/0x40
Jan 28 08:25:07 TESTPC kernel: [<00424828>] ? __queue_work+0x38/0x40
Jan 28 08:25:07 TESTPC kernel: [<0041e50d>] run_timer_softirq+0x11d/0x170
Jan 28 08:25:07 TESTPC kernel: [<0064c030>] ? dev_watchdog+0x0/0x1a0
Jan 28 08:25:07 TESTPC kernel: [<0041afe7>] __do_softirq+0x67/0x100
Jan 28 08:25:07 TESTPC kernel: [<0040bb05>] do_softirq+0x55/0x60
Jan 28 08:25:07 TESTPC kernel: [<0041af15>] irq_exit+0x35/0x40
Jan 28 08:25:07 TESTPC kernel: [<0040ba75>] do_IRQ+0x35/0x70
Jan 28 08:25:07 TESTPC kernel: [<00523ef3>] timer_irq_thread+0x123/0x1a0
Jan 28 08:25:07 TESTPC kernel: ---[ end trace 5cee9576d4b98202 ]---
Jan 28 08:25:07 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:07 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:07 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:07 TESTPC kernel: Flags; bus-master 1, dirty 612(4) current
612(4)
Jan 28 08:25:07 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:07 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 1: @0ae832a0  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 2: @0ae83340  length 80000057 status 8c010057
Jan 28 08:25:07 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:07 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:07 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:07 TESTPC kernel: eth0: Resetting the Tx ring pointer.
Jan 28 08:25:17 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:17 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:17 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:17 TESTPC kernel: Flags; bus-master 1, dirty 628(4) current
628(4)
Jan 28 08:25:17 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:17 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 1: @0ae832a0  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 2: @0ae83340  length 80000057 status 8c010057
Jan 28 08:25:17 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:17 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:17 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:17 TESTPC kernel: eth0: Resetting the Tx ring pointer.
Jan 28 08:25:27 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:27 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:27 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:27 TESTPC kernel: Flags; bus-master 1, dirty 644(4) current
644(4)
Jan 28 08:25:27 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:27 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 1: @0ae832a0  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 2: @0ae83340  length 80000057 status 8c010057
Jan 28 08:25:27 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:27 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:27 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:27 TESTPC kernel: eth0: Resetting the Tx ring pointer.
Jan 28 08:25:37 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:37 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:37 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:37 TESTPC kernel: Flags; bus-master 1, dirty 660(4) current
660(4)
Jan 28 08:25:37 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:37 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 1: @0ae832a0  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 2: @0ae83340  length 80000057 status 8c010057
Jan 28 08:25:37 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:37 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:37 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:37 TESTPC kernel: eth0: Resetting the Tx ring pointer.
Jan 28 08:25:47 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:47 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:47 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:47 TESTPC kernel: Flags; bus-master 1, dirty 676(4) current
676(4)
Jan 28 08:25:47 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:47 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 1: @0ae832a0  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 2: @0ae83340  length 80000057 status 8c010057
Jan 28 08:25:47 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:47 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:47 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:47 TESTPC kernel: eth0: Resetting the Tx ring pointer.
Jan 28 08:25:56 TESTPC kernel: eth0: transmit timed out, tx_status 00
status e681.
Jan 28 08:25:56 TESTPC kernel: diagnostics: net 0cd2 media 8880 dma
0000003a fifo 8000
Jan 28 08:25:56 TESTPC kernel: eth0: Interrupt posted but not delivered
-- IRQ blocked by another device?
Jan 28 08:25:56 TESTPC kernel: Flags; bus-master 1, dirty 692(4) current
692(4)
Jan 28 08:25:56 TESTPC kernel: Transmit list 00000000 vs. 0ae83480.
Jan 28 08:25:56 TESTPC kernel: 0: @0ae83200  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 1: @0ae832a0  length 8000002a status 0001002a
Jan 28 08:25:56 TESTPC kernel: 2: @0ae83340  length 8000006a status 8c01006a
Jan 28 08:25:56 TESTPC kernel: 3: @0ae833e0  length 80000057 status 8c010057
Jan 28 08:25:56 TESTPC kernel: 4: @0ae83480  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 5: @0ae83520  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 6: @0ae835c0  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 7: @0ae83660  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 8: @0ae83700  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 9: @0ae837a0  length 80000057 status 0c010057
Jan 28 08:25:56 TESTPC kernel: 10: @0ae83840  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: 11: @0ae838e0  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: 12: @0ae83980  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: 13: @0ae83a20  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: 14: @0ae83ac0  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: 15: @0ae83b60  length 80000057 status
0c010057
Jan 28 08:25:56 TESTPC kernel: eth0: Resetting the Tx ring pointer.
#################
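
The repeated "transmit timed out" blocks above are easier to compare once their timestamps are pulled out. The following is a minimal Python sketch, not part of the original setup, that extracts those lines and reports the gap between consecutive timeouts. The function name `timeout_gaps`, the regular expression, and the default year are assumptions made for illustration; the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches the driver's timeout message as it appears in the log above.
# The pattern is an assumption based on this one log, not a general syslog parser.
TIMEOUT_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): transmit timed out"
)


def timeout_gaps(lines, year=2010):
    """Return the gaps in seconds between consecutive 'transmit timed out' lines."""
    stamps = []
    for line in lines:
        m = TIMEOUT_RE.match(line)
        if m:
            # syslog timestamps carry no year, so one is supplied explicitly.
            stamps.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


# First physical line of each timeout report, copied from the log above.
SAMPLE = [
    "Jan 28 08:25:07 TESTPC kernel: eth0: transmit timed out, tx_status 00",
    "Jan 28 08:25:17 TESTPC kernel: eth0: transmit timed out, tx_status 00",
    "Jan 28 08:25:27 TESTPC kernel: eth0: transmit timed out, tx_status 00",
    "Jan 28 08:25:37 TESTPC kernel: eth0: transmit timed out, tx_status 00",
    "Jan 28 08:25:47 TESTPC kernel: eth0: transmit timed out, tx_status 00",
    "Jan 28 08:25:56 TESTPC kernel: eth0: transmit timed out, tx_status 00",
]

print(timeout_gaps(SAMPLE))
```

For a full log, the same function could be fed the lines of /var/log/syslog. A steady gap would suggest that the timeouts are driven by a periodic watchdog rather than by bursts of traffic.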

I suppose this has something to do with interrupts that cannot be
processed in time. Unfortunately, I could not find a solution.
I tried raising the L4Linux priority, but that did not solve the
problem.
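
One way to narrow this down is to see which check inside __napi_complete actually fired. The "Code:" line of the oops contains two `0f 0b` (ud2) traps, and the byte in angle brackets marks where EIP pointed. The Python sketch below is only an illustration: it scans the dumped bytes for short conditional jumps (`74` = je rel8, `75` = jne rel8) that land on the faulting byte. The helper names are hypothetical, and the byte-wise scan is naive (it does not decode instruction boundaries), so the result should be confirmed with a real disassembler such as objdump on the kernel image.

```python
# Bytes from the "Code:" line of the oops above, up to the second ud2.
CODE = (
    "e5 f6 40 08 01 74 24 8b 40 20 85 c0 75 21 8b 11 8b 41 04 89 42 04 89 10 "
    "c7 41 04 00 02 20 00 c7 01 00 01 10 00 80 61 08 fe 5d c3 <0f> 0b eb fe "
    "0f 0b eb fe"
)


def parse_code_line(code):
    """Return (raw bytes, offset of the byte marked with <..>)."""
    tokens = code.split()
    fault = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    data = bytes(int(t.strip("<>"), 16) for t in tokens)
    return data, fault


def short_jumps_to(data, target):
    """Offsets of je/jne rel8 opcodes whose jump target equals `target`.

    Naive scan: every 0x74/0x75 byte is treated as an opcode.
    """
    hits = []
    for i in range(len(data) - 1):
        if data[i] in (0x74, 0x75):
            # rel8 is a signed displacement relative to the next instruction.
            rel = data[i + 1] - 256 if data[i + 1] > 127 else data[i + 1]
            if i + 2 + rel == target:
                hits.append(i)
    return hits


data, fault = parse_code_line(CODE)
for off in short_jumps_to(data, fault):
    kind = "je" if data[off] == 0x74 else "jne"
    print(f"{kind} at offset {off} jumps to the faulting byte at offset {fault}")
    print("preceding bytes:", data[off - 4:off].hex(" "))
```

The bytes `f6 40 08 01` in front of that jump encode `test byte [eax+8], 1`, so the trap was reached because bit 0 of the byte at offset 8 of the structure in EAX was clear. Whether that bit is the NAPI "scheduled" state flag, and whether this is the check at dev.c:2625, is an assumption that has to be verified against the 2.6.29-l4 source tree.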

The configuration files are as follows:

menu.lst:
--------------------
title    L4LX-TEST
root    (hd0,0)
    kernel /test/tcb/bootstrap -modaddr 0x02000000
    module /test/tcb/fiasco -nokdb -serial_esc
    module /test/tcb/sigma0
    module /test/tcb/roottask -configfile task modname "loader" allow_cli boot_priority 0x95 task modname "bmodfs" attached 4 modules
    module /test/tcb/roottask.config
    module /test/tcb/events
    module /test/tcb/names
    module /test/tcb/log
    module /test/tcb/dm_phys --isa=0x800000 -v -e
    module /test/tcb/simple_ts -t 300
    module /test/tcb/rtc -v
    module /test/tcb/l4io
    module /test/tcb/l4dope
    vbeset 0x117

    module /test/tcb/bmodfs
      module /test/tcb/libld-l4.s.so
      module /test/tcb/libloader.s.so
      module /test/l4lx/l4lx-net.cfg
      module /test/l4lx/l4lx-net

module /test/tcb/loader --fprov=BMODFS l4lx-net.cfg




roottask.config
-------------------------
#!roottask

task sigma0 boot_priority 0xA0
task roottask boot_priority 0xA0
task modname "rtc" allow_cli
task modname "names" boot_priority 0xA0
task modname "dm_phys" boot_priority 0xA0
task modname "simple_ts" boot_priority 0xA0
task modname "l4io" allow_cli boot_priority 0xA0
task modname "loader" allow_cli boot_priority 0xA0
task modname "l4dope" allow_cli boot_priority 0xA0
task modname "bmodfs" boot_priority 0xA0
task modname "events" boot_priority 0xA0
task modname "l4lx-net" boot_priority 0xA0


l4lx-net.cfg
------------------
sleep 0

task "l4lx-net" "mem=160M root=1:0 root=/dev/hda2"
  allow_cli
  priority 0xE0
  all_sects_writable
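
Since priorities are set in three places (the roottask command line in menu.lst, roottask.config, and l4lx-net.cfg), it can help to list them side by side. The following Python sketch tabulates the `boot_priority` values from roottask.config as quoted above. The function name `boot_priorities` and the regular expression are assumptions derived only from the lines shown here, not from the roottask documentation, and the sketch says nothing about which setting takes effect when a task is configured in two places.

```python
import re

# roottask.config as quoted above.
ROOTTASK_CONFIG = '''\
#!roottask

task sigma0 boot_priority 0xA0
task roottask boot_priority 0xA0
task modname "rtc" allow_cli
task modname "names" boot_priority 0xA0
task modname "dm_phys" boot_priority 0xA0
task modname "simple_ts" boot_priority 0xA0
task modname "l4io" allow_cli boot_priority 0xA0
task modname "loader" allow_cli boot_priority 0xA0
task modname "l4dope" allow_cli boot_priority 0xA0
task modname "bmodfs" boot_priority 0xA0
task modname "events" boot_priority 0xA0
task modname "l4lx-net" boot_priority 0xA0
'''

# Assumed line shape: task [modname] NAME [flags] boot_priority 0xNN
PRIO_RE = re.compile(
    r'^task\s+(?:modname\s+)?"?([\w-]+)"?.*?\bboot_priority\s+(0x[0-9A-Fa-f]+)'
)


def boot_priorities(text):
    """Map task name to boot_priority; tasks without one are skipped."""
    prios = {}
    for line in text.splitlines():
        m = PRIO_RE.match(line.strip())
        if m:
            prios[m.group(1)] = int(m.group(2), 16)
    return prios


prios = boot_priorities(ROOTTASK_CONFIG)
for name, prio in sorted(prios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s} boot_priority {prio:#04x}")

# Values set elsewhere in the files quoted above, for comparison.
print(f"{'loader':10s} boot_priority {0x95:#04x} (roottask command line in menu.lst)")
print(f"{'l4lx-net':10s} priority      {0xE0:#04x} (l4lx-net.cfg)")
```

In this setup the listing shows that "rtc" has no boot_priority entry and that "loader" is given 0x95 in menu.lst but 0xA0 in roottask.config.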




I hope someone on this list has an idea why this error occurs, or a
hint about what I could test to help solve it.
Any good idea is welcome!

Regards,
Andreas
