Hi there,
I noticed that the Fiasco kernel hangs under some circumstances so that the whole system stops working. Unfortunately, I wasn't even able to enter the kernel debugger via a serial cable in this case. The error is reproducible on different platforms (e.g. Thinkpad T43, 1GB RAM, Pentium M 1.8 GHz, or Thinkpad T41), even though it only appears from time to time (probability ~50% or higher). Within VMware or Qemu this deadlock does *not* appear. Furthermore, the error does not occur if the kernel was started with the "-esc" option. I wasn't able to reproduce this bug on an HP nx7400 (Dual Core T2300, 1GB RAM). We use our own dynamic loader to load some L4 tasks, whereas other L4 tasks, such as the mGUI, are loaded by GRUB.
The bug can be reproduced on real hardware using the ISO image which I will upload to the SLOX server (deadlock.iso). You have to choose GRUB entry two or three. If the error occurs, the system hangs in text mode; otherwise a graphical mode is entered. Sometimes you have to reboot up to 20 times, but the hang usually occurs immediately.
Kind regards,
Daniel
Hi Daniel,
On Tuesday 06 March 2007, Daniel Vandersee wrote:
> I noticed that the Fiasco kernel hangs under some circumstances so the whole system stops working. Unfortunately I wasn't even able to enter the kernel debugger via a serial cable in this case.
> [...]
To debug such issues, the built-in watchdog is very helpful. Make sure that CONFIG_WATCHDOG is enabled in the kernel configuration and pass the -watchdog parameter on the kernel command line. A built-in Local APIC is required. If the kernel deadlocks for some reason while interrupts are disabled, the watchdog will force the CPU to enter the Fiasco kernel debugger.
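As a rough sketch, the two settings might look like the fragment below. Only CONFIG_WATCHDOG and -watchdog are the names given above; the file paths, the title, and the list of other boot modules are assumptions for illustration and have to be adapted to your own menu.lst.

```
# Fiasco kernel configuration (globalconfig.out), watchdog compiled in
CONFIG_WATCHDOG=y

# GRUB menu.lst entry (paths and module list are hypothetical)
title  Fiasco with watchdog
kernel /boot/bootstrap
module /boot/fiasco -watchdog
module /boot/sigma0
module /boot/roottask
```

Your own loader and the remaining tasks would follow as additional module lines, exactly as in your existing GRUB entries.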
Kind regards,
Frank
Hi Daniel,
> [...]
> The bug can be reproduced on real hardware using the iso-image which I will upload to the SLOX server (deadlock.iso). [...] Sometimes you have to reboot up to 20 times but it usually occurs immediately.
After some (~10) tries I can confirm your problem. However, I'm afraid we won't be able to help you without further information.
Please find attached our bug report form and fill in the missing information.
Note that we do not have access to some of the applications in your scenario, namely
- randservice
- storage
- hddenc
- pmngr
- mgui
- compmgr
- compmgrclientl4
- maybe loader, if this is not identical to the loader which is part of the L4Env.
To get help, you should either
* reproduce the problem with components coming only from the OTC snapshot used for your scenario, or
* provide us with the source code of the participating applications so that we are able to debug all parts of the system.
Regards,
Bjoern
To: l4-hackers@os.inf.tu-dresden.de
Subject: [BUG] <descriptive, recognizable, memorable name for the bug report>
1. General information
* What happens?
* When does it happen?
* Can it be reproduced?
2. Environment
* used hardware for Fiasco native
  * CPU, RAM, virtualization software
  * Devices, if involved
* Host configuration when using Fiasco-UX
  * Host Linux kernel
  * Linux distribution used
* used software
  * Fiasco version
  * L4Env version (which OTC Snapshot, which CVS tag, which patchlevel)
  * used tools (compilers, utilities, ...)
3. How to reproduce the bug
* GRUB menu.lst entry if booting native OR Fiasco-UX command line
* Simple step-by-step description of what to do if the issue needs interaction
4. Logs (as email attachment or web links, please)
* A complete kernel debugger log from your terminal program
* L4Linux syslog if involved
5. Configuration (as email attachment or web links, please)
* Fiasco configuration (fiasco-build-dir/globalconfig.out)
* L4Env configuration (l4-build-dir/Makeconf.bid.local)
* L4Linux configuration if involved (l4lx-build-dir/.config)
* Loader startup scripts if involved
6. Further information
* Please provide all the patches you applied to our source code.
* Please provide the source code of the applications involved (if this source code cannot already be obtained from another source; in that case you might want to point us to this source).
We would highly appreciate a frozen VMware image of your bug. If this is impossible, unstripped binaries (containing debug information, e.g., "gcc -g") of the programs you used might be necessary. Very good bug reports include a minimal test case to reproduce the issue or even contain a patch that fixes it.
If you expect a proper bugfix or a prompt response, items 1 to 5 are mandatory and must be complete!