On Fri, 2007-08-03 at 09:53 +0200, Stefan Scheler wrote:
Hello,
I implemented a statistical profiler for Fiasco using performance monitoring counters (the NMI handler is attached to handle_slow_trap()). It works fairly well for "mathematical" workloads, but as soon as there is significant interrupt activity, the system freezes. For example, keeping a button pressed while sampling will freeze the machine within a few seconds. The NMI watchdog does not trigger during that freeze. Higher sampling rates also cause these freezes.
So imho there seems to be a problem when irq_interrupt() or Irq::hit() is interrupted by an NMI. What do you think? Any hints on how to fix this or on how to debug this will be greatly appreciated.
Thanks in advance. I will happily provide further information if needed.
I don't think that there is a problem with an NMI in Fiasco's IRQ routines, except that there may be stack overruns.
I'd suggest using a task gate for your NMI and running the handler on a completely separate stack, because some parts of the code are extremely sensitive to NMIs; this is basically the sysenter path of the Fiasco kernel.
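
To illustrate the task-gate idea, here is a minimal sketch in C, assuming 32-bit protected mode. It is not taken from the Fiasco sources: the names tss32, make_task_gate and nmi_tss_init are hypothetical, as are the selector values in the usage note. It only shows how a task gate descriptor is encoded and how a dedicated TSS could be filled so that the NMI handler starts on its own stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32-bit x86 TSS layout (104 bytes). The 16-bit fields are widened to
 * 32 bits here; their upper halves are reserved and stay zero. */
struct tss32 {
    uint32_t prev_link;
    uint32_t esp0, ss0, esp1, ss1, esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint32_t trap_iomap;  /* bit 0: debug trap, bits 16..31: I/O map base */
};
_Static_assert(sizeof(struct tss32) == 104, "TSS must be 104 bytes");

/* Encode an IDT task gate: TSS selector in bits 16..31, access byte
 * 0x85 (present, DPL 0, type 0x5 = task gate) in bits 40..47. */
static uint64_t make_task_gate(uint16_t tss_selector)
{
    return ((uint64_t)tss_selector << 16) | ((uint64_t)0x85 << 40);
}

/* Fill a TSS so that a task switch into it starts at 'entry' on a
 * private stack. All arguments are supplied by the (hypothetical) caller. */
static void nmi_tss_init(struct tss32 *tss, uint32_t entry,
                         uint32_t stack_top, uint32_t page_dir,
                         uint16_t code_sel, uint16_t data_sel)
{
    assert(stack_top % 4 == 0);      /* keep the stack dword-aligned */
    memset(tss, 0, sizeof *tss);
    tss->eip    = entry;
    tss->esp    = stack_top;
    tss->cr3    = page_dir;
    tss->eflags = 0x2;               /* reserved bit 1 set, IF clear */
    tss->cs     = code_sel;
    tss->ss = tss->ds = tss->es = tss->fs = tss->gs = data_sel;
    /* I/O map base beyond the TSS limit means there is no I/O bitmap. */
    tss->trap_iomap = (uint32_t)sizeof *tss << 16;
}
```

The value returned by make_task_gate(sel) would be written to IDT entry 2 (the NMI vector), where sel is the GDT selector of an available 32-bit TSS descriptor that points at the structure filled in by nmi_tss_init. The handler then returns with iret, which switches back to the interrupted task.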