Dear kernel-hackers,
I think I came across a problem in the IPC path of Fiasco.OC some time ago, but it was hard to reproduce. Now I have an example that triggers the issue quite reliably.
The symptom, both in the past and in the concrete example, was that our roottask in Genode (called core) requested memory from sigma0, either implicitly, just by touching some memory area such as a ROM module loaded by the bootloader, or explicitly via the sigma0 protocol, e.g., to request I/O memory for the framebuffer. The request was received and processed by sigma0, but the answer never reached the faulter/client, in this case the core-pager thread. The typical picture in the kernel debugger then looks like the following:
sigma0 is marked as ready, but it is still stuck in the IPC syscall. By instrumenting the kernel, I could narrow down the point from which sigma0 never returns: it is when establishing the actual mapping in 'Thread::transfer_msg_items' in the file src/kern/thread-ipc.cc. The spot is the following:
```
cpu_lock.clear();
L4_error err = fpage_map(snd->space(), sfp, rcv->space(),
                         L4_fpage(buf->d), item->b, &rl);
cpu_lock.lock();
```
After the CPU lock is given away, sigma0 never successfully acquires it again. It looks to me like a race condition; nevertheless, simply keeping the lock held didn't solve the problem ;-). Maybe, given your inside knowledge, you are much faster at tracking the problem down to its root?
I have to add that the problem occurred on x86 as well as on ARM, on different QEMU versions as well as on real hardware. I could also reproduce it with slightly older versions of Fiasco.OC and at different development stages of Genode. Nevertheless, it was never this reliable to reproduce before. The current example reproduces it reliably, at least with my QEMU version (qemu-kvm-0.14.0). You can find it in the form of an ISO image here:
http://dl.dropbox.com/u/82567292/avplay.iso
You can try the ISO image like this:

```
qemu -no-kvm -m 256 -soundhw all -serial mon:stdio -cdrom avplay.iso
```
Or, if you want to build it on your own, here is my topic branch, including the avplay run script that triggers the problem:
https://github.com/skalk/genode/tree/fiasco.oc-ipc-issue
To compile and run it yourself, you have to perform the following steps (after installing the Genode tool chain from http://genode.org/download/tool-chain):
```
git clone git@github.com:skalk/genode.git
cd genode
git checkout -b issue origin/fiasco.oc-ipc-issue
git clone git@github.com:genodelabs/linux_drivers.git
make -C base-foc prepare
make -C libports prepare PKG="libav libc sdl zlib"
tool/create_builddir foc_x86_32 BUILD_DIR=build
sed -i "/#REPOSITORIES.*libports/s/#//" build/etc/build.conf
sed -i "/#REPOSITORIES.*linux_drivers/s/#//" build/etc/build.conf
make -C build run/avplay
```
Thank you in advance & best regards! Stefan
Hi,
On Thu May 31, 2012 at 18:30:07 +0200, Stefan kalkowski wrote:
The symptom, both in the past and in the concrete example, was that our roottask in Genode (called core) requested memory from sigma0, either implicitly, just by touching some memory area such as a ROM module loaded by the bootloader, or explicitly via the sigma0 protocol, e.g., to request I/O memory for the framebuffer. The request was received and processed by sigma0, but the answer never reached the faulter/client, in this case the core-pager thread. The typical picture in the kernel debugger then looks like the following:
I'm wondering how the higher-prio and ready 'pthread' thread relates to that. Is it always ready and doing something?
Adam
Hi Adam,
On 03.06.2012 16:36, Adam Lackorzynski wrote:
I'm wondering how the higher-prio and ready 'pthread' thread relates to that. Is it always ready and doing something?
Yes, that's strange. The sigma0 thread is always marked as ready in the kernel debugger, but obviously isn't making progress anymore. Nevertheless, although sigma0 is marked as ready and has the highest priority, other threads are still making progress. For instance, two threads that use the timer service are still doing ping-pong IPC with it (this can be observed by enabling IPC logging after sigma0 got stuck).
Regards Stefan
Hi Adam,
sorry, I got you wrong in the first place. I didn't realize that sigma0 actually runs at a lower priority than any other thread in the system. Obviously, the donated timeslice from the core pager gets fully consumed, and after that sigma0 never gets scheduled again, because at any point in time another thread is active (a condition I actually built in intentionally to trigger the "bug").
I raised the priority of sigma0 and then it worked.
So it seems to be no problem in the Fiasco.OC kernel at all. Nevertheless, it raises the question of whether it would be more reasonable to run sigma0 at a higher priority by default.
Regards Stefan
l4-hackers mailing list l4-hackers@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
Hi,
On Mon Jun 04, 2012 at 13:21:19 +0200, Stefan kalkowski wrote:
So it seems to be no problem in the Fiasco.OC kernel at all. Nevertheless, it raises the question of whether it would be more reasonable to run sigma0 at a higher priority by default.
Yes, that's a good question. This cannot really happen within L4Re due to the eager memory allocation of moe, so it was never an issue there. I do not really have an opinion on that right now but I'll try to get one.
Adam
Hello Adam,
On Mon, Jun 04, 2012 at 09:17:44PM +0200, Adam Lackorzynski wrote:
I do not really have an opinion on that right now but I'll try to get one.
Does it help you in forming your opinion if I mention dynamic discovery, and therefore mapping, of MMIO regions at runtime?
Regards
Hi Adam,
Yes, that's a good question. This cannot really happen within L4Re due to the eager memory allocation of moe, so it was never an issue there.
I thought so too because Genode's core pre-allocates all memory from sigma0 at boot time. There are, however, rare cases where the sigma0 protocol is invoked anyway. For example, for handing out MMIO regions to user-level device drivers (roottask doesn't know about those physical regions at boot time) and handing out boot modules.
Because the use of the sigma0 protocol at runtime is extremely rare, the chance of the timeslice running out while a sigma0 request is processed even rarer, and the chance of a fully saturated system at that moment rarer still, this problem triggered only sporadically: roughly once a month, and only in highly complex/dynamic scenarios. We first observed it more than six months ago. It took us until now to come up with a stable test case that pinpoints the problem.
I do not really have an opinion on that right now but I'll try to get one.
To me, the problem looks like a textbook example of priority inversion, which just happened to be covered up pretty nicely by Fiasco's time-slice donation optimization. Do you have an alternative way to fix it other than boosting the priority of sigma0?
You say that L4Re does not use the sigma0 protocol at runtime. Is there a compelling reason for keeping this little guy called sigma0 lurking around in the system then? If not, getting rid of sigma0 (as done by OKL4 or NOVA, btw) would certainly be the cleanest way to fix our problem. :-)
Norman
Hi,
On Wed Jun 06, 2012 at 09:06:39 +0200, Norman Feske wrote:
I thought so too because Genode's core pre-allocates all memory from sigma0 at boot time. There are, however, rare cases where the sigma0 protocol is invoked anyway. For example, for handing out MMIO regions to user-level device drivers (roottask doesn't know about those physical regions at boot time) and handing out boot modules.
To me, the problem looks like a textbook example of priority inversion, which just happened to be covered up pretty nicely by Fiasco's time-slice donation optimization. Do you have an alternative way to fix it other than boosting the priority of sigma0?
The issue with MMIO is very valid, thanks for the heads-up. Boosting the priority seems the obvious thing to do.
Adam