Fiasco.OC: null-pointer dereference?
matthias.lange at kernkonzept.com
Mon May 8 21:31:32 CEST 2017
On 05/08/2017 03:36 PM, Matthias Lange wrote:
> Hi Stefan,
> On 05/08/2017 09:09 AM, Stefan Kalkowski wrote:
>> Dear L4-Hackers,
>> recently, I started to upgrade the Fiasco.OC kernel version that is used
>> by the Genode OS framework to the latest released version (r72). I took
>> the opportunity to upgrade, because the upcoming Genode release uses a
>> fresh compiler toolchain that refused to build the very old Fiasco.OC
>> kernel version that was used until now (r56).
>> Everything went quite smoothly, and I'm glad to see how the kernel
>> develops further. Thanks to all developers at this point!
>> Unfortunately, I stumbled across an issue when it comes to thread
>> destruction. In our system all threads are constructed and destructed by
>> the roottask that is called 'core'. In some cases, not always but quite
>> often, the Ram_quota pointer of the thread object is zero during the
>> call of the Thread_object's delete operator, which leads to a page-fault
>> within the kernel-code. A simple check before dereferencing the
>> pointer solves the problem, but I wonder whether we will leak quota or
>> memory then, or in general cover some more serious problem.
> Thank you for reporting this issue. I will forward this to our kernel
> developers.
> Could you elaborate a little bit more on the circumstances leading to
> this issue? I wonder whether we can come up with a simple test case
> triggering the page fault.
No need to come up with a test case. It turns out that your problem
originates in an unfortunate combination of "old" sources and new
toolchain. C++ allows the compiler to elide writes to objects that are
later initialized by a constructor, which leads to the _quota member not
being initialized correctly under all circumstances.
That also answers your initial question: yes, your check covers a
more serious problem :).
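
To make the pattern concrete, here is a minimal sketch. It is neither the
attached patch nor actual Fiasco.OC code. It assumes the fragile variant is
an operator new that writes the quota pointer into a member of the object it
is about to return. Because the object's lifetime only begins in the
constructor, the compiler may treat that write as dead. One way to avoid
depending on it is to keep the pointer in a small header in front of the
object. All names below (Thread_like, Header, release,
quota_left_after_roundtrip) are hypothetical, and Ram_quota is only a
stand-in for the kernel's class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the kernel's quota object; not Fiasco.OC code.
struct Ram_quota {
    std::size_t used = 0;
    void *alloc(std::size_t n) {
        void *p = std::malloc(n);
        if (p) used += n;
        return p;
    }
    void free(void *p, std::size_t n) {
        std::free(p);
        used -= n;
    }
};

// Fragile variant (assumed, shown only as a comment):
//
//   void *operator new(std::size_t n, Ram_quota *q) {
//       void *p = q->alloc(n);
//       static_cast<Thread_like *>(p)->_quota = q;  // store before lifetime
//       return p;                                    // starts: may be elided
//   }

class Thread_like {
    // Bookkeeping kept outside the object, so it never depends on a store
    // into memory that the constructor is about to initialize.
    struct alignas(std::max_align_t) Header {
        Ram_quota *quota;
        std::size_t total;
    };

    static void release(void *p) noexcept {
        if (!p) return;
        void *raw = static_cast<char *>(p) - sizeof(Header);
        Header *h = static_cast<Header *>(raw);
        Ram_quota *q = h->quota;
        std::size_t total = h->total;
        q->free(raw, total);
    }

    int _id;

public:
    explicit Thread_like(int id) : _id(id) {}
    int id() const { return _id; }

    static void *operator new(std::size_t size, Ram_quota *q) {
        std::size_t total = sizeof(Header) + size;
        void *raw = q->alloc(total);
        if (!raw) throw std::bad_alloc();
        ::new (raw) Header{q, total};  // header lives in front of the object
        return static_cast<char *>(raw) + sizeof(Header);
    }

    // Used by a plain `delete t`.
    static void operator delete(void *p) noexcept { release(p); }
    // Matching placement form, called only if the constructor throws.
    static void operator delete(void *p, Ram_quota *) noexcept { release(p); }
};

// Allocates and destroys one object; returns the quota still accounted for.
std::size_t quota_left_after_roundtrip() {
    Ram_quota quota;
    Thread_like *t = new (&quota) Thread_like(7);
    assert(t->id() == 7);
    assert(quota.used >= sizeof(Thread_like));
    delete t;
    return quota.used;
}
```

Calling quota_left_after_roundtrip() should return 0, since operator delete
recovers the quota pointer from the header rather than from the destroyed
object. For code that cannot be restructured this way, GCC also documents a
-fno-lifetime-dse option that disables this class of dead-store elimination.
Whether either approach matches the attached patch is not something this
sketch claims.
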
Could you please try the attached patch? It should fix the problem.
>> Obviously, we have different usage patterns of syscalls, e.g.: the order
>> of destructing IPC-gates, threads, IRQs, and tasks. Moreover, we still
>> have a few patches so that the kernel meets our requirements.
>> But none of them explains the thread's Ram_quota pointer becoming zero.
>> The page-fault triggers across all x86 and arm platforms that we use.
>> Any hint would be very much appreciated, all the best!
>>  https://github.com/skalk/foc/commits/r72