Fiasco.OC: null-pointer dereference?
stefan.kalkowski at genode-labs.com
Tue May 9 08:56:48 CEST 2017
what a cool remote diagnosis. Indeed, the patch solved the _quota
Thanks & best regards
On 05/08/2017 09:31 PM, Matthias Lange wrote:
> Hi Stefan,
> On 05/08/2017 03:36 PM, Matthias Lange wrote:
>> Hi Stefan,
>> On 05/08/2017 09:09 AM, Stefan Kalkowski wrote:
>>> Dear L4-Hackers,
>>> recently, I started to upgrade the Fiasco.OC kernel version that is used
>>> by the Genode OS framework to the latest released version (r72). I took
>>> the opportunity to upgrade, because the upcoming Genode release uses a
>>> fresh compiler toolchain that refused to build the very old Fiasco.OC
>>> kernel version that was used until now (r56).
>>> Everything went quite smoothly, and I'm glad to see how the kernel
>>> develops further. Thanks to all developers at this point!
>>> Unfortunately, I stumbled across an issue when it comes to thread
>>> destruction. In our system all threads are constructed and destructed by
>>> the roottask that is called 'core'. In some cases, not always but quite
>>> often, the Ram_quota pointer of the thread object is zero during the
>>> call of the Thread_object's delete operator, which leads to a page-fault
>>> within the kernel-code. A simple check before dereferencing the
>>> pointer solves the problem, but I wonder whether we will leak quota or
>>> memory then, or in general cover some more serious problem.
>> Thank you for reporting this issue. I will forward this to our kernel
>> Could you elaborate a little bit more on the circumstances leading to
>> this issue? I wonder whether we can come up with a simple test case
>> triggering the page fault.
> No need to come up with a test case. It turns out that your problem
> originates in an unfortunate combination of "old" sources and new
> toolchain. C++ allows the compiler to elide writes to objects that are
> later initialized by a constructor, which leads to the _quota member not
> being initialized correctly under all circumstances.
> That also answers your initial question: yes, your check covers a
> more serious problem :).
> Could you please try the attached patch? It should fix the problem.
>>> Obviously, we have different usage patterns of syscalls, e.g.: the order
>>> of destructing IPC-gates, threads, IRQs, and tasks. Moreover, we still
>>> have a few patches so that the kernel meets our requirements.
>>> But none of them explains the thread's Ram_quota pointer becoming zero.
>>> The page-fault triggers across all x86 and arm platforms that we use.
>>> Any hint would be very much appreciated, all the best!
>>>  https://github.com/skalk/foc/commits/r72
https://github.com/skalk · http://genode.org/
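
The diagnosis quoted above (a store into an object made before its
constructor runs may be elided by the compiler) can be illustrated with a
small sketch. This is not the Fiasco.OC code and not the attached patch,
which is not reproduced in this thread. The names Quota, Object and
destroy are hypothetical stand-ins. The fragile pattern appears only as a
comment, because executing it is undefined behaviour; if the toolchain is
GCC, the relevant optimization is the one controlled by -flifetime-dse.
The sketch shows one possible way to avoid the problem: hand the quota
pointer to the constructor so that the member is written inside the
object's lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a kernel quota object (not the Fiasco.OC class).
struct Quota {
    std::size_t used = 0;
    void *alloc(std::size_t n) { used += n; return ::operator new(n); }
    void free(void *p, std::size_t n) { used -= n; ::operator delete(p); }
};

class Object {
public:
    // Fragile pattern, shown as a comment only (undefined behaviour):
    //
    //   void *operator new(std::size_t n, Quota *q) {
    //       void *p = q->alloc(n);
    //       static_cast<Object *>(p)->_quota = q; // store before lifetime begins
    //       return p;
    //   }
    //   Object() {}  // does not touch _quota, so the compiler may treat
    //                // the store above as dead and drop it
    //
    // Safer pattern: the constructor itself initializes the member.
    explicit Object(Quota *q) : _quota(q) {}

    // Placement allocation that charges the given quota.
    void *operator new(std::size_t n, Quota *q) { return q->alloc(n); }

    // Matching placement deallocation, used if the constructor throws.
    void operator delete(void *p, Quota *q) { q->free(p, sizeof(Object)); }

    Quota *quota() const { return _quota; }

private:
    Quota *_quota;
};

// Destroy the object and return its memory to the quota it was charged to.
inline void destroy(Object *obj) {
    assert(obj != nullptr);
    Quota *q = obj->quota();   // read the pointer while the object is alive
    obj->~Object();
    q->free(obj, sizeof(Object));
}
```

With this layout an object would be created as `new (&q) Object(&q)` and
released with `destroy(obj)`. A null check in the delete path, as in the
original workaround, would only hide the lost store and skip returning
the memory to the quota.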