Fiasco.OC: null-pointer dereference?

Stefan Kalkowski stefan.kalkowski at genode-labs.com
Tue May 9 08:56:48 CEST 2017


Hi Matthias,

what a cool remote diagnosis. Indeed, the patch solved the _quota
initialization problem.

Thanks & best regards
Stefan

On 05/08/2017 09:31 PM, Matthias Lange wrote:
> Hi Stefan,
> 
> On 05/08/2017 03:36 PM, Matthias Lange wrote:
>> Hi Stefan,
>>
>> On 05/08/2017 09:09 AM, Stefan Kalkowski wrote:
>>> Dear L4-Hackers,
>>>
>>> recently, I started to upgrade the Fiasco.OC kernel version that is used
>>> by the Genode OS framework to the latest released version (r72). I took
>>> the opportunity to upgrade, because the upcoming Genode release uses a
>>> fresh compiler toolchain that refused to build the very old Fiasco.OC
>>> kernel version that was used until now (r56).
>>> Everything went quite smoothly, and I'm glad to see how the kernel
>>> develops further. Thanks to all developers at this point!
>>>
>>> Unfortunately, I stumbled across an issue when it comes to thread
>>> destruction. In our system all threads are constructed and destructed by
>>> the roottask that is called 'core'. In some cases, not always but quite
>>> often, the Ram_quota pointer of the thread object is zero during the
>>> call of the Thread_object's delete operator, which leads to a page-fault
>>> within the kernel-code. A simple check[1] before dereferencing the
>>> pointer solves the problem, but I wonder whether we will leak quota or
>>> memory then, or in general cover some more serious problem.
>>
>> Thank you for reporting this issue. I will forward this to our kernel
>> maintainer.
>>
>> Could you elaborate a little bit more on the circumstances leading to
>> this issue? I wonder whether we can come up with a simple test case
>> triggering the page fault.
> 
> No need to come up with a test case. It turns out that your problem
> originates in an unfortunate combination of "old" sources and new
> toolchain. C++ allows the compiler to elide writes to objects that are
> later initialized by a constructor, which leads to the _quota member not
> being initialized correctly under all circumstances.
> 
> That also answers your initial question that, yes, your check covers a
> more serious problem :).
> 
> Could you please try the attached patch? It should fix the problem.
> 
> Best,
> Matthias.
> 
>>
>> Best,
>> Matthias.
>>
>>> Obviously, we have different usage patterns of syscalls, e.g., the order
>>> of destructing IPC-gates, threads, IRQs, and tasks. Moreover, we still
>>> carry a few patches[2] so that the kernel meets our requirements.
>>> But none of them explains the thread's Ram_quota pointer becoming zero.
>>> The page-fault triggers across all x86 and arm platforms that we use.
>>>
>>> Any hint would be very much appreciated, all the best!
>>> Stefan
>>>
>>> [1]
>>> https://github.com/skalk/foc/commit/2b01c9d16fd8e29e6af18fe750be2c8a312b4762
>>> [2] https://github.com/skalk/foc/commits/r72
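
For reference, the effect Matthias describes above (the compiler eliding
writes to an object's storage when a constructor initializes the object
afterwards) can be avoided by keeping the quota pointer outside the object.
The following is only a minimal sketch under assumptions: Ram_quota,
Thread_object, Header, and roundtrip_balances are simplified, hypothetical
stand-ins. It is neither the Fiasco.OC code nor the attached patch, whose
content is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the kernel's quota accounting object.
struct Ram_quota {
    std::size_t used = 0;
    void alloc(std::size_t n) { used += n; }
    void free(std::size_t n)  { used -= n; }
};

// Hypothetical kernel object. The fragile pattern would be:
//   operator new:  write the quota pointer into the not-yet-constructed object
//   constructor:   leave that member untouched and rely on the earlier write
// A compiler may drop the early write, because the object's lifetime has not
// started yet. Here the pointer lives in a header in front of the object, so
// no member is read or written outside the object's lifetime.
class Thread_object {
    struct alignas(std::max_align_t) Header {
        Ram_quota  *quota;
        std::size_t size;
    };

public:
    static void *operator new(std::size_t size, Ram_quota *q) {
        void *raw = std::malloc(sizeof(Header) + size);
        if (!raw) throw std::bad_alloc();
        Header *h = ::new (raw) Header{q, size};
        q->alloc(size);
        return h + 1;  // object storage starts right after the header
    }

    // Matching placement delete, used if the constructor throws.
    static void operator delete(void *p, Ram_quota *) noexcept { release(p); }
    // Usual delete, used by a plain `delete ptr`.
    static void operator delete(void *p) noexcept { release(p); }

    int id = 0;

private:
    static void release(void *p) noexcept {
        if (!p) return;
        Header *h = static_cast<Header *>(p) - 1;
        h->quota->free(h->size);  // the header is still valid here
        std::free(h);
    }
};

// Allocates one object against a quota and releases it again; returns true
// if the quota was charged on allocation and is balanced after deletion.
inline bool roundtrip_balances() {
    Ram_quota quota;
    Thread_object *t = new (&quota) Thread_object;
    bool charged = (quota.used == sizeof(Thread_object));
    delete t;
    return charged && quota.used == 0;
}
```

With GCC, this class of optimization is controlled by the -flifetime-dse
option, so building with -fno-lifetime-dse is another possible workaround.
Whether the patch in this thread changes the build flags or the sources is
not stated here.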

-- 
Stefan Kalkowski
Genode Labs

https://github.com/skalk · http://genode.org/

