Fiasco.OC: null-pointer dereference?
lesliezhai at llvm.org.cn
Fri Jun 9 04:42:47 CEST 2017
Thanks for your help!
On 2017-06-09 00:53, Jean Wolter wrote:
> On 08/06/17 04:18, Leslie Zhai wrote:
>> Hi Matthias,
>> Jean taught me how to debug L4Re using jdb in qemu:
>> http://os.inf.tu-dresden.de/pipermail/l4-hackers/2017/008038.html
>> It used a deliberately introduced bug (a null pointer dereference) to
>> crash Ned. L4Re then reported: unhandled write page fault at 0x0
>> pc=0x100398d, and addr2line ... -e ned -a 100398d pointed to the line
>> that caused it.
>> But how do I find the root cause when it is unclear which component
>> introduced the issue?
> I think there might be a misunderstanding. I only introduced the null
> pointer dereference to demonstrate how to do it using a known problem.
> You can apply exactly the same steps in a different situation.
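The "same steps" boil down to taking the pc from the fault message and handing it to addr2line together with the binary it belongs to. A minimal sketch of that step, assuming Python is at hand; the helper name addr2line_command is hypothetical, and the image path is whatever binary the pc belongs to:

```python
import shlex

def addr2line_command(image, pc):
    """Build an addr2line invocation that resolves pc inside image."""
    # -a prints the address, -p puts the result on one line,
    # -i also lists inlined callers, -e names the binary to inspect.
    return ["addr2line", "-p", "-i", "-e", image, "-a", format(pc, "x")]

# Values from the earlier demo: the ned binary and pc=0x100398d.
cmd = addr2line_command("ned", 0x100398d)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run() as-is; only the image path and the pc change from one crash to the next.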
I only meant that the debug patch was a demo to show me how to debug
with jdb in qemu ('on purpose' was the wrong wording, sorry for my poor
English) :) You are my mentor, teaching me patiently and carefully!
> But I would like to add something. You actually had all the
> information you needed:
> MOE: loading 'rom/ned'
> Ned says: Hi World!
>  0 pf: 0022 pfa=0000000000000018 ip=fffffffff0031ea9 (R-)
>  L4Re[rm]: unhandled read page fault at 0x18 pc=0x102e893
>  L4Re: unhandled exception: pc=0xfffffffff0031ea9 (pfa=18)
> L4Re: Global::l4re_aux->ldr_flags=0
> In the "L4Re[rm]" line you see the message from the local pager, which
> is unable to find a valid region for the pagefault address and
> complains. It shows 0x18 as the pagefault address and an instruction
> pointer of 0x102e893. The instruction pointer did not make any sense at
> that time. The local pager triggers an exception.
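Pulling the two addresses out of such messages can be scripted. The following is only a sketch: the regular expressions are written against the two lines quoted in this mail, not against any documented L4Re log format, and it assumes the bare value in "(pfa=18)" is hexadecimal.

```python
import re

# Patterns modelled on the two messages quoted in this thread (an assumption).
PAGER_RE = re.compile(
    r"unhandled (read|write) page fault at (0x[0-9a-f]+) pc=(0x[0-9a-f]+)")
EXC_RE = re.compile(
    r"unhandled exception: pc=(0x[0-9a-f]+) \(pfa=([0-9a-f]+)\)")

def parse_fault_lines(log):
    """Return the pager and exception records found in a log excerpt."""
    result = {}
    for line in log.splitlines():
        m = PAGER_RE.search(line)
        if m:
            result["pager"] = {"access": m.group(1),
                               "pfa": int(m.group(2), 16),
                               "pc": int(m.group(3), 16)}
        m = EXC_RE.search(line)
        if m:
            result["exception"] = {"pc": int(m.group(1), 16),
                                   "pfa": int(m.group(2), 16)}
    return result

LOG = """\
 L4Re[rm]: unhandled read page fault at 0x18 pc=0x102e893
 L4Re: unhandled exception: pc=0xfffffffff0031ea9 (pfa=18)
"""
faults = parse_fault_lines(LOG)
print(hex(faults["pager"]["pc"]), hex(faults["exception"]["pc"]))
```

The two pc values differ, which is the hint that the pager's pc is not the place where the fault was raised.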
> In the "unhandled exception" line you see the exception message. It
> shows the instruction pointer where the pagefault was actually raised:
> 0xfffffffff0031ea9. This is an address inside the kernel:
That is the key point! It is magic to me that 0xfffffffff0031ea9 is an
address inside the kernel; I need to dig deeper into Fiasco's address
handling.
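One way to take some of the magic out of it: on x86-64 with 48-bit virtual addresses, valid ("canonical") addresses fall into a lower half and a higher half, and kernels conventionally map themselves into the higher half. The sketch below only encodes that general convention; it assumes 4-level paging and says nothing about the exact Fiasco memory layout, which is not given in this thread.

```python
# Canonical address ranges for 48-bit virtual addresses (assumption: 4-level paging).
LOWER_HALF_END = 0x0000_7FFF_FFFF_FFFF
HIGHER_HALF_START = 0xFFFF_8000_0000_0000
ADDR_MAX = 0xFFFF_FFFF_FFFF_FFFF

def classify(addr):
    """Classify a 64-bit virtual address by canonical half."""
    if 0 <= addr <= LOWER_HALF_END:
        return "lower half (typically user space)"
    if HIGHER_HALF_START <= addr <= ADDR_MAX:
        return "higher half (typically kernel)"
    return "non-canonical"

for pc in (0x102e893, 0xfffffffff0031ea9):
    print(hex(pc), "->", classify(pc))
```

With the two instruction pointers from the log, 0x102e893 lands in the lower half and 0xfffffffff0031ea9 in the higher half, which matches the choice of fiasco.image as the binary to inspect.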
> ~/build/tmp/l4re$ addr2line -p -i -e
> ../leslie/fiasco/build/fiasco.image -a fffffffff0031ea9
> (inlined by)
> fffffffff0031ea8: fa cli
> fffffffff0031ea9: 48 8b 47 18 mov 0x18(%rdi),%rax
> fffffffff0031ead: 48 03 77 10 add 0x10(%rdi),%rsi
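The disassembly also explains the fault address. In AT&T syntax, mov 0x18(%rdi),%rax reads from %rdi + 0x18, so the page fault address reveals the value %rdi held. A small worked sketch, assuming the faulting access really is the mov at fffffffff0031ea9 (the ip reported in the log); the function name is hypothetical:

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def base_register_value(fault_addr, displacement):
    """Recover the base register from fault address = base + displacement."""
    # Mask to 64 bits so the subtraction wraps like register arithmetic.
    return (fault_addr - displacement) & MASK64

# pfa=0x18 from the log, displacement 0x18 from the mov instruction.
rdi = base_register_value(0x18, 0x18)
print(hex(rdi))
```

A base of zero means the code read a field at offset 0x18 through a null pointer, which fits the question in the subject line.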
> If it is not a kernel fault and you need to find out which component
> is responsible (or need more information about the current state) you
> can press 'i' when line  appears. You enter the kernel debugger and
> can look at the current thread using t<enter>. The thread has an id,
> which you can look up in the list of present threads (using 'lp'). Here
> it is thread 22:
> id cpu name pr sp wait to stack state
> 2e 0 ----- 2 1e - ( 920) rcv_wait
> 2b 0 ----- 10 1e - (1072) rcv_wait
> 22 0 ----- 2 1e (1776) ready
> 1f 0 #ned ff 1e - (1072)
> All threads shown here have the same address space and therefore the
> problem happened in the context of ned.
>> Similarly, how do I debug L4Linux?
>> Please give me some advice, thanks a lot!
> Maybe you can add "-serial stdio" to your qemu options and provide the
> complete backtrace for the problem? It looks like a framebuffer issue,
> but there should be more information in the lines above ...
I will try that instead of posting screenshots on Twitter. Sorry for those posts!
Leslie Zhai - a LLVM hacker https://reviews.llvm.org/p/xiangzhai/