Booting on Raspberry Pi
Adam Lackorzynski
adam at os.inf.tu-dresden.de
Tue Sep 17 19:53:45 CEST 2013
On Tue Sep 17, 2013 at 19:33:00 +0200, Robert Kaiser wrote:
> On 16.09.2013 19:21, Robert Kaiser wrote:
> > Hello Adam,
> >
> > thanks for your helpful response.
> >
> > On 09/15/13 16:11, Adam Lackorzynski wrote:
> >> On Thu Sep 12, 2013 at 17:13:07 +0200, Robert Kaiser wrote:
> >>> Adam Lackorzynski wrote:
> >>>>> Unfortunately, it *still* doesn't work. The last messages I see trying
> >>>>> to run the bootstrap_hello example are:
> >>>>>
> >>>>> MOE: cmdline: moe --init=rom/hello
> >>>>> MOE: Starting: rom/hello
> >>>>> MOE: loading 'rom/hello'
> >>>>> L4Re: unhandled exception: pc=0xffffff9c
> >>>>>
> >>>>> Any hints what could be wrong now?
> >>>> Would be interesting to know where this is coming from (lr). Anyway,
> >>>> this does not look so bad because quite a few things have happened
> >>>> again.
> >>> I agree. (My problem here is that I am only just learning how to use
> >>> JDB.) With pagefault monitoring enabled, the last lines of output look
> >>> like this:
> >>>
> >>> ......
> >>> pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007
> >>>
> >>> pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807
> >>>
> >>> pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
> >>>
> >>> L4Re: unhandled exception: pc=0xffffff9c
> >>>
> >>> Am I right to interpret this as "last pagefault occurred due to an opcode
> >>> fetch at virtual address b000f070"? AFAIK, none of the modules in the
> >> Yes.
> >>
> >>> image has its text segment in the b0000000 range, so this must be the
> >>> unhandled exception L4Re complains about (but if so, why does it say
> >>> pc=0xffffff9c?).
> >> The 'l4re' binary is linked to b0000000, so the pagefault looks ok. It's
> >> your local region manager.
> >>
> >>> spc=0xf12e56fc would be the faulting thread's number, right?
> >> That's the space aka task. 0x1d and 0x1a are the threads. Check with
> >> 'lp'.
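
The reading of the pagefault lines discussed above can be sketched in a few lines of Python. This is only an illustration and not part of JDB: the regular expression and the name parse_pf are made up for this example, the field meanings follow the explanation in this thread (the first number is the thread, spc is the space/task), and treating pfa == ip as an instruction fetch is a heuristic.

```python
import re

# Hypothetical pattern for the "pf:" lines quoted in this thread.
PF_RE = re.compile(
    r"pf:\s+(?P<thread>[0-9a-f]+)"
    r"\s+pfa=(?P<pfa>[0-9a-f]+)"
    r"\s+ip=(?P<ip>[0-9a-f]+)"
    r"\s+\((?P<access>[rwx-]+)\)"
    r"\s+spc=0x(?P<spc>[0-9a-f]+)"
    r"\s+err=(?P<err>[0-9a-f]+)"
)


def parse_pf(line):
    """Split one pagefault trace line into its fields, or return None."""
    m = PF_RE.search(line)
    if m is None:
        return None
    pfa = int(m.group("pfa"), 16)
    ip = int(m.group("ip"), 16)
    return {
        "thread": int(m.group("thread"), 16),  # e.g. 0x1a, 0x1d
        "pfa": pfa,                            # faulting address
        "ip": ip,                              # instruction pointer
        "access": m.group("access"),           # "r-" or "w-" in the log
        "space": int(m.group("spc"), 16),      # space aka task, not a thread
        "err": m.group("err"),                 # kept as raw text
        # Heuristic: a fault at the instruction pointer itself
        # is most likely an opcode fetch.
        "instruction_fetch": pfa == ip,
    }


if __name__ == "__main__":
    line = "pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007"
    print(parse_pf(line))
```

Fed with the three lines from the log, only the last one (thread 0x1a) is flagged as an instruction fetch.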
> >>
> >>> Giving an "s" command, I get:
> >>>
> >>> 1 f00567b8 [Task ] {KERNEL} R=2
> >>> 7 f12e5770 [Task ] {sigma0 } R=3
> >>> 9 f12e5720 [Task ] {moe } R=3
> >>> 19 f12e56d0 [Task ] {hello } R=3
> >>>
> >>>
> >>> The thread number, f12e56fc, does not appear. It is closest to
> >>> f12e56d0, but does that really mean the fault happened in the hello task?
> >> It happened in the hello task because that output can only come from
> >> hello in your setup, and the thread numbers indicate that too.
> >>
> >>> I would like to derive the program address where the fault occurs from
> >>> this, but frankly, not being familiar with JDB I'm at a loss here.
> >> In 'lp', press enter on the 1d thread, that will give you the tcb view
> >> in which you can see the registers for example.
> > Ahaaa!
> >
> > Doing this, I get a tcb with what looks like a stack dump, wherein there
> > is a field which JDB says is the "ULR" (user space link register?). Its
> > value is 0x100bb20. Disassembling the neighborhood of that location, I get:
> > ....
> > 0100bb0c bl
> > 0100bb10 mvn ip, #127 ; 0x7f
> > 0100bb14 str r8, [r0, #500]
> > 0100bb18 mov r0, r8
> > 0100bb1c blx ip
> > 0100bb20 str r5, [r4, #544]
> > ....
> >
> > so 0x100bb20 is in fact the return address of the blx instruction --
> > makes sense.
> >
> > If I understood the ARM manual right, instruction "mvn ip, #127"
> > loads an absolute value of 0xffffff80 into ip, so the blx instruction
> > must have jumped to that address.
> >
> > Disassembling that address gives me:
> >
> > ffffff80 push {r4, lr}
> > ffffff84 mrc 15, 0, r4, cr13, cr0, {2}
> > ffffff88 str r0, [r4, #4]
> > ffffff8c mov r2, #167; 0x10
> > ffffff90 str r2, [r4]
> > ffffff94 mov r3, #0 ; 0x0
> > ffffff98 movw r2, #63491 ; 0xf803
> > ffffff9c mov r0, #24,; 0x2
> > ffffffa0 movt r2, #65535540] ; 0xffff
> >
> > .. and 0xffffff9c is in fact the address where the fault happened!
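
The address arithmetic in the paragraphs above can be checked with a short Python sketch. The helper name mvn_imm is made up for this example; it only models the 32-bit bitwise NOT that "mvn" performs on its immediate operand and assumes 4-byte ARM instructions, as in the listing.

```python
MASK32 = 0xFFFFFFFF


def mvn_imm(imm):
    """Model 'mvn rd, #imm': bitwise NOT, truncated to 32 bits."""
    return ~imm & MASK32


# 'mvn ip, #127' from the listing at 0100bb10
target = mvn_imm(127)
assert target == 0xFFFFFF80

# Reported exception pc relative to the start of the kernel-provided code
fault_pc = 0xFFFFFF9C
offset = fault_pc - target   # 0x1c bytes
index = offset // 4          # 7, counting instructions from 0

if __name__ == "__main__":
    print(f"blx target  = {target:#010x}")
    print(f"fault offset = {offset:#x} (instruction index {index})")
```

So the reported pc lies 0x1c bytes, that is seven instructions, past the blx target.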
> >
> >
> >
> >>
> >>> JDB Single stepping does not seem to work on ARM platforms.
> >> Indeed that does not work.
> >
> > Do breakpoints work?
> >
> >>
> >>> For which architecture version have you been building?
> >> Looks good.
> >>
> >>
> >> The problem is in the kernel-provided code that uses instructions that
> >> are incompatible with rpi's CPU.
> > So that would be the instruction at 0xffffff9c, right?
> >
> > ffffff9c mov r0, #24,; 0x2
> >
> > This disassembly looks a little strange, maybe not only the CPU but also
> > the disassembler is choking on this opcode.
> >
> > Now, how do I find the place in the source code corresponding to this
> > instruction?
> >
> > (Disassembling fiasco.image doesn't help -- it ends long before that address)
> >
> >> I'll fix it.
> >>
> >
> > I can't wait to see your fix! Please let me know ASAP. If you need any
> > more input from my side, just tell me what to do.
>
> Yay! Got it working! :-)
>
> The offending instructions are movt and movw. The code in
> sys_call_page-arm.cpp constructs a syscall entry sequence which uses
> these instructions. (How can this ever have worked on the RPi?)
This code is new and has never worked on the rpi, so thanks for pointing
that out.
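
For readers wondering why movw/movt matter here: the pair loads a 32-bit constant in two 16-bit halves, and these instructions only exist from ARMv6T2/ARMv7 on, while the rpi's ARM1176 core is plain ARMv6. The Python sketch below is an illustration only and is not the patch discussed in this thread; the function names are made up. It shows the constant built by the pair in the listing above, and that this constant does not fit a single classic mov/mvn immediate (an 8-bit value rotated right by an even amount), so an ARMv6 replacement presumably needs more than one instruction or a load from memory.

```python
MASK32 = 0xFFFFFFFF


def movw_movt(low16, high16):
    """Value left in a register after 'movw #low16' and 'movt #high16'."""
    return ((high16 & 0xFFFF) << 16) | (low16 & 0xFFFF)


def rol32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((value << n) | (value >> (32 - n))) & MASK32


def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    return any(rol32(value, rot) < 0x100 for rot in range(0, 32, 2))


# movw r2, #63491 (0xf803) and movt r2, #0xffff from the listing
const = movw_movt(63491, 0xFFFF)
assert const == 0xFFFFF803

# Neither 'mov r2, #const' nor 'mvn r2, #~const' can encode it directly.
assert not is_arm_immediate(const)
assert not is_arm_immediate(~const & MASK32)

# By contrast, 'mvn ip, #127' works because 127 is a valid immediate.
assert is_arm_immediate(127)

if __name__ == "__main__":
    print(f"movw/movt constant: {const:#010x}")
```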
> Anyway, here is my suggestion for a patch:
I've done something similar in the meantime but wasn't so quick...
> With this patch applied, my Raspberry Pi now happily prints "Hello
> World!" (Strange how something as unspectacular as that can make someone
> really happy ;-))
I know that feeling :)
Adam
--
Adam adam at os.inf.tu-dresden.de
Lackorzynski http://os.inf.tu-dresden.de/~adam/