Re: Booting on Raspberry Pi

17 Sep 2013


      On Tue Sep 17, 2013 at 19:33:00 +0200, Robert Kaiser wrote:
...
On 16.09.2013 19:21, Robert Kaiser wrote:
...
Hello Adam,
thanks for your helpful response.
On 09/15/13 16:11, Adam Lackorzynski wrote:
...
On Thu Sep 12, 2013 at 17:13:07 +0200, Robert Kaiser wrote:
...
Adam Lackorzynski wrote:
...
...
Unfortunately, it *still* doesn't work. The last messages I see trying
to run the bootstrap_hello example are:
MOE: cmdline: moe --init=rom/hello
MOE: Starting: rom/hello
MOE: loading 'rom/hello'
L4Re: unhandled exception: pc=0xffffff9c
Any hints what could be wrong now?
Would be interesting to know where this is coming from (lr). Anyway,
this does not look so bad because quite a few things have happened
again.
I agree. (My problem here is that I am only just learning how to use
JDB.) With pagefault monitoring enabled, the last lines of output look
like this:
......
pf:  001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007
pf:  001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807
pf:  001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
L4Re: unhandled exception: pc=0xffffff9c
Am I right to interpret this as "last pagefault occurred due to an opcode
fetch at virtual address b000f070"?
Yes.
...
AFAIK, none of the modules in the image has its text segment in the
b0000000 range, so this must be the unhandled exception L4Re complains
about (but if so, why does it say pc=0xffffff9c?).
The 'l4re' binary is linked to b0000000, so the pagefault looks ok. It's
your local region manager.
...
spc=0xf12e56fc would be the faulting thread's number, right?
That's the space aka task. 0x1d and 0x1a are the threads. Check with
'lp'.
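For readers following along, the fields of a pagefault line can be pulled apart with a small script. This is only a sketch: the field layout is inferred from the three sample lines quoted in this thread, and the names `PF_RE` and `parse_pf` are made up for illustration, not part of JDB. It treats `pfa == ip` as the sign of an instruction fetch fault, which matches the reading confirmed above.

```python
import re

# Layout assumed from the sample lines in this thread:
#   pf:  <thread> pfa=<addr> ip=<addr> (<access>) spc=0x<space> err=<code>
PF_RE = re.compile(
    r"pf:\s+(?P<thread>[0-9a-f]+)\s+pfa=(?P<pfa>[0-9a-f]+)"
    r"\s+ip=(?P<ip>[0-9a-f]+)\s+\((?P<access>[rw-]+)\)"
    r"\s+spc=0x(?P<spc>[0-9a-f]+)\s+err=(?P<err>[0-9a-f]+)"
)


def parse_pf(line):
    """Split one pagefault line into its fields; return None if it does not match."""
    m = PF_RE.search(line)
    if m is None:
        return None
    pfa = int(m.group("pfa"), 16)
    ip = int(m.group("ip"), 16)
    return {
        "thread": int(m.group("thread"), 16),  # 0x1d, 0x1a: thread numbers
        "pfa": pfa,                            # faulting address
        "ip": ip,                              # instruction pointer at the fault
        "access": m.group("access"),
        "space": int(m.group("spc"), 16),      # the space (task), not a thread
        "err": int(m.group("err"), 16),
        "is_fetch": pfa == ip,                 # fault on the instruction itself
    }


if __name__ == "__main__":
    sample = "pf:  001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007"
    info = parse_pf(sample)
    print(hex(info["thread"]), hex(info["pfa"]), info["is_fetch"])
```

The `space` value is the same for all three quoted lines, which fits the explanation that it identifies the task while `001d` and `001a` identify the threads.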
...
Giving an "s" command, I get:
       1 f00567b8 [Task   ] {KERNEL} R=2
       7 f12e5770 [Task   ] {sigma0          } R=3
       9 f12e5720 [Task   ] {moe             } R=3
      19 f12e56d0 [Task   ] {hello           } R=3
The thread number, f12e56fc, does not appear. It is closest to
f12e56d0, but does that really mean the fault happened in the hello task?
It happened in the hello task because that output can only come from
hello in your setup, and the thread numbers indicate that too.
...
I would like to derive the program address where the fault occurs from
this, but frankly, not being familiar with JDB I'm at a loss here.
In 'lp', press enter on the 1d thread, that will give you the tcb view
in which you can see the registers for example.
Ahaaa!
Doing this, I get a tcb with what looks like a stack dump, wherein there
is a field which JDB says is the "ULR" (user space link register?). Its
value is 0x100bb20. Disassembling the neighborhood of that location, I get:
....
0100bb0c     bl   
0100bb10     mvn    ip, #127    ; 0x7f
0100bb14     str    r8, [r0, #500]
0100bb18     mov    r0, r8
0100bb1c     blx    ip
0100bb20     str    r5, [r4, #544]
....
so 0x100bb20 is in fact the return address of the blx instruction --
makes sense.
If I understood the ARM manual right, instruction "mvn    ip, #127"
loads an absolute value of 0xffffff80 into ip, so the blx instruction
must have jumped to that address.
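The address arithmetic in this step can be double-checked with a few lines of Python. This is a sketch for illustration only; `mvn_imm` is a made-up helper that models MVN with an immediate as a 32-bit bitwise NOT, and the addresses are the ones quoted in this thread.

```python
MASK32 = 0xFFFFFFFF


def mvn_imm(imm):
    """Model 'mvn rd, #imm': the 32-bit bitwise complement of the immediate."""
    return ~imm & MASK32


ulr = 0x0100BB20            # ULR value shown in the tcb view
blx_addr = ulr - 4          # ARM instructions are 4 bytes, so the call sits just before
target = mvn_imm(127)       # value loaded into ip by 'mvn ip, #127'
fault_pc = 0xFFFFFF9C       # pc reported by L4Re

print(hex(blx_addr))                # address of the 'blx ip' instruction
print(hex(target))                  # branch target
print((fault_pc - target) // 4)     # instruction index of the reported pc from the target
```

So the reported pc lies a few instructions into the code that starts at the branch target, which is consistent with the fault happening inside that sequence rather than in hello's own text.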
Disassembling that address gives me:
ffffff80     push   {r4, lr}
ffffff84     mrc    15, 0, r4, cr13, cr0, {2}
ffffff88     str    r0, [r4, #4]
ffffff8c     mov    r2, #16    ; 0x10
ffffff90     str    r2, [r4]
ffffff94     mov    r3, #0    ; 0x0
ffffff98     movw   r2, #63491    ; 0xf803
ffffff9c     mov    r0, #24,; 0x2
ffffffa0     movt   r2, #65535    ; 0xffff
.. and 0xffffff9c is in fact the address where the fault happened!
...
...
JDB Single stepping does not seem to work on ARM platforms.
Indeed that does not work.
Do breakpoints work?
...
...
For which architecture version have you been building?
Looks good.
The problem is in the kernel-provided code that uses instructions that
are incompatible with rpi's CPU.
So that would be the instruction at 0xffffff9c, right?
ffffff9c     mov    r0, #24,; 0x2
This disassembly looks a little strange, maybe not only the CPU but also
the disassembler is choking on this opcode.
Now, how do I find the place in the source code corresponding to this
instruction?
(Disassembling fiasco.image doesn't help -- it ends long before that address)
...
I'll fix it.
I can't wait to see your fix! Please let me know ASAP. If you need any
more input from my side, just tell me what to do.
Yay! Got it working! :-)
The offending instructions are movt and movw. The code in
sys_call_page-arm.cpp constructs a syscall entry sequence which uses
these instructions. (How can this ever have worked on the RPi?)
This code is new and has never worked on the rpi, so thanks for pointing
that out.
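Some background on why movw/movt matter here: they were introduced with ARMv6T2, while the Raspberry Pi's ARM1176 core implements an ARMv6 variant without them. The sketch below is not the patch from this thread (that is not reproduced here). It only models how the movw/movt pair from the disassembly builds its 32-bit constant, checks which constants fit a classic ARM data-processing immediate (an 8-bit value rotated right by an even amount), and shows one hypothetical two-instruction replacement using mvn and bic. All function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF


def movw_movt(lo16, hi16):
    """Model 'movw rd, #lo16' followed by 'movt rd, #hi16'."""
    return ((hi16 & 0xFFFF) << 16) | (lo16 & 0xFFFF)


def rol32(value, n):
    """Rotate a 32-bit value left by n bits (0 <= n < 32)."""
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    return any(rol32(value, rot) <= 0xFF for rot in range(0, 32, 2))


# Constant built by the movw/movt pair quoted in the disassembly.
wanted = movw_movt(63491, 65535)

# Neither the constant nor its complement fits a single mov/mvn immediate,
# so an ARMv6-only sequence needs two instructions or a literal load.
single_mov = is_arm_immediate(wanted)
single_mvn = is_arm_immediate(~wanted & MASK32)

# Hypothetical replacement: 'mvn r2, #0x700' then 'bic r2, r2, #0xfc'.
step1 = ~0x700 & MASK32
step2 = step1 & ~0xFC & MASK32

print(hex(wanted), single_mov, single_mvn, hex(step2))
```

Loading the constant from a literal pool with ldr would be another option; which one suits sys_call_page-arm.cpp depends on how that code emits its instruction sequence.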
...
Anyway, here is my suggestion for a patch:
I've done something similar in the meantime but wasn't so quick...
...
With this patch applied, my Raspberry Pi now happily prints "Hello
World!" (Strange how something as unspectacular as that can make someone
really happy ;-))
I know that feeling :)


Adam
-- 
Adam                 adam@os.inf.tu-dresden.de
  Lackorzynski         http://os.inf.tu-dresden.de/~adam/