On Tue Sep 17, 2013 at 19:33:00 +0200, Robert Kaiser wrote:
On 16.09.2013 19:21, Robert Kaiser wrote:
Hello Adam,
thanks for your helpful response.
On 09/15/13 16:11, Adam Lackorzynski wrote:
On Thu Sep 12, 2013 at 17:13:07 +0200, Robert Kaiser wrote:
Adam Lackorzynski wrote:
Unfortunately, it *still* doesn't work. The last messages I see trying to run the bootstrap_hello example are:
MOE: cmdline: moe --init=rom/hello
MOE: Starting: rom/hello
MOE: loading 'rom/hello'
L4Re: unhandled exception: pc=0xffffff9c
Any hints what could be wrong now?
Would be interesting to know where this is coming from (lr). Anyway, this does not look so bad because quite a few things have happened again.
I agree. (My problem here is that I am only just learning how to use JDB.) With pagefault monitoring enabled, the last lines of output look like this:
......
pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007
pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807
pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
L4Re: unhandled exception: pc=0xffffff9c
Am I right to interpret this as "last pagefault occurred due to an opcode fetch at virtual address b000f070"? AFAIK, none of the modules in the
Yes.
image has its text segment in the b0000000 range, so this must be the unhandled exception L4Re complains about (but if so, why does it say pc=0xffffff9c?).
The 'l4re' binary is linked to b0000000, so the pagefault looks ok. It's your local region manager.
spc=0xf12e56fc would be the faulting thread's number, right?
That's the space aka task. 0x1d and 0x1a are the threads. Check with 'lp'.
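To make the reading of the pagefault lines above concrete, here is a minimal Python sketch that splits such a line into its fields. The field layout is only inferred from the three lines quoted in this thread, and the names `PF_LINE` and `parse_pf` are made up for illustration; nothing here is part of JDB. It treats a fault as an opcode fetch when pfa equals ip, which matches the interpretation confirmed above.

```python
import re

# Layout inferred from the log lines quoted in this thread (an assumption,
# not a documented JDB format): thread id, fault address, instruction
# pointer, access type, space (task) and error code.
PF_LINE = re.compile(
    r"pf:\s+(?P<thread>[0-9a-f]+)\s+pfa=(?P<pfa>[0-9a-f]+)\s+"
    r"ip=(?P<ip>[0-9a-f]+)\s+\((?P<access>[rw-]+)\)\s+"
    r"spc=0x(?P<spc>[0-9a-f]+)\s+err=(?P<err>[0-9a-f]+)"
)

def parse_pf(line):
    """Return the fields of one 'pf:' line as a dict, or None if no match."""
    m = PF_LINE.search(line)
    if m is None:
        return None
    rec = {k: int(m.group(k), 16) for k in ("thread", "pfa", "ip", "spc", "err")}
    rec["access"] = m.group("access")
    # A fault whose address equals the instruction pointer is an opcode fetch.
    rec["is_fetch"] = rec["pfa"] == rec["ip"]
    return rec

LOG = [
    "pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007",
    "pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807",
    "pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007",
]

records = [parse_pf(line) for line in LOG]
for rec in records:
    kind = "opcode fetch" if rec["is_fetch"] else "data access"
    print(f"thread {rec['thread']:#x} in space {rec['spc']:#x}: "
          f"{kind} at {rec['pfa']:#010x}")
```

All three faults share the same spc value, and only the last one (thread 0x1a) is an opcode fetch.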
Giving an "s" command, I get:
1 f00567b8 [Task ] {KERNEL} R=2 7 f12e5770 [Task ] {sigma0 } R=3 9 f12e5720 [Task ] {moe } R=3 19 f12e56d0 [Task ] {hello } R=3
The thread number, f12e56fc, does not appear. It is closest to f12e56d0, but does that really mean the fault happened in the hello task?
It happened in the hello task because that output can only come from hello in your setup, and the thread numbers indicate that too.
I would like to derive the program address where the fault occurs from this, but frankly, not being familiar with JDB I'm at a loss here.
In 'lp', press enter on the 1d thread, that will give you the tcb view in which you can see the registers for example.
Ahaaa!
Doing this, I get a TCB with what looks like a stack dump, in which there is a field that JDB labels "ULR" (user space link register?). Its value is 0x100bb20. Disassembling the neighborhood of that location, I get:
....
0100bb0c bl
0100bb10 mvn ip, #127 ; 0x7f
0100bb14 str r8, [r0, #500]
0100bb18 mov r0, r8
0100bb1c blx ip
0100bb20 str r5, [r4, #544]
....
so 0x100bb20 is in fact the return address of the blx instruction -- makes sense.
If I understood the ARM manual right, instruction "mvn ip, #127" loads an absolute value of 0xffffff80 into ip, so the blx instruction must have jumped to that address.
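The arithmetic behind this reasoning can be checked with a few lines of Python. This is only a sketch of the 32-bit calculation, assuming ARM (not Thumb) state with 4-byte instructions; the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def mvn_imm(imm):
    """MVN writes the bitwise complement of its operand (32-bit)."""
    return ~imm & MASK32

def arm_return_address(branch_addr):
    """In ARM state, BLX sets lr to the instruction following the branch."""
    return (branch_addr + 4) & MASK32

target = mvn_imm(127)                   # value left in ip by "mvn ip, #127"
lr = arm_return_address(0x0100BB1C)     # "blx ip" sits at 0100bb1c
fault_offset = 0xFFFFFF9C - target      # distance of the reported pc from the entry

print(hex(target), hex(lr), fault_offset // 4)
# prints: 0xffffff80 0x100bb20 7
```

So the reported pc lies seven instructions past the entry point at 0xffffff80, and the ULR value matches the instruction after the blx.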
Disassembling that address gives me:
ffffff80 push {r4, lr}
ffffff84 mrc 15, 0, r4, cr13, cr0, {2}
ffffff88 str r0, [r4, #4]
ffffff8c mov r2, #167; 0x10
ffffff90 str r2, [r4]
ffffff94 mov r3, #0 ; 0x0
ffffff98 movw r2, #63491 ; 0xf803
ffffff9c mov r0, #24,; 0x2
ffffffa0 movt r2, #65535540] ; 0xffff
... and 0xffffff9c is in fact the address where the fault happened!
JDB single-stepping does not seem to work on ARM platforms.
Indeed that does not work.
Do breakpoints work?
For which architecture version have you been building?
Looks good.
The problem is in the kernel-provided code that uses instructions that are incompatible with rpi's CPU.
So that would be the instruction at 0xffffff9c, right?
ffffff9c mov r0, #24,; 0x2
This disassembly looks a little strange, maybe not only the CPU but also the disassembler is choking on this opcode.
Now, how do I find the place in the source code corresponding to this instruction?
(Disassembling fiasco.image doesn't help -- it ends long before that address)
I'll fix it.
I can't wait to see your fix! Please let me know ASAP. If you need any more input from my side, just tell me what to do.
Yay! Got it working! :-)
The offending instructions are movt and movw. The code in sys_call_page-arm.cpp constructs a syscall entry sequence which uses these instructions. (How can this ever have worked on the RPi?)
This code is new and has never worked on the rpi, so thanks for pointing that out.
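For readers following along: movw and movt were added to the ARM instruction set with ARMv6T2, while the Raspberry Pi's ARM1176 core implements an earlier ARMv6 variant, so the pair is undefined there. The Python sketch below only emulates what the pair computes and checks whether a constant fits a single ARM-state mov/mvn immediate. The operand values are read from the (partly garbled) disassembly quoted earlier, and the helper names are made up for illustration; this is not the patch discussed in this thread.

```python
MASK32 = 0xFFFFFFFF

def movw(imm16):
    """MOVW: write imm16 to the low halfword and clear the high halfword."""
    return imm16 & 0xFFFF

def movt(reg, imm16):
    """MOVT: write imm16 to the high halfword, keeping the low halfword."""
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)

def rotl32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    value &= MASK32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount,
    i.e. usable as the immediate operand of a single ARM-state MOV."""
    return any(rotl32(value, rot) < 0x100 for rot in range(0, 32, 2))

# Operands as they appear in the dump: movw r2, #0xf803 / movt r2, #0xffff
r2 = movt(movw(0xF803), 0xFFFF)
print(hex(r2))                              # 0xfffff803
print(is_arm_immediate(r2))                 # False: no single MOV
print(is_arm_immediate(~r2 & MASK32))       # False: no single MVN either
print(is_arm_immediate(~0xFFFFFF80 & MASK32))  # True: mvn ip, #127 works
```

Under these assumptions the constant needs more than one plain ARMv6 data-processing instruction (or a load from memory), which is one plausible reason the movw/movt pair was used in the first place.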
Anyway, here is my suggestion for a patch:
I've done something similar in the meantime but wasn't so quick...
With this patch applied, my Raspberry Pi now happily prints "Hello World!" (Strange how something as unspectacular as that can make someone really happy ;-))
I know that feeling :)
Adam