Hello,
I'm currently trying to get Fiasco.OC + L4Re up and running on a Raspberry Pi, but no luck :-(
I have gotten as far as building a bootstrap_hello.elf file. According to previous posts here, I should just copy this file to the first partition of a linux-bootable SD card, add an entry like "kernel=bootstrap_hello.elf" to config.txt, and I should see some messages at the serial port as the Pi boots L4. Alas, my Pi stays silent.
If I then simply remove the added line from config.txt, leaving everything else the same, Linux boots from the card alright and I can see its messages on the serial port. Thus, I'm confident that my hardware set-up should be OK.
I am not absolutely sure whether I got the building of the boot file right. What surprises me is that it is an ELF file: I have previously followed the "Baking Pi" tutorial at http://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/. There I used to boot *raw* binaries which expected to be loaded at 0x8000. In contrast, this file here displays as an ELF file with .text starting at 0x01000000. Is that how it should be?
The building process also creates a file named "bootstrap_hello", but that is simply a copy of bootstrap_hello.elf. What's the point of this?
Thanks a lot for any help
Robert Kaiser
Hi,
On Fri Sep 06, 2013 at 19:42:12 +0200, Robert Kaiser wrote:
I am not absolutely sure whether I got the building of the boot file right. What surprises me is that it is an ELF file: I have previously followed the "Baking Pi" tutorial at http://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/. There I used to boot *raw* binaries which expected to be loaded at 0x8000. In contrast, this file here displays as an ELF file with .text starting at 0x01000000. Is that how it should be?
Using a rawimage also seems more reasonable to me. Where the code starts in memory doesn't really matter as long as it stays in RAM. 0x01000000 is the typical start address of the L4 bootstrap. I have always used U-Boot instead of the built-in loader because it also allows booting via TFTP. IIRC the built-in loader worked as well, but I can't check that myself right now.
The building process also creates a file named "bootstrap_hello", but that is simply a copy of bootstrap_hello.elf. What's the point of this?
Adding .elf is just a hint that this is an ELF file. There's also 'make rawimage', which generates *.raw, and 'make uimage', which generates *.uimage (see 'make help' for the list of targets).
Adam
Hi,
thanks for your response.
On 09/09/13 00:16, Adam Lackorzynski wrote:
Using a rawimage also seems more reasonable to me. Where the code starts in memory doesn't really matter as long as it stays in RAM. 0x01000000 is the typical start address of L4 bootstrap.
Just tried a rawimage -> still silence on the serial port :-(. It may be due to a configuration mistake. I guess I'll have to wade through the code to find out where it goes wrong -- stay tuned.
I personally used u-boot all the time instead of the built-in loader as this also allows to boot via tftp, however, iirc, the built-in one did also work but I can't check myself right now.
That's a good hint. I didn't know U-Boot supports network access on the Pi (I assumed the Pi's network interface is connected through USB, so U-Boot would have to incorporate at least a rudimentary USB stack). Will check this out.
The building process also creates a file named "bootstrap_hello", but that is simply a copy of bootstrap_hello.elf. What's the point of this?
Adding .elf is just a hint that this is an ELF file.
I was just wondering why to create two files when they have identical content:
$ cmp -l bootstrap_hello bootstrap_hello.elf
There's also 'make rawimage' generating *.raw and 'make uimage' generating *.uimage formats (also see 'make help' for list of targets).
rawimage would be appropriate for the Pi loader and looking at the boot code I can see that it copies itself from wherever it has been loaded (e.g. 0x8000) to the address it has been built for (0x1000000).
So I agree that it *should* work. Unfortunately, it doesn't work for me. I'll try to figure this out.
In the meantime: is anyone aware of any common pitfalls with the Raspberry Pi? Does anyone have access to a known-to-work config file? I'm attaching my globalconfig.out, maybe there is something obviously wrong with it.
Cheers
Robert
Hello Adam
Robert Kaiser wrote:
Just tried a rawimage -> still silence on the serial port :-(. It may be due to a configuration mistake. I guess I'll have to wade through the code to find out where it goes wrong -- stay tuned.
I've tracked it down a little further:
Booting a rawimage basically works, i.e. the bootloader copies itself to the right place in memory and starts executing. It gets to the point where it wants to print its first message -- and that's where it crashes :-(.
The crash happens during a division computed in fputs.c. The compiler calls a libgcc routine named __aeabi_uidiv which apparently throws an exception. (However it is *not* a divide by zero exception -- I have checked the values).
Time to try a different toolchain, I guess. So far, I have been using the linaro toolchain (gcc-linaro-arm-linux-gnueabihf-4.8-2013.08_linux). Can you please tell me which toolchain you have used (and where to obtain it)?
Cheers
Robert
Hello,
Robert Kaiser wrote:
Time to try a different toolchain, I guess. So far, I have been using the linaro toolchain (gcc-linaro-arm-linux-gnueabihf-4.8-2013.08_linux). Can you please tell me which toolchain you have used (and where to obtain it)?
OK, I read somewhere else that the Codesourcery toolchain is recommended. (You have to register with Mentor to obtain it, which is why I initially chose the linaro toolchain.) Anyway: I recompiled everything with arm-2013.05/arm-none-linux-gnueabi. This fixed the crash problem, so the linaro toolchain is definitely broken!
Unfortunately, it *still* doesn't work. The last messages I see trying to run the bootstrap_hello example are:
MOE: cmdline: moe --init=rom/hello
MOE: Starting: rom/hello
MOE: loading 'rom/hello'
L4Re: unhandled exception: pc=0xffffff9c
Any hints what could be wrong now?
Cheers
Robert
Hi Robert,
On Wed Sep 11, 2013 at 07:02:52 +0200, Robert Kaiser wrote:
OK, I read somewhere else that the Codesourcery toolchain is recommended. (You have to register with Mentor to obtain it, which is why I initially chose the linaro toolchain.) Anyway: I recompiled everything with arm-2013.05/arm-none-linux-gnueabi. This fixed the crash problem, so the linaro toolchain is definitely broken!
Ok, thanks for letting us know. Sometimes I also see somewhat strange behaviour in those toolchains but did not take the time to pinpoint anything. Definitely I used the 4.5-based codesourcery when I did the rpi. (But this definitely needs checking.)
Unfortunately, it *still* doesn't work. The last messages I see trying to run the bootstrap_hello example are:
MOE: cmdline: moe --init=rom/hello
MOE: Starting: rom/hello
MOE: loading 'rom/hello'
L4Re: unhandled exception: pc=0xffffff9c
Any hints what could be wrong now?
Would be interesting to know where this is coming from (lr). Anyway, this does not look so bad because quite a few things have happened again. For which architecture version have you been building?
Adam
Hi Adam
Adam Lackorzynski wrote:
Unfortunately, it *still* doesn't work. The last messages I see trying to run the bootstrap_hello example are:
MOE: cmdline: moe --init=rom/hello
MOE: Starting: rom/hello
MOE: loading 'rom/hello'
L4Re: unhandled exception: pc=0xffffff9c
Any hints what could be wrong now?
Would be interesting to know where this is coming from (lr). Anyway, this does not look so bad because quite a few things have happened again.
I agree. (My problem here is that I am only just learning how to use JDB.) With pagefault monitoring enabled, the last lines of output look like this:
...... pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007
pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807
pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
L4Re: unhandled exception: pc=0xffffff9c
Am I right to interpret this as "last pagefault occurred due to an opcode fetch at virtual address b000f070"? AFAIK, none of the modules in the image has its text segment in the b0000000 range, so this must be the unhandled exception L4Re complains about (but if so, why does it say pc=0xffffff9c?).
spc=0xf12e56fc would be the faulting thread's number, right?
Giving an "s" command, I get:
 1 f00567b8 [Task ] {KERNEL} R=2
 7 f12e5770 [Task ] {sigma0 } R=3
 9 f12e5720 [Task ] {moe } R=3
19 f12e56d0 [Task ] {hello } R=3
The thread number, f12e56fc, does not appear. It is closest to f12e56d0, but does that really mean the fault happened in the hello task?
Selecting the hello task with the cursor, I get:
 1 f00567b8 [Task ] {KERNEL} R=2
 7 f12e5770 [Task ] {sigma0 } R=3
 9 f12e5720 [Task ] {moe } R=3
Space 0xf12e56d0 (Kobject*)0xf12e56d0 } R=3
utcb area: user_va=0xb3000000 kernel_va=0xf11a2000 size=2000
mem usage: 235824 (230KB) of -1 (4194303KB) @0xf0056474
I would like to derive the program address where the fault occurs from this, but frankly, not being familiar with JDB I'm at a loss here.
JDB single-stepping does not seem to work on ARM platforms.
For which architecture version have you been building?
in fiasco: Broadcom 2835:
  CONFIG_PF_BCM2835=y
  CONFIG_ARM_1176=y
  CONFIG_ARM_V6=y
  CONFIG_ARM_V6PLUS=y
in L4Re: armv6:
  CONFIG_CPU_ARMV6=y
  CONFIG_CPU="armv6"
  CONFIG_CPU_ARMV6PLUS=y
Is that correct?
Thanks a lot for your help!
Cheers
Robert
On Thu Sep 12, 2013 at 17:13:07 +0200, Robert Kaiser wrote:
I agree. (My problem here is that I am only just learning how to use JDB.) With pagefault monitoring enabled, the last lines of output look like this:
...... pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007
pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807
pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
L4Re: unhandled exception: pc=0xffffff9c
Am I right to interpret this as "last pagefault occurred due to an opcode fetch at virtual address b000f070"?
Yes.
AFAIK, none of the modules in the image has its text segment in the b0000000 range, so this must be the unhandled exception L4Re complains about (but if so, why does it say pc=0xffffff9c?).
The 'l4re' binary is linked to b0000000, so the pagefault looks ok. It's your local region manager.
spc=0xf12e56fc would be the faulting thread's number, right?
That's the space aka task. 0x1d and 0x1a are the threads. Check with 'lp'.
Giving an "s" command, I get:
 1 f00567b8 [Task ] {KERNEL} R=2
 7 f12e5770 [Task ] {sigma0 } R=3
 9 f12e5720 [Task ] {moe } R=3
19 f12e56d0 [Task ] {hello } R=3
The thread number, f12e56fc, does not appear. It is closest to f12e56d0, but does that really mean the fault happened in the hello task?
It happened in the hello task because that output can only come from hello in your setup, and the thread numbers indicate that too.
I would like to derive the program address where the fault occurs from this, but frankly, not being familiar with JDB I'm at a loss here.
In 'lp', press enter on the 1d thread, that will give you the tcb view in which you can see the registers for example.
JDB Single stepping does not seem to work on ARM platforms.
Indeed that does not work.
For which architecture version have you been building?
Looks good.
The problem is in the kernel-provided code that uses instructions that are incompatible with rpi's CPU. I'll fix it.
Adam
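The page-fault trace lines discussed in this message have a regular shape, so they can be picked apart mechanically. What follows is a minimal Python sketch and not part of Fiasco or JDB. The function name `parse_pf_line` and the dictionary keys are illustrative labels only. The field meanings follow the reading given in this thread (first field is the thread, `spc` is the space aka task), and the rule "pfa equal to ip suggests an instruction fetch" is only a heuristic based on that reading.

```python
import re

# Pattern for lines such as:
#   pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007
PF_LINE = re.compile(
    r"pf:\s+(?P<thread>[0-9a-f]+)\s+"
    r"pfa=(?P<pfa>[0-9a-f]+)\s+"
    r"ip=(?P<ip>[0-9a-f]+)\s+"
    r"\((?P<access>[rw-]+)\)\s+"
    r"spc=0x(?P<space>[0-9a-f]+)\s+"
    r"err=(?P<err>[0-9a-f]+)"
)


def parse_pf_line(line):
    """Return the fields of one 'pf:' trace line as a dict, or None if no match."""
    m = PF_LINE.search(line)
    if m is None:
        return None
    pfa = int(m.group("pfa"), 16)
    ip = int(m.group("ip"), 16)
    return {
        "thread": int(m.group("thread"), 16),  # thread id, e.g. 0x1d or 0x1a
        "pfa": pfa,                            # page-fault address
        "ip": ip,                              # instruction pointer at the fault
        "access": m.group("access"),           # "r-" or "w-"
        "space": int(m.group("space"), 16),    # space aka task, not a thread
        "err": int(m.group("err"), 16),
        "looks_like_fetch": pfa == ip,         # heuristic, see lead-in
    }


if __name__ == "__main__":
    trace = [
        "pf: 001d pfa=010191a4 ip=0100a7c8 (r-) spc=0xf12e56fc err=410007",
        "pf: 001d pfa=000012e0 ip=0100a830 (w-) spc=0xf12e56fc err=410807",
        "pf: 001a pfa=b000f070 ip=b000f070 (r-) spc=0xf12e56fc err=330007",
    ]
    for line in trace:
        fields = parse_pf_line(line)
        print("thread=%#x pfa=%#010x fetch=%s"
              % (fields["thread"], fields["pfa"], fields["looks_like_fetch"]))
```

Applied to the three trace lines above, only the last one has pfa equal to ip, which matches the "opcode fetch at b000f070" interpretation.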
Hello Adam
thanks for your helpful response
On 09/15/13 16:11, Adam Lackorzynski wrote:
In 'lp', press enter on the 1d thread, that will give you the tcb view in which you can see the registers for example.
Ahaaa!
Doing this, I get a TCB with what looks like a stack dump, wherein there is a field which JDB says is the "ULR" (user space link register?). Its value is 0x100bb20. Disassembling the neighborhood of that location, I get:
....
0100bb0c bl
0100bb10 mvn ip, #127 ; 0x7f
0100bb14 str r8, [r0, #500]
0100bb18 mov r0, r8
0100bb1c blx ip
0100bb20 str r5, [r4, #544]
....
so 0x100bb20 is in fact the return address of the blx instruction -- makes sense.
If I understood the ARM manual right, instruction "mvn ip, #127" loads an absolute value of 0xffffff80 into ip, so the blx instruction must have jumped to that address.
Disassembling that address gives me:
ffffff80 push {r4, lr}
ffffff84 mrc 15, 0, r4, cr13, cr0, {2}
ffffff88 str r0, [r4, #4]
ffffff8c mov r2, #167; 0x10
ffffff90 str r2, [r4]
ffffff94 mov r3, #0 ; 0x0
ffffff98 movw r2, #63491 ; 0xf803
ffffff9c mov r0, #24,; 0x2
ffffffa0 movt r2, #65535540] ; 0xffff
.. and 0xffffff9c is in fact the address where the fault happened!
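The address arithmetic in this step can be checked with a few lines of Python. This is only a sketch of the reasoning above, assuming 32-bit registers and 4-byte ARM-state instructions; the helper names `mvn_imm` and `return_address_after_blx` are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def mvn_imm(imm):
    """MVN writes the bitwise complement of its operand to a 32-bit register."""
    return ~imm & MASK32


def return_address_after_blx(blx_addr):
    """In ARM state, BLX leaves the address of the following instruction in lr."""
    return (blx_addr + 4) & MASK32


entry = mvn_imm(127)                           # value loaded into ip
ulr = return_address_after_blx(0x0100BB1C)     # blx ip sits at 0100bb1c
fault_offset = 0xFFFFFF9C - entry              # reported pc relative to the entry

print(hex(entry), hex(ulr), fault_offset // 4)
```

The complement of 127 is 0xffffff80 and the return address is 0x0100bb20, which agrees with the ULR value shown by JDB. The reported pc lies 0x1c bytes past the entry point, i.e. at the eighth instruction word of that sequence.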
JDB Single stepping does not seem to work on ARM platforms.
Indeed that does not work.
Do breakpoints work?
For which architecture version have you been building?
Looks good.
The problem is in the kernel-provided code that uses instructions that are incompatible with rpi's CPU.
So that would be the instruction at 0xffffff9c, right?
ffffff9c mov r0, #24,; 0x2
This disassembly looks a little strange, maybe not only the CPU but also the disassembler is choking on this opcode.
Now, how do I find the place in the source code corresponding to this instruction?
(Disassembling fiasco.image doesn't help -- it ends long before that address)
I'll fix it.
I can't wait to see your fix! Please let me know ASAP. If you need any more input from my side, just tell me what to do.
Cheers
Robert
Hi,
On 16.09.2013 19:21, Robert Kaiser wrote:
I'll fix it.
I can't wait to see your fix! Please let me know ASAP. If you need any more input from my side, just tell me what to do.
Yay! Got it working! :-)
The offending instructions are movt and movw. The code in sys_call_page-arm.cpp constructs a syscall entry sequence which uses these instructions. (How can this ever have worked on the RPi?)
Anyway, here is my suggestion for a patch:
--- src/kernel/fiasco/src/kern/arm/sys_call_page-arm.cpp.orig 2013-09-17 19:10:18.773107154 +0200
+++ src/kernel/fiasco/src/kern/arm/sys_call_page-arm.cpp 2013-09-17 19:10:18.773107154 +0200
@@ -40,10 +40,20 @@
   sys_calls[offset++] = 0xe3a02010; // mov r2, #0x10 -> set tls opcode
   sys_calls[offset++] = 0xe5842000; // str r2, [r4]
   sys_calls[offset++] = 0xe3a03000; // mov r3, #0
+#ifdef CONFIG_ARM_1176
+  sys_calls[offset++] = 0xe3a02003; // mov r2, #3
+  sys_calls[offset++] = 0xe3a00002; // mov r0, #2
+  sys_calls[offset++] = 0xe38224ff; // orr r2, #0xff000000
+  sys_calls[offset++] = 0xe38004ff; // orr r0, #0xff000000
+  sys_calls[offset++] = 0xe38228ff; // orr r2, #0x00ff0000
+  sys_calls[offset++] = 0xe380073d; // orr r0, #0x00f40000
+  sys_calls[offset++] = 0xe3822b3e; // orr r2, #0x0000f800
+#else
   sys_calls[offset++] = 0xe30f2803; // movw r2, #0xf803
   sys_calls[offset++] = 0xe3a00002; // mov r0, #2
   sys_calls[offset++] = 0xe34f2fff; // movt r2, #0xffff
   sys_calls[offset++] = 0xe34f0ff4; // movt r0, #0xfff4
+#endif
   sys_calls[offset++] = 0xe1a0e00f; // mov lr, pc
   sys_calls[offset++] = 0xe3e0f00b; // mvn pc, #11
   sys_calls[offset++] = 0xe8bd8010; // pop {r4, pc}
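As a sanity check on the patch, the following Python sketch decodes the opcode words of both variants and confirms that they leave the same values in r0 and r2. It is an illustration only and not part of Fiasco: the helper names are made up, and the tiny interpreter handles just the four instruction forms that occur here (MOV and ORR with a rotated 8-bit immediate, MOVW, MOVT). MOVW and MOVT belong to ARMv6T2 and later, which would explain why the ARM1176 in the Pi rejects them.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def rotated_imm(word):
    """Immediate of MOV/ORR: 8-bit value rotated right by twice the 4-bit field."""
    return ror32(word & 0xFF, 2 * ((word >> 8) & 0xF))


def imm16(word):
    """Immediate of MOVW/MOVT: imm4 (bits 19..16) followed by imm12 (bits 11..0)."""
    return ((word >> 4) & 0xF000) | (word & 0xFFF)


def run(words):
    """Execute a list of opcode words and return the 16 register values."""
    regs = [0] * 16
    for w in words:
        op = (w >> 20) & 0xFF
        rd = (w >> 12) & 0xF
        rn = (w >> 16) & 0xF
        if op == 0x3A:        # mov rd, #imm
            regs[rd] = rotated_imm(w)
        elif op == 0x38:      # orr rd, rn, #imm
            regs[rd] = regs[rn] | rotated_imm(w)
        elif op == 0x30:      # movw rd, #imm16 (clears the upper half)
            regs[rd] = imm16(w)
        elif op == 0x34:      # movt rd, #imm16 (keeps the lower half)
            regs[rd] = (regs[rd] & 0xFFFF) | (imm16(w) << 16)
        else:
            raise ValueError("unsupported opcode word %#010x" % w)
    return regs


# Opcode words copied from the two branches of the patch.
ARM1176_SEQUENCE = [0xE3A02003, 0xE3A00002, 0xE38224FF, 0xE38004FF,
                    0xE38228FF, 0xE380073D, 0xE3822B3E]
MOVW_MOVT_SEQUENCE = [0xE30F2803, 0xE3A00002, 0xE34F2FFF, 0xE34F0FF4]

v6 = run(ARM1176_SEQUENCE)
v7 = run(MOVW_MOVT_SEQUENCE)
print(hex(v6[0]), hex(v6[2]))
print(hex(v7[0]), hex(v7[2]))
```

Both sequences end with r0 = 0xfff40002 and r2 = 0xfffff803, so the mov/orr variant is a drop-in replacement at the cost of three extra instruction words.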
With this patch applied, my Raspberry Pi now happily prints "Hello World!" (Strange how something as unspectacular as that can make someone really happy ;-))
Is the patch OK? If so: please apply.
Thanks, Adam, for your help! Without it, I would never have found this.
Cheers
Robert
On Tue Sep 17, 2013 at 19:33:00 +0200, Robert Kaiser wrote:
The offending instructions are movt and movw. The code in sys_call_page-arm.cpp constructs a syscall entry sequence which uses these instructions. (How can this ever have worked on the RPi?)
This code is new and has never worked on the rpi, so thanks for pointing that out.
Anyway, here is my suggestion for a patch:
I've done something similar in the meantime but wasn't so quick...
With this patch applied, my Raspberry Pi now happily prints "Hello World!" (Strange how something as unspectacular as that can make someone really happy ;-))
I know that feeling :)
Adam
l4-hackers@os.inf.tu-dresden.de