Booting L4Re on the CI20: Panic in sigma0

Sarah Hoffmann sarah.hoffmann at kernkonzept.com
Tue Jul 18 08:58:14 CEST 2017


Hi Paul,

On 07/18/2017 01:43 AM, Paul Boddie wrote:
> Well, I haven't really figured this out at all. I thought it might be useful 
> to investigate what the message actually represents. First of all, it 
> originates from the Dispatcher::dispatch method in...
> 
> pkg/l4re-core/l4re_kernel/server/src/dispatcher.cc
> 
> I think the "tag" breaks down into something like this:
> 
>   0xfffb1026 -> label=0xfffb (-5) -> L4_PROTO_EXCEPTION
>                 flags=0x1 -> L4_MSGTAG_TRANSFER_FPU
>                 items=0x00
>                 words=0x26 (38)
> 
> (Reference: pkg/l4re-core/l4sys/include/types.h)
> 
> The code performing this logging doesn't indicate what the result of the 
> message dispatch was, so I added a trace statement to see, which yielded this 
> "tag" information:
> 
>   0xfc170000 -> label=0xfc17 (-1001) -> L4_EMSGTOOSHORT
>                 flags=0x0
>                 items=0x00
>                 words=0x00

There is a bug in Fiasco where it sends the wrong message size.
Please apply the attached patch to Fiasco. Afterwards you should get
more useful error messages from your L4 applications when they throw
exceptions.
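
For reference, the tag breakdown quoted above can be reproduced on the
host with a few lines of Python. This is only a sketch: the function name
decode_msgtag is made up for illustration, and the field layout (label in
bits 16..31, flags in bits 12..15, items in bits 6..11, words in bits
0..5) is assumed from the breakdown in the quoted text for a 32-bit tag,
so please check it against pkg/l4re-core/l4sys/include/types.h.

```python
# Host-side sketch (not part of L4Re): split a raw 32-bit message tag
# into its fields. The bit layout is an assumption taken from the
# breakdown quoted in this thread.

def decode_msgtag(raw):
    """Return (label, flags, items, words) for a 32-bit message tag."""
    raw &= 0xFFFFFFFF
    label = raw >> 16
    if label & 0x8000:           # sign-extend the 16-bit label
        label -= 0x10000
    flags = (raw >> 12) & 0xF    # 0x1 here corresponds to 0x1000 in the raw tag
    items = (raw >> 6) & 0x3F
    words = raw & 0x3F
    return label, flags, items, words


if __name__ == "__main__":
    for raw in (0xfffb1026, 0xfc170000):
        label, flags, items, words = decode_msgtag(raw)
        print("0x%08x -> label=%d flags=0x%x items=%d words=%d"
              % (raw, label, flags, items, words))
```

With the two values from the log, the label comes out as -5 for the
incoming message and -1001 for the reply, matching the interpretation as
L4_PROTO_EXCEPTION and L4_EMSGTOOSHORT given above.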

> Attempting to determine the nature of the supposed exception, I managed to 
> discover that...
> 
> l4_utcb_exc_is_pf returns 1 (page fault)
> l4_utcb_exc_pfa returns 0x800d1308 (which is a kernel mode address on MIPS)
> 
> The program counter is given as 0x7000049c, with the exception cause being 
> decoded from 0x10 to be interpreted as an "exception code value" of 4 in the 
> CP0_CAUSE register (address error, load or instruction fetch).
> 
> I thought that enabling more logging in sigma0 might help, presuming that the 
> page fault would be propagated through the pager hierarchy. But changing 
> debug_ipc to 1 in...

An address error generally means that you are trying to access a bad
address (which would be the case with the PFA given above). This is
different from a normal page fault, which corresponds to TLB exceptions.
That is why sigma0 is not involved.
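
To make the distinction concrete, here is a small host-side sketch that
decodes the two values you reported. The helper names (exc_code,
mips32_segment) and the table are made up for illustration and are not
part of Fiasco or L4Re; the ExcCode field position (bits 2..6 of
CP0_CAUSE) and the segment boundaries are the standard MIPS32 ones.

```python
# Host-side sketch: decode a MIPS32 CP0_CAUSE value and classify a
# 32-bit virtual address by segment. Names here are illustrative only.

EXC_NAMES = {
    0: "Int",    # interrupt
    1: "Mod",    # TLB modification
    2: "TLBL",   # TLB miss/invalid on load or instruction fetch
    3: "TLBS",   # TLB miss/invalid on store
    4: "AdEL",   # address error on load or instruction fetch
    5: "AdES",   # address error on store
}


def exc_code(cause):
    """Extract the ExcCode field (bits 2..6) from CP0_CAUSE."""
    return (cause >> 2) & 0x1F


def mips32_segment(addr):
    """Name the MIPS32 segment a virtual address falls into."""
    addr &= 0xFFFFFFFF
    if addr < 0x80000000:
        return "kuseg"   # user-accessible, TLB-mapped
    if addr < 0xA0000000:
        return "kseg0"   # kernel only, unmapped, cached
    if addr < 0xC0000000:
        return "kseg1"   # kernel only, unmapped, uncached
    if addr < 0xE0000000:
        return "kseg2"   # kernel/supervisor, TLB-mapped
    return "kseg3"       # kernel only, TLB-mapped


if __name__ == "__main__":
    code = exc_code(0x10)
    print("ExcCode %d (%s)" % (code, EXC_NAMES.get(code, "other")))
    print("PFA 0x800d1308 is in", mips32_segment(0x800d1308))
    print("PC  0x7000049c is in", mips32_segment(0x7000049c))
```

A user-mode access to anything outside kuseg raises an address error
(ExcCode 4 or 5) rather than a TLB exception (ExcCode 2 or 3), which
fits the values you are seeing.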

Exceptions are sent directly to the exception handler, which in a
standard L4 application is the thread started first (the l4re-kernel
thread) or, if that one fails, the launcher (moe in your case).

> ...indicated that sigma0 is not involved when the above requests are made and 
> dispatched: there is a lot of logging from sigma0, but logging from Moe takes 
> over after a certain point. And I don't see any logging from Moe when these 
> page faults occur: they are described using the "L4Re[svr]" prefix, as shown 
> above.
> 
> So, I don't really have much more to go on, here. There's a chance that my 
> rdhwr instruction support introduced a bug, I suppose, even though I've read 
> through that code several times and can't see anything obviously wrong with 
> it. I do wonder whether the initialisation routine for other programs is 
> initialising the t9 register improperly, as noted previously.

The t9 issue is a likely cause. There are a couple of places where
.cpload is used, and that directive typically derives gp from the value
in t9.


> But then, I 
> don't understand why the erroneously-initialised program isn't just terminated 
> when its page fault can't be handled.

That is the standard behaviour and the attached patch hopefully brings
it back.

Kind regards

Sarah

> 
> Although I've seen a fair amount of the L4Re internals now, I don't think I 
> have any productive way of finding the problem here, unfortunately. I guess 
> this exercise has provided some way of getting a "tour" of the framework, and 
> maybe that will be useful in the future, but I had hoped that this board was 
> already supported to the point of already running the example programs.
> 
> Paul


-- 
Sarah Hoffmann, sarah.hoffmann at kernkonzept.com

Kernkonzept GmbH, Dresden, Germany
https://kernkonzept.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-msg-size.patch
Type: text/x-diff
Size: 301 bytes
Desc: not available
URL: <http://os.inf.tu-dresden.de/pipermail/l4-hackers/attachments/20170718/79f9856c/attachment-0001.patch>