Re: Booting L4Re on the CI20: Panic in sigma0

21 Jul 2017


On Thu Jul 20, 2017 at 22:10:48 +0200, Paul Boddie wrote:
...
On Wednesday 19. July 2017 19.40.23 Paul Boddie wrote:
...
It always seems to involve an address of 0x8, which seems rather bizarre.
Again, I think I must be missing something fundamental and must only be
seeing the consequences.
So, I adjusted the kernel code, putting back in a commented-out debugging 
statement found in the Thread::handle_page_fault method which looks like this 
(having changed some of the details):
printf("Translation error ? %p\n"
         "  is_kmem_page_fault ? %x\n"
         "  is_sigma0 ? %x\n"
         "  program counter: %p\n"
         "  regs->ip(): %p\n"
         "  page fault address: %p\n",
         (void *) PF::is_translation_error(error_code),
         Kmem::is_kmem_page_fault(pfa, error_code),
         !PF::is_translation_error(error_code) && mem_space()->is_sigma0(),
         (void *) pc,
         (void *) regs->ip(),
         (void *) pfa);
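
In a multi-line debugging statement like this one, the labels in the format string and the arguments that follow it are matched purely by position, so it is worth checking that they line up. The following is a minimal standalone sketch of the same layout, not kernel code: Fault_info and format_fault are hypothetical names introduced for illustration, the flags are printed with %d rather than %p, and the values are plain fields instead of the kernel's PF, Kmem and regs accessors.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical snapshot of the values the debugging statement reports.
// The field order deliberately mirrors the order of the labels below.
struct Fault_info
{
  bool translation_error;
  bool kmem_page_fault;
  bool sigma0;
  unsigned long pc;
  unsigned long pfa;
};

// Format the snapshot so that the n-th label is fed by the n-th argument.
std::string format_fault(const Fault_info &f)
{
  char buf[256];
  int n = std::snprintf(buf, sizeof(buf),
                        "Translation error ? %d\n"
                        "  is_kmem_page_fault ? %d\n"
                        "  is_sigma0 ? %d\n"
                        "  program counter: 0x%lx\n"
                        "  page fault address: 0x%lx\n",
                        f.translation_error,  // Translation error ?
                        f.kmem_page_fault,    // is_kmem_page_fault ?
                        f.sigma0,             // is_sigma0 ?
                        f.pc,                 // program counter
                        f.pfa);               // page fault address
  // The output must have been produced and must fit in the buffer.
  assert(n > 0 && n < static_cast<int>(sizeof(buf)));
  return std::string(buf);
}
```

Calling format_fault with a translation error, both other flags clear, a program counter of 0x80019c8c and a fault address of 0xc gives the same labels and values as the kernel output quoted in this message, apart from the first flag being printed as 1 instead of 0x1.
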
I also introduced a statement in Thread::handle_page_fault_pager as follows:
printf("handle_page_fault_pager: pfa=" L4_PTR_FMT
         ", errorcode=" L4_PTR_FMT ", pc=%lx, bad_v_addr=%lx\n",
         pfa, error_code, regs()->ip(), regs()->bad_v_addr);
I then observe some strange behaviour:
Translation error ? 0x1
  is_kmem_page_fault ? 0
  is_sigma0 ? 0
  program counter: 0x80019c8c
  regs->ip(): 0x80019c8c
  page fault address: 0xc
  regs->bad_v_addr: 0xc
handle_page_fault_pager: pfa=0000000c, errorcode=00000009, pc=103502c, 
bad_v_addr=8cc4
L4Re[svr]: request: tag=0xfffe0002 proto=-2 obj=0x0
L4Re: page fault: 9 pc=103502c
L4Re[rm]: unhandled read page fault at 0x8 pc=0x103502c
In the above, the last three lines are normal debugging output. The (wrapped) 
line above those is from my statement in handle_page_fault_pager.
For some reason, the presumably correct bad_v_addr (bad virtual address, 
0x8cc4) arising in the apparent initial page fault (at 0x0103502c) does not 
get propagated back to L4Re alongside the associated program counter value. 
Instead, 0x8 gets reported in the L4Re logging output.
While handling this page fault, there appears to be another page fault in the 
kernel (at 0x80019c8c). This latter fault can't be handled (as discussed 
below) and so the original exception is eventually exposed in L4Re with the 
confused mix of details noted above.
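
One possible reading of the 0x8 in the L4Re output, offered as an assumption rather than something confirmed in this thread: if the fault word delivered to the pager carries access flags in its low three bits, and the region manager masks those bits off before logging, then a fault address of 0xc would be logged as 0x8. The sketch below only works through that arithmetic; kFlagBits and reported_address are hypothetical names and are not taken from the L4Re or Fiasco sources.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the low three bits of the fault word are reserved for
// access flags and are not part of the address that gets logged.
constexpr std::uintptr_t kFlagBits = 0x7;

// Address as a pager would log it after stripping the assumed flag bits.
constexpr std::uintptr_t reported_address(std::uintptr_t fault_word)
{
  return fault_word & ~kFlagBits;
}

// Work through the two addresses that appear in the debugging output.
inline void check_examples()
{
  // The kernel-side fault address of 0xc would be logged as 0x8.
  assert(reported_address(0xc) == 0x8);
  // The user-level bad_v_addr of 0x8cc4 would be logged as 0x8cc0.
  assert(reported_address(0x8cc4) == 0x8cc0);
}
```

Under that assumption, the logged 0x8 is consistent with the kernel-side address 0xc and not with the user-level bad_v_addr of 0x8cc4, which would have been logged as 0x8cc0.
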
The unlikely address of 0x8 reported by L4Re may be related to the kernel 
fault address of 0xc, which according to the above details occurs in the 
following code (found in Ram_quota::alloc):

It looks like you should apply the patch from
http://os.inf.tu-dresden.de/pipermail/l4-hackers/2017/008005.html


Adam

Adam Lackorzynski