On Thu Jul 20, 2017 at 22:10:48 +0200, Paul Boddie wrote:
On Wednesday 19. July 2017 19.40.23 Paul Boddie wrote:
It always seems to involve an address of 0x8, which seems rather bizarre. Again, I think I must be missing something fundamental and must only be seeing the consequences.
So, I adjusted the kernel code, putting back in a commented-out debugging statement found in the Thread::handle_page_fault method which looks like this (having changed some of the details):
  printf("Translation error ? %p\n"
         " is_kmem_page_fault ? %x\n"
         " is_sigma0 ? %x\n"
         " program counter: %p\n"
         " regs->ip(): %p\n"
         " page fault address: %p\n",
         (void *) PF::is_translation_error(error_code),
         !PF::is_translation_error(error_code) && mem_space()->is_sigma0(),
         Kmem::is_kmem_page_fault(pfa, error_code),
         (void *) pc, (void *) regs->ip(), (void *) pfa);
I also introduced a statement in Thread::handle_page_fault_pager as follows:
  printf("handle_page_fault_pager: pfa=" L4_PTR_FMT ", errorcode=" L4_PTR_FMT
         ", pc=%lx, bad_v_addr=%lx\n",
         pfa, error_code, regs()->ip(), regs()->bad_v_addr);
I then observe some strange behaviour:
Translation error ? 0x1
 is_kmem_page_fault ? 0
 is_sigma0 ? 0
 program counter: 0x80019c8c
 regs->ip(): 0x80019c8c
 page fault address: 0xc
 regs->bad_v_addr: 0xc
handle_page_fault_pager: pfa=0000000c, errorcode=00000009, pc=103502c, bad_v_addr=8cc4
L4Re[svr]: request: tag=0xfffe0002 proto=-2 obj=0x0
L4Re: page fault: 9 pc=103502c
L4Re[rm]: unhandled read page fault at 0x8 pc=0x103502c
In the above, the last three lines are normal debugging output. The (wrapped) line above those is from my statement in handle_page_fault_pager.
For some reason, the presumably correct bad_v_addr (bad virtual address, 0x8cc4) arising in the apparent initial page fault (at 0x0103502c) does not get propagated back to L4Re alongside the associated program counter value. Instead, 0x8 gets reported in the L4Re logging output.
While handling this page fault, there appears to be another page fault in the kernel (at 0x80019c8c). This latter fault can't be handled (as discussed below) and so the original exception is eventually exposed in L4Re with the confused mix of details noted above.
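As general background, and not a confirmed diagnosis of this fault: very low fault addresses such as 0xc or 0x8 commonly appear when a member is accessed through a null object pointer, because the address touched is then simply the member's offset within the object. The following is a minimal sketch of that arithmetic. The struct Example_object and the helper null_access_address are hypothetical names invented for illustration and are not taken from the kernel sources.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout for illustration only; not a kernel data structure.
// Fixed-width 32-bit fields make the offsets the same on every platform.
struct Example_object
{
  std::uint32_t first;   // offset 0x0
  std::uint32_t second;  // offset 0x4
  std::uint32_t third;   // offset 0x8
  std::uint32_t fourth;  // offset 0xc
};

// Address that a load of a member would touch if the object pointer
// were null: a base address of zero plus the member's offset.
constexpr std::size_t null_access_address(std::size_t member_offset)
{
  return 0 + member_offset;
}

// Reading 'third' or 'fourth' through a null pointer would fault at
// 0x8 and 0xc respectively.
static_assert(null_access_address(offsetof(Example_object, third)) == 0x8,
              "third member lies at offset 0x8");
static_assert(null_access_address(offsetof(Example_object, fourth)) == 0xc,
              "fourth member lies at offset 0xc");
```

If this pattern applied here, the two addresses would point at two adjacent word-sized members of some object reached through a null pointer, but nothing in the output above establishes that.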
The unlikely address of 0x8 reported by L4Re may be related to the kernel fault address of 0xc, which according to the above details occurs in the following code (found in Ram_quota::alloc):
That looks like you should use the patch in http://os.inf.tu-dresden.de/pipermail/l4-hackers/2017/008005.html
Adam