Hi Volkmar and Kevin,
Trying to get l4ka running under bochs would probably require slight modifications to the bochs sources, so that 4 MB pages (the PSE flag, bit 4 of CR4) are supported (please see the summary below).
------------------ BoSnip -------------------
// Only allow writes of 0 to CR4 for now.
// Writes to bits in CR4 should not be 1s as CPUID
// returns not-supported for all of these features.
------------------ EoSnip -------------------
So what happens is that we initialize 4MB pages (located in the first-level page directories). If bochs does not support 4MB pages, you are pretty much out of luck: we currently do not support a 4K-pages-only setup. The kernel's virtual view of physical memory is mapped as 4MB pages. Furthermore, RMGR and L4Linux assume 4MB superpages. So if you have absolutely no possibility of getting a normal Pentium system, we have to make some mid-major :) changes to the startup code.
I hope it won't be necessary to tweak the l4ka sources that much (at least not for the 4MB problem). Looking at the bochs sources, I think it would be _much_ easier to add support for the PSE bit (bit 4 of CR4) to bochs itself in two places, rather than modifying l4ka. This is what I've found out so far (sorry for the lengthy message):
---------- cut here ----------- cut here ------------- cut here --------------

Summary:
========

1. About superpages, PSE bit etc...
- - - - - - - - - - - - - - - - - -
Intel Architecture Software Developer's Manual
Volume 3: System Programming
Order-Nr: 243192
http://developer.intel.com/design/pentiumii/manuals/24319202.pdf
Section 3.6 Paging (Virtual Memory):
  PG-Flag  (Bit 31 of CR0)  enables paging translation,  >= 386
  PSE-Flag (Bit 4 of CR4)   4k or 4MB pages,             >= 586
  PAE-Flag (Bit 5 of CR4)   36Bit physical addr,         >= 686
Section 3.6.2.1 Linear Address Translation (4 kByte Pages)
Section 3.6.2.2 Linear Address Translation (4 MByte Pages)
Section 3.6.2.3 Mixing 4kByte and 4MByte pages
Section 3.8.2 Linear Address Translation With Extended Addressing Enabled (2 MByte or 4 MByte Pages).
====> PSE bit is only used while translating linear addresses to physical
====> addresses. Page directory entries and page table entries have a slightly
====> different format (?) when 4 MB pages are involved. See figures
====> 3-14, 3-15, 3-18, 3-19, 3-20, 3-21.

====> In bochs, this is mainly located in the file:
====> bochs-2000_0325a/cpu/paging.cc
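To make the difference between the two entry formats concrete, here is a minimal sketch in C of how a linear address splits up in the 4 kByte and 4 MByte cases. The helper names (dir_index, pde_is_4mb, ...) are made up for illustration and come from neither bochs nor l4ka; the bit positions follow the figures cited above (PS flag = bit 7 of the page directory entry, 4 MB page base = bits 31..22).

```c
#include <stdint.h>

#define CR4_PSE 0x00000010u  /* bit 4 of CR4: page size extensions */
#define PDE_PS  0x00000080u  /* bit 7 of a page directory entry: page size */

/* Index into the page directory: linear address bits 31..22. */
static uint32_t dir_index(uint32_t laddr)
{
    return laddr >> 22;
}

/* Index into a page table (4 kByte pages only): bits 21..12. */
static uint32_t table_index(uint32_t laddr)
{
    return (laddr >> 12) & 0x3FFu;
}

/* Nonzero if this PDE maps a 4 MB page directly.
 * The PS flag only counts while CR4.PSE is set. */
static int pde_is_4mb(uint32_t pde, uint32_t cr4)
{
    return (cr4 & CR4_PSE) && (pde & PDE_PS);
}

/* 4 MB page: PDE bits 31..22 give the base, the low 22 bits are the offset. */
static uint32_t phys_4mb(uint32_t pde, uint32_t laddr)
{
    return (pde & 0xFFC00000u) | (laddr & 0x003FFFFFu);
}

/* 4 kB page: PTE bits 31..12 give the base, the low 12 bits are the offset. */
static uint32_t phys_4kb(uint32_t pte, uint32_t laddr)
{
    return (pte & 0xFFFFF000u) | (laddr & 0x00000FFFu);
}
```

With 4 MB pages the second-level lookup disappears entirely, which is why only the translation code needs to know about the PSE bit.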
2. L4ka sources
- - - - - - - -

2.1: l4-ka/kernel/src/x86/init.c:init_paging()
..............................................
#if defined(CONFIG_ARCH_X86_I586) || defined(CONFIG_ARCH_X86_I686)
    /* Turn on super pages. */
    enable_super_pages();
#endif
====> I configured with CONFIG_ARCH_X86_I586 to avoid having to:
====>   #if defined(CONFIG_ARCH_X86_I686)
====>     enable_global_pages();
====>     setup_sysenter_msrs();
====>   #endif
2.2: l4-ka/kernel/include/x86/cpu.h:enable_super_pages()
........................................................

INLINE void enable_super_pages()
{
    __asm__ __volatile__ ("mov %%cr4, %%eax\n"
                          "orl $0x10, %%eax\n"
                          "mov %%eax, %%cr4\n"
                          :
                          :
                          : "eax");
}
====> Turning on bit 4 of CR4 (PSE) enables 4 MB pages.
====> Volkmar: In Section 3.6.2.2 (at the end), the note says
====> Volkmar: that the TLBs _must_ be flushed (invalidated) after setting
====> Volkmar: or clearing the PSE bit. Was that forgotten here?
====> This is the instruction that crashes bochs.
3. Bochs sources:
- - - - - - - - -

3.1 output of bochs.out logfile:
................................

WBINVD: (ignoring)
MOV_RdCd: read of CR4
MOV_CdRd: ignoring write to CR4 of 0x00000010
bochs: panic, MOV_CdRd: (CR4) write of 0x00000010
mov CR4, EAX
====> Bochs panic()s because it doesn't support CR4 bits yet (see below).
====> l4ka (enable_super_pages()) tries to set the PSE bit in CR4 because
====> it needs 4 MB superpages.
3.2 bochs-2000_0325a/cpu/proc_ctrl.cc:
......................................
This file contains the implementation of the x86-instructions. Let's look at the missing/buggy parts that were reported in bochs.out above:
3.2.1: WBINVD: (ignoring):
. . . . . . . . . . . . .

void BX_CPU_C::WBINVD(BxInstruction_t *i)
{
  bx_printf("WBINVD: (ignoring)\n");

#if BX_CPU_LEVEL >= 4
  invalidate_prefetch_q();

  if (BX_CPU_THIS_PTR cr0.pe) {
    if (CPL!=0) {
      bx_printf("WBINVD: CPL!=0\n");
      exception(BX_GP_EXCEPTION, 0, 0);
    }
  }
  BX_INSTR_CACHE_CNTRL(BX_INSTR_WBINVD);
#else
  UndefinedOpcode(i);
#endif
}
====> BX_CPU_LEVEL was defined to be 5, so invalidate_prefetch_q()
====> was called. WBINVD (writes back and invalidates the caches)
====> seems to be okay.
3.2.2: read of CR4:
. . . . . . . . . .

void BX_CPU_C::MOV_RdCd(BxInstruction_t *i)
{
  // mov control register data to register
  <snip>
    case 4: // CR4
#if BX_CPU_LEVEL == 3
      val_32 = 0;
      bx_printf("MOV_RdCd: read of CR4 causes #UD\n");
      UndefinedOpcode(i);
#else
      bx_printf("MOV_RdCd: read of CR4\n");
      val_32 = BX_CPU_THIS_PTR cr4;
#endif
      break;
  <snip>
}
====> This seems to be okay once more. The contents of the variable
====> cr4 can be read on this BX_CPU_LEVEL of 5.
3.2.3: ignoring write to CR4 of 0x00000010....., then panic()
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

void BX_CPU_C::MOV_CdRd(BxInstruction_t *i)
{
  // mov general register data to control register
  <snip>
    case 4: // CR4
#if BX_CPU_LEVEL == 3
      bx_panic("MOV_CdRd: write to CR4 of 0x%08x on 386\n", val_32);
      UndefinedOpcode(i);
#else
      // Protected mode: #GP(0) if attempt to write a 1 to
      // any reserved bit of CR4

      bx_printf("MOV_CdRd: ignoring write to CR4 of 0x%08x\n", val_32);
      if (val_32) {
        bx_panic("MOV_CdRd: (CR4) write of 0x%08x\n", val_32);
      }
      // Only allow writes of 0 to CR4 for now.
      // Writes to bits in CR4 should not be 1s as CPUID
      // returns not-supported for all of these features.
      BX_CPU_THIS_PTR cr4 = 0;
#endif
      break;
  <snip>
}
====> This has to be tweaked, perhaps by adding a call to a new
====> CR4_change() function (like CR3_change()), which can
====> then be added to paging.cc. If the change is minimal,
====> modifying the variable cr4 directly may be enough. Anyway,
====> in paging.cc, the translation functions must take into
====> account the PSE bit in cr4!!!
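As a rough idea of what such a CR4_change() could do, here is a minimal, self-contained sketch in plain C. It is not bochs code: struct cpu_sketch, cr4_change() and the tlb_flushes counter are hypothetical stand-ins for the real BX_CPU_C state and its TLB flush. It accepts only the PSE bit, reports a write to any other bit so that the caller can raise #GP(0), and flushes the TLB whenever PSE is toggled (in line with the manual note mentioned in 2.2). CPUID would presumably also have to report PSE as supported, otherwise the comment in the bochs source about CPUID no longer holds.

```c
#include <stdint.h>

#define CR4_PSE       0x00000010u  /* bit 4: page size extensions */
#define CR4_SUPPORTED CR4_PSE      /* only PSE is accepted in this sketch */

/* Hypothetical stand-in for the relevant CPU state; not the real BX_CPU_C. */
struct cpu_sketch {
    uint32_t cr4;
    unsigned tlb_flushes;  /* counts flushes so that the effect is observable */
};

/* Returns 0 on success, -1 if the caller should raise #GP(0). */
static int cr4_change(struct cpu_sketch *cpu, uint32_t val)
{
    if (val & ~CR4_SUPPORTED)
        return -1;               /* attempt to write a 1 to a reserved bit */

    if ((cpu->cr4 ^ val) & CR4_PSE)
        cpu->tlb_flushes++;      /* PSE toggled: cached translations are stale */

    cpu->cr4 = val;
    return 0;
}
```

In MOV_CdRd the existing panic would then be replaced by a call along these lines, with an exception raised on a -1 result.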
3.3 bochs-2000_0325a/cpu/paging.cc
..................................

// Translate a linear address to a physical address, for
// a data access (D)

Bit32u BX_CPU_C::dtranslate_linear(Bit32u laddress, unsigned pl, unsigned rw)
{
  <snip long code which uses cr0, cr3 but not cr4 (yet)>
}

// Translate a linear address to a physical address, for
// an instruction fetch access (I)

Bit32u BX_CPU_C::itranslate_linear(Bit32u laddress, unsigned pl)
{
  <snip long code which uses cr0, cr3 but not cr4 (yet)>
}
====> These functions will have to additionally take into
====> account the value in cr4 (e.g. PSE bit), so that
====> 4 MB pages are supported.

---------- cut here ----------- cut here ------------- cut here --------------
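For what it's worth, the PSE-aware part of the translation could look roughly like the following C sketch. It is only an illustration under several assumptions: translate_sketch(), read_phys_fn, toy_mem and toy_read() are invented names, guest physical memory is modelled by a small array, and the privilege checks, accessed/dirty bit updates and TLB handling that the real dtranslate_linear()/itranslate_linear() need are left out.

```c
#include <stdint.h>

#define CR4_PSE    0x00000010u  /* bit 4 of CR4 */
#define PG_PRESENT 0x00000001u  /* bit 0 of a PDE/PTE */
#define PDE_PS     0x00000080u  /* bit 7 of a PDE: 4 MB page */

/* Hypothetical callback: read one 32-bit word of guest physical memory. */
typedef uint32_t (*read_phys_fn)(uint32_t paddr);

/* Returns 0 and stores the result in *paddr, or -1 if an entry is
 * not present (where the real code would raise a page fault). */
static int translate_sketch(uint32_t cr3, uint32_t cr4, uint32_t laddr,
                            read_phys_fn read_phys, uint32_t *paddr)
{
    /* First level: page directory entry selected by bits 31..22. */
    uint32_t pde = read_phys((cr3 & 0xFFFFF000u) + ((laddr >> 22) << 2));
    if (!(pde & PG_PRESENT))
        return -1;

    /* New case: with CR4.PSE set, a PDE with PS=1 maps 4 MB directly. */
    if ((cr4 & CR4_PSE) && (pde & PDE_PS)) {
        *paddr = (pde & 0xFFC00000u) | (laddr & 0x003FFFFFu);
        return 0;
    }

    /* Existing case: second level, page table entry from bits 21..12. */
    uint32_t pte = read_phys((pde & 0xFFFFF000u) +
                             (((laddr >> 12) & 0x3FFu) << 2));
    if (!(pte & PG_PRESENT))
        return -1;

    *paddr = (pte & 0xFFFFF000u) | (laddr & 0x00000FFFu);
    return 0;
}

/* Toy guest memory for trying the sketch: physical 0x0000..0x2FFF. */
#define TOY_WORDS (3u * 1024u)
static uint32_t toy_mem[TOY_WORDS];

/* Reads outside the toy range return 0, i.e. a not-present entry. */
static uint32_t toy_read(uint32_t paddr)
{
    uint32_t idx = paddr >> 2;
    return idx < TOY_WORDS ? toy_mem[idx] : 0;
}
```

To try it, one would place a page directory at 0x1000 in toy_mem, point one entry at a page table and mark another with PS=1, and then call translate_sketch() with and without CR4_PSE in the cr4 argument.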
Thanks,
-Farid.