Just a few clarifications, at least I hope I clarify things ;)
On Mon, 7 Dec 2003, Jonathan S. Shapiro wrote:
> So:
>
> The key differences between the L4 mapping state and the EROS-NG (next generation) mapping state may be described as follows:
>
> - In L4, the only important state is the state recorded in the mapping database. This state is a cache, and applications are required to be able to reconstruct their own mapping state on demand.
The mapping database is a cache only in the sense that at any time a higher-level pager can revoke a mapping from an application or from lower-level pagers. In the current model the kernel 'never' throws away mappings.
> Page fault handlers can be injected at any arbitrary GPT.
> There seem to be advantages and disadvantages to each design.
>
> - Cost of mapping:
>
> I believe the correct way to measure the cost of a map operation is to include all of the costs necessary to actually get a valid PTE into the recipient address space.
>
> - L4 Map
>
> The dominant cost in the L4 map operation is the cost to build the necessary mapping DB entries in the kernel. PTEs are copied aggressively, so there are usually no further hardware page faults needed to load them when the recipient starts running.
>
> In addition to the kernel-level map operation, the recipient task must record the incoming mapping in some per-task database. This database essentially duplicates the state in the kernel, though my guess is that it can be accomplished via a region-based recording scheme in the usual cases (that is: I record that X bytes from File Y got mapped starting at address A, and that faults in this region should be recovered by making a request to thread-id T). It is not clear to me what the practical overhead of this additional tracking is.
The mapping database is not per-task. There is a single mapping database in the kernel, and it consists of a mapping tree per physical frame of memory. In fact, in some of the implementations the mapping database nodes are all stored in page-table leaf nodes.
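To make the distinction concrete: the per-task record you guess at (which is separate from the kernel mapping database above) might be as small as a table of region records. This is only a toy sketch with invented names, not any real L4 or library interface:

```c
#include <stddef.h>

/* Hypothetical per-task record: "X bytes of file Y were mapped
 * starting at address A; faults in [A, A+X) are recovered by a
 * request to pager thread T."  All field names are illustrative. */
struct map_region {
    unsigned long base;      /* A: start address of the mapping */
    unsigned long length;    /* X: bytes mapped                 */
    int           file_id;   /* Y: backing object               */
    int           pager_tid; /* T: thread-id to ask on a fault  */
};

/* Linear lookup; a real task might keep regions sorted or in a
 * tree, but the point is that the per-task state is a handful of
 * (base, length, pager) records, not one entry per page. */
const struct map_region *
find_region(const struct map_region *tab, size_t n, unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= tab[i].base && addr - tab[i].base < tab[i].length)
            return &tab[i];
    return NULL; /* unmapped: no pager recorded for this address */
}
```

So the practical overhead of the duplicate tracking is plausibly a few records per mapped object plus one lookup per fault, rather than anything proportional to the number of pages.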
> Unless something has changed in a way that I have failed to understand, L4 does not provide a means to share page tables across more than one task.
L4 does not explicitly export any concept of sharing page-table sub-trees, but you can build this knowledge from the map/grant/unmap primitives. There have been proposals floated to make sharing explicit (i.e. the Link operation) and via mapping hints.
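The kernel-side structure behind map/grant/unmap, a mapping tree per physical frame with recursive revocation, can be modelled in miniature. This is a toy sketch of the idea, not the real mapping database layout:

```c
#include <stddef.h>

/* Toy model of the kernel's mapping database: one tree per physical
 * frame.  Each node records a mapping of that frame into some address
 * space; children are mappings derived from it via map().  Calling
 * unmap() on a node revokes the whole subtree below it, which is the
 * recursive revocation a higher-level pager relies on. */
struct mapping {
    int space;               /* address space holding this mapping   */
    int valid;               /* cleared once revoked                 */
    struct mapping *child;   /* first mapping derived from this one  */
    struct mapping *sibling; /* next mapping sharing the same parent */
};

/* Revoke this mapping and everything transitively derived from it. */
void unmap(struct mapping *m)
{
    if (m == NULL)
        return;
    m->valid = 0;
    for (struct mapping *c = m->child; c != NULL; c = c->sibling)
        unmap(c);
}
```

This also shows why the database is "a cache" only from the application's point of view: the kernel keeps the tree until some ancestor explicitly unmaps.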
> Random Comments:
>
> It appears to me that there are peculiar boundary conditions implicit in the L4 design: if the very last page of a process is paged out, I am not sure how its pager thread runs, because I do not understand what memory that pager thread references in order to initiate instructions or store temporary data. I suspect that the solution is that the pager thread in turn specifies a pager thread (which I will call the meta-pager), and the meta-pager arranges to page enough state back in that the pager thread can make progress.
The pager thread need not be in the same address space as the thread it is paging. So, as you point out, if all threads in an address space (bar one) use an address-space-local thread as their pager, then you could specify an address-space-external thread as that thread's pager.
> Probably this all seems perfectly obvious to people who are familiar with L4. I find it confusing that every process must have two threads or must delegate mapping management to a third-party task.
Pagers are not specified per address space; they are specified per thread. No third-party task would be required if, by some established protocol, the minimal set of pages an address-space-local pager needed were pinned in memory.
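The per-thread pager chain and the meta-pager escalation can be sketched as follows. Again this is a toy model with invented names, not the real thread control block or IPC path:

```c
#include <stddef.h>

/* Toy model of per-thread pagers.  A fault taken by a thread is sent
 * to its pager; if that pager's own pages are out, the pager faults
 * in turn and the fault escalates to the pager's pager (the
 * "meta-pager"), and so on up the chain. */
struct thread {
    int id;
    int resident;         /* 1 if this thread can run right now       */
    struct thread *pager; /* thread that handles this thread's faults */
};

/* Walk the pager chain until a runnable pager is found; returns NULL
 * if no runnable pager exists (the fault would be unresolvable). */
const struct thread *resolve_fault(const struct thread *t)
{
    const struct thread *p = t->pager;
    while (p != NULL && !p->resident)
        p = p->pager; /* escalate to the pager's pager */
    return p;
}
```

With the pinning protocol above, the local pager is always resident and the chain never needs to leave the address space; without it, the fault walks out to whichever external pager is still runnable.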
Cheers, Adam
--
Adam "WeirdArms" Wiggins        School of Computer Sci. & Eng.
PhD Student                     The University of NSW
Phone: +61 2 9385 7359          UNSW SYDNEY NSW 2052, Australia
Fax: +61 2 9385 7942            http://www.cse.unsw.edu.au/~awiggins