Re: Question on "mappings as cache"

8 Dec 2003

      On Mon, 2003-12-08 at 09:30, Espen Skoglund wrote:
...
[Jonathan S Shapiro]
...
...
   A maps some region to B
   B completes the receive operation, and therefore
     now has a copy of the mapping
   B is immediately preempted, before it can do any user-level
     book keeping about the mapping
   ... other stuff runs ...
   kernel runs out of mapping cache space, chooses to evict
     the mapping just received by B
   ... other stuff runs ...
   B attempts to reference the region that it believes should
     be mapped, and page faults.
...
Can someone explain the process by which B is able to get the
mapping reconstructed?
A really quick answer:
   B's pager, Pb, receives the page fault
   Pb requests the mapping from A
Note that Pb and A could here be the same thread...
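
To make the scenario and the quick answer concrete, here is a minimal
toy model in Python. It is only a sketch under stated assumptions: the
names MappingCache, Mapper, Pager and touch are made up for
illustration and are not L4 interfaces, the "kernel" is just a bounded
dictionary, and eviction is oldest-first purely for simplicity.

```python
from collections import OrderedDict


class MappingCache:
    """Toy stand-in for a kernel mapping store with bounded capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # (task, page) -> frame

    def install(self, task, page, frame):
        # When full, drop the oldest entry. This is safe only because
        # every mapping can be reconstructed from its source.
        if (task, page) not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[(task, page)] = frame

    def lookup(self, task, page):
        return self.entries.get((task, page))


class Mapper:
    """Plays the role of A: the authoritative source of the mapping."""

    def __init__(self, frames):
        self.frames = frames  # page -> frame

    def map_to(self, cache, task, page):
        cache.install(task, page, self.frames[page])


class Pager:
    """Plays the role of Pb: resolves B's faults by asking A again."""

    def __init__(self, source):
        self.source = source
        self.faults_handled = 0

    def handle_fault(self, cache, task, page):
        self.faults_handled += 1
        self.source.map_to(cache, task, page)


def touch(cache, pager, task, page):
    """Reference a page; on a miss, fault to the pager and retry."""
    frame = cache.lookup(task, page)
    if frame is None:
        pager.handle_fault(cache, task, page)
        frame = cache.lookup(task, page)
    return frame


cache = MappingCache(capacity=1)
a = Mapper({0x1000: "frame-7"})
pb = Pager(a)

a.map_to(cache, "B", 0x1000)            # A maps some region to B
cache.install("C", 0x2000, "frame-9")   # other stuff runs; B's entry is evicted
result = touch(cache, pb, "B", 0x1000)  # B faults, Pb asks A, access is retried
```

In this model B never needs its own book keeping about the mapping:
the fault is routed to Pb, and Pb only has to know whom to ask.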
This makes sense to me, but it also seems to me that if A is a process
implementing the file server, and B has memory mapped a file from A,
then the current design requires Pb to act as an intermediary --
primarily for the purpose of normalizing file offsets and doing a little
bit of protocol translation.

Further, it seems to me that there is an interesting problem of
deceiting here, since the file server may not know that Pb and B are
equivalent for access control purposes.

Am I missing something that simplifies this scenario?
...
A longer answer would require a better understanding of our concept of
"data spaces", "data space managers", and "region maps" [1].  Here's a
rather shortish explanation of this scheme:
Data space: An unstructured data container, e.g., a file, anonymous
      memory, pinned memory, etc.
Data space manager: A server that manages accesses to a particular
      data space.  The data space manager will typically have parts
      (or the whole) of the data space mapped into its own address
      space.  It will map these parts off to clients.
Region map: A region map is a part of the client's address space
      that contains parts (or the whole) of a data space.  Note that
      the region map need not be fully populated.  If the client
      accesses a part of the region which is not mapped, a page fault
      will be generated.
Region mapper: The region mapper serves as the page fault handler
      for the threads within the client.  The region mapper keeps
      track of all region maps attached to the address space.  When
      the region mapper catches page faults it translates these page
      faults into requests that are forwarded to the respective data
      space manager.
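
One way to picture how these four terms relate is as plain data
structures. The following Python sketch is illustrative only: the class
and field names, the 4096-byte page size, and the example data space
names are assumptions, not part of any L4 interface. It shows the
region mapper using the faulting address to find the region map and
turning the fault into a (data space, offset) request for the right
data space manager.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size


@dataclass
class DataSpaceManager:
    """Server that manages accesses to one or more data spaces."""
    name: str
    requests: list = field(default_factory=list)

    def map_request(self, data_space, offset):
        # A real manager would map the page to the client here;
        # this sketch only records what was asked for.
        self.requests.append((data_space, offset))


@dataclass
class RegionMap:
    """Part of the client's address space backed by a data space."""
    base: int
    size: int
    data_space: str
    manager: DataSpaceManager

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


class RegionMapper:
    """Page fault handler that tracks all region maps of an address space."""

    def __init__(self):
        self.regions = []

    def attach(self, region):
        self.regions.append(region)

    def handle_fault(self, addr):
        for region in self.regions:
            if region.contains(addr):
                # Normalize the fault address to a page-aligned offset
                # within the data space before forwarding it.
                offset = (addr - region.base) // PAGE_SIZE * PAGE_SIZE
                region.manager.map_request(region.data_space, offset)
                return region.manager
        raise LookupError(f"no region map covers address {addr:#x}")


file_dm = DataSpaceManager("file server")
anon_dm = DataSpaceManager("anonymous memory")

rm = RegionMapper()
rm.attach(RegionMap(0x10000, 0x4000, "D-file", file_dm))
rm.attach(RegionMap(0x80000, 0x2000, "D-anon", anon_dm))

rm.handle_fault(0x11234)  # falls in the first region, second page
```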
Okay. This is roughly the model that I was reconstructing from first
principles. I will try to use these terms from here on to avoid
confusion.

Based on your description, I am now reasonably convinced that the L4
operations are individually faster, but that the collective end to end
protocol needed to resolve page faults when data spaces are involved may
be significantly more complicated in L4 than it is in EROS. I suspect
that the aggregate end to end path in L4 is likely to be *slower* than
in EROS; at best the two are going to be very similar.
...
For B to access parts of the data space, the following
steps would typically be taken (Rm = region mapper, Dm = data space
manager):
   1. Rm: Create region (R)
   2. Rm: Request data space manager (Dm) to attach a data space (D)
      to R.
   3. B: Touch some memory in R.  Nothing is mapped yet and a page
      fault is therefore raised.
   4. Rm: Receive page fault and use virtual address to identify
      region.
   5. Rm: Request Dm to map parts of the data space to R.
   6. Dm: Map parts of D to R.
An obvious optimization here is for Rm to request parts of the region
map to be pre-populated before step 3.
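
The six steps, and the effect of pre-populating, can be sketched as a
small Python simulation. This is a toy model under stated assumptions,
not a measurement of L4: the classes Dm, Rm and Region, the
prepopulate parameter, and the 4096-byte page size are all
hypothetical, Rm tracks a single region so that step 4 is trivial, and
every call from Rm to Dm is counted as one request.

```python
PAGE = 4096  # assumed page size


class Region:
    def __init__(self, base, pages):
        self.base = base
        self.pages = pages
        self.mapped = set()  # indices of pages currently mapped


class Dm:
    """Data space manager: maps parts of data space D on request."""

    def __init__(self):
        self.requests = 0

    def attach(self, region):
        self.requests += 1                    # step 2

    def map_pages(self, region, first, count):
        self.requests += 1                    # steps 5 and 6
        for index in range(first, first + count):
            region.mapped.add(index)


class Rm:
    """Region mapper: page fault handler for the client."""

    def __init__(self, dm):
        self.dm = dm
        self.faults = 0
        self.region = None

    def create_region(self, base, pages, prepopulate=0):
        self.region = Region(base, pages)     # step 1
        self.dm.attach(self.region)           # step 2
        if prepopulate:
            # The optimization: ask for pages before B touches them.
            self.dm.map_pages(self.region, 0, prepopulate)

    def page_fault(self, addr):
        self.faults += 1                      # step 4
        index = (addr - self.region.base) // PAGE
        self.dm.map_pages(self.region, index, 1)


def touch(rm, addr):
    """B touches memory in R; an unmapped page raises a fault (step 3)."""
    index = (addr - rm.region.base) // PAGE
    if index not in rm.region.mapped:
        rm.page_fault(addr)
    return index in rm.region.mapped


def run(prepopulate):
    rm = Rm(Dm())
    rm.create_region(base=0x40000, pages=4, prepopulate=prepopulate)
    for page in range(4):
        touch(rm, 0x40000 + page * PAGE)
    return rm.faults, rm.dm.requests


on_demand = run(prepopulate=0)   # (faults, requests to Dm)
prefilled = run(prepopulate=4)
```

In this model, touching four pages on demand costs four faults and five
requests to Dm, while pre-populating the whole region costs no faults
and two requests. The model says nothing about the relative cost of a
fault versus a request on real hardware.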
A better optimization might be to provide sufficient information to the
kernel so that it can more directly localize the correct fault handler.

Or perhaps the L4 design embeds a philosophical argument that resolving
these things at user level is (a) feasible and (b) likely as efficient
as any kernel implementation, and therefore should not be done in the
kernel? If so, I understand the philosophical point, but I am not sure
that I agree. In my mind, the answer depends on what gets the job done
best on an end to end basis.

Please note that I'm not advocating placing policy in the kernel here.
I'm wondering if there might be a better *mechanism* by which to express
the user-desired policy.
...
[Hmm... my "short" answer turned out to be a bit longer than
expected.]
Perhaps so, but it was VERY helpful!

shap

Jonathan S. Shapiro