During my presentation of the design of the Hurd on L4 to the Dresden group, Lars Reuther asked me if I had considered Sawmill's dataspace model and, if so, why I had rejected it. My answer was that we want to be able to catch any errors at the time of data acquisition, and that relying on a file system server to provide a mapping to the data does not guarantee this: a malicious server can unmap mappings and refuse to remap them; or, if the server is shorter lived than the client, the data becomes inaccessible to the client when the server exits.
Lars asked for more details. Unfortunately, I failed to provide a coherent and complete argument: my problem was mainly that I came to this conclusion fairly early in the design process and had since forgotten too many details of Sawmill's architecture to reconstruct my argumentation. I've recently revisited some related issues and am now in a position to better answer Lars's question. My primary reference for the Sawmill framework is [1].
I think I may be able to best explain the problem with an illustration: consider a server providing access to a file system backed by a disk.
In the Sawmill dataspace model [1], the file system server is a dataspace manager which likely provides a dataspace for each file. When a task wants to use a file, it first identifies the dataspace associated with the file (e.g. gets a capability to the file from the DM) and attaches it to its address space (e.g. tells its pager to associate a portion of the VM with the capability). "After an attach, the region mapping forwards all page fault requests in that region to the dataspace manager. The dataspace manager resolves the faults with map or grant operations."
I understand this to mean that a client depends on a DM to:

 - provide mappings to data
 - provide resources backing the data
When a client requests some data from a dataspace, the DM provides a mapping to the client. The client can proceed to use the data; at any point, however, the server could cause the mapping to be unmapped and possibly render the data inaccessible. The implication is that the client must either trust the server to always provide the mapping or be prepared to recover should the data disappear. The latter approach can be simplified by making a physical copy of the data before committing to using it (which can be done by interposing a second DM between the DM and the client). General use of this tactic means that many cycles are spent copying bytes and that the amount of physical memory sharing in the system is reduced.
DMs appear to use their own resources to fetch and store data as well as to hold it (neither [1] nor [2] mentions any mechanism for the client to specify to the DM what memory or data space to read data into). I assume that the normal mode of operation is that a file system DM has a certain amount of physical memory available from a "physical memory" DM, which it uses to hold data from backing store. Once that memory is exhausted, it must choose some page to evict. There are several problems with this model: because DMs allocate resources on behalf of clients, resources are allocated with the priority of the DM, and resource accounting is extremely difficult. We know this from our experience with the Hurd on Mach. Moreover, the DM controls the paging policy, not the clients who are actively using the memory. To control the availability of memory, it would seem that a client would again have to copy the data.
  [ Physmem DM ]
        |
        v
    [ FS DM ]
        |
        v
   [ client ]
The framework that I have developed for the Hurd avoids these dependencies on file system servers. The physical memory manager ("physmem") is part of the TCB. physmem provides capabilities to so-called containers, which identify memory reservations (either specific, e.g. a specific set of frames, or general, e.g. a specific number of frames). Given a container capability, the holder can map the contents or logically copy the contents to or from a second container.
When a task wants to read data from a file, it passes a container capability to the file system server. The file system server stores the data in the container (if it has already read the data into memory, it can logically copy it). Then it returns to the client.
If the task wants a mapping to the data, it requests one from physmem. Thus, tasks on the Hurd do not depend on a file system server to provide mappings. After the read is complete, the task knows that the data is either available or not available. If the file system server exits, this does not affect the data that the client has.
      [ physmem ]
       /      \
      |_      _|
 [ client ]  [ FS ]
Because the client passes containers to the server, the server does not allocate memory to store the data on behalf of the client. (The server may need other resources such as CPU time, I/O bandwidth and state; however, we have other provisions for those.) Thus memory allocations are directly attributed to the client, which is vital to the correct functioning of accounting. Also, when the client has exhausted its memory quota, it must free some memory before it can allocate a container with the required reserve. Thus, clients fully control their own paging policy.
One of the goals of the Hurd framework is to minimize the number of dependencies that a client has on the behavior of servers. This is important to us because clients often interact with other users' servers, which may be malicious. Another goal is to more directly link consumers of resources with resource allocators. Our observation is that many applications are hurt by policies such as the eviction scheme imposed by the OS. Moreover, applications such as garbage collectors and multimedia applications could benefit from knowing how much real resource is actually available to them. (Applications should work in harmony with the mechanisms provided to them and the policies imposed on them; they should not have to work around them.)
Thanks. I have tried to be as concise as possible, which means I may have missed some details. I am particularly interested in the thoughts of the Sawmill and DROPS developers.
Neal
[1] http://l4ka.org/publications/2001/sawmill-framework.pdf
[2] http://os.inf.tu-dresden.de/l4env/doc/l4env-concept/l4env.pdf