During my presentation of the design of the Hurd on L4 to the Dresden group, Lars Reuther asked me if I had considered Sawmill's dataspace model and, if so, why I had rejected it. My answer was that we want to be able to catch any errors at the time of data acquisition, and that relying on a file system server to provide a mapping to the data does not guarantee this: a malicious server can unmap mappings and refuse to remap them; or, if the server is shorter lived than the client, the data becomes inaccessible to the client when the server exits.
Lars asked for more details. Unfortunately, I failed to provide a coherent and complete argument: my problem was mainly that I came to this conclusion fairly early in the design process and had since forgotten too many details of Sawmill's architecture to reconstruct my argumentation. I've recently revisited some related issues and am now in a position to better answer Lars's question. My primary reference for the Sawmill framework is [1].
I think I may be able to best explain the problem with an illustration: consider a server providing access to a file system backed by a disk.
In the Sawmill dataspace model [1], the file system server is a dataspace manager which likely provides a dataspace for each file. When a task wants to use a file, it first identifies the dataspace associated with the file (e.g. gets a capability to the file from the DM) and attaches it to its address space (e.g. tells its pager to associate a portion of the VM with the capability). "After an attach, the region mapping forwards all page fault requests in that region to the dataspace manager. The dataspace manager resolves the faults with map or grant operations."
I understand this to mean that a client depends on a DM to:

 - provide mappings to data
 - provide resources backing the data
When a client requests some data from a dataspace, the DM provides a mapping to the client. The client can proceed to use the data; at any point, however, the server could cause the mapping to be unmapped and possibly render the data inaccessible. The implication is that the client must either trust the server to always provide the mapping or be prepared to recover should the data disappear. The latter approach can be simplified by making a physical copy of the data before committing to using it (which can be done by interposing a second DM between the DM and the client). General use of this tactic means that many cycles are spent copying bytes and that the amount of physical memory sharing in the system is reduced.
DMs appear to use their own resources to fetch and store data as well as to hold it (neither [1] nor [2] mentions any mechanism for the client to specify to the DM what memory or data space to read data into). I assume that the normal mode of operation is that a file system DM has a certain amount of physical memory available from a "physical memory" DM, which it uses to hold data from backing store. Once that memory is exhausted, it must choose some page to evict. There are several problems with this model: because DMs allocate resources on behalf of clients, resources are allocated with the priority of the DM, and resource accounting is extremely difficult. We know this from our experience with the Hurd on Mach. Moreover, the DM controls the paging policy, not the clients who are actively using the memory. To control the availability of memory, it would seem that a client would again have to copy the data.
  [ Physmem DM ]
        |
        v
    [ FS DM ]
        |
        v
   [ client ]
The framework that I have developed for the Hurd avoids these dependencies on file system servers. The physical memory manager ("physmem") is part of the TCB. physmem provides capabilities to so-called containers, which identify memory reservations (either specific, e.g. a specific set of frames, or general, e.g. a specific number of frames). Given a container capability, the holder can map the contents or logically copy the contents to or from a second container.
When a task wants to read data from a file, it passes a container capability to the file system server. The file system server stores the data in the container (if it has already read the data into memory, it can logically copy it). Then it returns to the client.
If the task wants a mapping to the data, it requests one from physmem. Thus, tasks on the Hurd do not depend on a file system server to provide mappings. After the read is complete, the task knows that the data is either available or not available. If the file system server exits, this does not affect the data that the client has.
      [ physmem ]
       /      \
      |_      _|
 [ client ]  [ FS ]
Because the client passes containers to the server, the server does not allocate memory to store the data on behalf of the client. (The server may need other resources such as CPU time, I/O bandwidth and state; however, we have other provisions for those.) Thus memory allocations are directly attributed to the client, which is vital to the correct functioning of accounting. Also, when the client has exhausted its memory quota, it must free some memory before it can allocate a container with the required reserve. Thus, clients fully control their own paging policy.
One of the goals of the Hurd framework is to minimize the number of dependencies that a client has on the behavior of servers. This is important to us because clients often interact with other users' servers, which may be malicious. Another goal is to more directly link consumers of resources with resource allocators. Our observation is that many applications are hurt by policies such as the eviction scheme imposed by the OS. Moreover, applications such as garbage collectors and multimedia applications could benefit from knowing how much real resource is actually available to them. (Applications should work in harmony with the mechanisms provided to them and the policies imposed on them; they should not have to work around them.)
Thanks. I have tried to be as concise as possible, which means I may have missed some details. I am particularly interested in the thoughts of the Sawmill and DROPS developers.
Neal
[1] http://l4ka.org/publications/2001/sawmill-framework.pdf
[2] http://os.inf.tu-dresden.de/l4env/doc/l4env-concept/l4env.pdf