Merging sigma0 and roottask

14 Mar 2013

Hello Fiasco.OC developers,

hereby, I'd like to propose a slight architectural change regarding the
roles of sigma0 and roottask in Fiasco.OC-based systems.

While using the kernel for Genode, we repeatedly encountered problems
related to sigma0, in particular the priority inversion problem we
reported last year
(http://os.inf.tu-dresden.de/pipermail/l4-hackers/2012/005348.html) and
a recent issue with inconsistent caching attributes on ARM, which led
to subtle memory corruption. Both problems were hard to debug and took
us a lot of time.

Once known, these issues can of course be dealt with. The priority
inversion problem could be solved by assigning the highest priority to
sigma0. The cache attribute problem can in principle be dealt with by
managing the cache flushing manually and making sure not to touch the
wrong cache lines. But this remains a minefield. Because roottask has
all memory mapped, dangling pointers may go unnoticed, yet produce
unwanted caching effects. In our experience, these kinds of problems
remain largely invisible until the system gets highly dynamic (e.g., if
the role of RAM changes at runtime between DMA buffers and normal
memory). But once they occur, they become a nuisance.

Without sigma0, our life would have been easier. With no sigma0 thread,
there wouldn't have been a priority inversion problem. And without the
sigma0 protocol that is unaware of caching attributes, we could easily
maintain the consistent use of those attributes among all processes
including roottask.

Besides hitting the issues mentioned above, we found that the use of
sigma0 implies two further problems. First, because roottask cannot
hand out memory before obtaining it from sigma0, all physical memory
must be mapped within roottask. So the virtual address space of roottask
limits the amount of physical memory usable in the system. And second,
because roottask must maintain all those mappings with maximum
privileges, a bug in roottask can silently corrupt arbitrary memory.

Motivated by these observations, I conducted an experiment: removing
sigma0 from the picture to see where this would lead us.

Kernel changes
--------------

The (preliminary) patch of the kernel and bootstrap is actually pretty
small:

https://github.com/nfeske/foc/commit/7599e863c2feb07a34b891499982f4ffb58ff3e...

I kept the term "sigma0" in place to keep the patch simple. In the
following, the terms "sigma0" and "roottask" always refer to the first
user-land process started by the kernel.

Originally, sigma0 was paged with one-to-one mappings by the kernel and
would use the normal map operation with a sigma0-virtual address as
source of the mapping. Here, the new solution differs in that all memory
mappings originating from sigma0 now come directly from the physical
address space. This requires one kernel-internal interface change
concerning the 'Mem_space::v_fabricate' function.

This function is used in two situations, map and unmap. When mapping, it
is used to determine the physical frame for the virtual address
specified as source for the mapping. For the new version of sigma0, this
makes no sense because the source address does not refer to sigma0's
virtual address space. When unmapping, however, this function is used to
look up the physical frame for the virtual page to unmap. In this case,
the argument refers to an actual sigma0-virtual address. Consequently,
the function cannot accommodate both use cases. Therefore, I introduced
a new 'v_fabricate_map_src' function that accompanies the 'v_fabricate'
function. For all processes other than sigma0, both functions do the same
thing. But for sigma0, the 'v_fabricate_map_src' function interprets the
address argument as a physical address. Because this function is used to
determine the mapping source, I have used the suffix "_map_src".
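
The split between the two lookups can be illustrated with a minimal
sketch in C++. This is a toy model under stated assumptions, not the
code from the patch: 'Toy_mem_space', its members, and the use of a
std::map as page table are hypothetical, and only the two function names
are taken from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Toy model only: none of this is Fiasco.OC code. 'Toy_mem_space' and
// its members are hypothetical names chosen for illustration.
using Addr = std::uint64_t;
constexpr Addr Page_size = 4096;
constexpr Addr page_base(Addr a) { return a & ~(Page_size - 1); }

struct Toy_mem_space
{
  bool is_sigma0 = false;
  std::map<Addr, Addr> page_table;  // virtual page -> physical frame

  // Install a page-aligned mapping in this address space.
  void insert(Addr virt, Addr frame)
  {
    assert(page_base(virt) == virt && page_base(frame) == frame);
    page_table[virt] = frame;
  }

  // Lookup used on unmap: the argument is always a virtual address,
  // even for sigma0.
  std::optional<Addr> v_fabricate(Addr virt) const
  {
    auto it = page_table.find(page_base(virt));
    if (it == page_table.end())
      return std::nullopt;
    return it->second;
  }

  // Lookup used for the source of a map operation: for sigma0, the
  // argument is taken as a physical address and no page table is
  // consulted. For everyone else it behaves like v_fabricate.
  std::optional<Addr> v_fabricate_map_src(Addr addr) const
  {
    if (is_sigma0)
      return page_base(addr);
    return v_fabricate(addr);
  }
};
```

In this model, a sigma0 space can name any physical frame as mapping
source without having it mapped, while an unmap still resolves the
address through sigma0's own page table.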

The second noteworthy change is the distinction between sigma0 threads
with a pager and those without a pager. The original version of sigma0
had no notion of a pager. There was only a single thread, paged by the
kernel. Now, if roottask is sigma0, there are several threads. Most of
them can be paged by a local pager in roottask. To distinguish both
cases, I needed to introduce an 'is_null' accessor function to the
'Context_ptr' class.
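
As a rough illustration of how such an accessor can separate the two
cases, consider the following sketch. It is a simplified model, not the
kernel's implementation: 'Toy_context_ptr' wraps a raw pointer purely
for brevity, and 'Toy_thread', 'Fault_handler', and 'fault_handler' are
hypothetical names.

```cpp
#include <cassert>

// Toy model, not the Fiasco.OC classes. All names are hypothetical.
struct Toy_thread;

class Toy_context_ptr
{
  Toy_thread *_t = nullptr;

public:
  Toy_context_ptr() = default;
  explicit Toy_context_ptr(Toy_thread *t) : _t(t) {}

  // True if no pager has been assigned.
  bool is_null() const { return _t == nullptr; }

  // Access the referenced thread; only valid if a pager is assigned.
  Toy_thread &deref() const
  {
    assert(!is_null());
    return *_t;
  }
};

struct Toy_thread
{
  bool in_sigma0_task = false;  // thread belongs to the first task
  Toy_context_ptr pager;        // null until a pager is assigned
};

enum class Fault_handler { Kernel, User_pager, Unhandled };

// Decide who resolves a page fault of thread 't'.
Fault_handler fault_handler(Toy_thread const &t)
{
  if (!t.pager.is_null())
    return Fault_handler::User_pager;  // paged by a local pager

  // No pager: only threads of the first task are paged by the kernel.
  return t.in_sigma0_task ? Fault_handler::Kernel
                          : Fault_handler::Unhandled;
}
```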

User-land changes
-----------------

The implications for Genode's version of roottask (called core) are more
substantial, but in very positive ways:

The initialization of core's allocators used to require an interplay
between core and sigma0. Because there is no longer a need to have all
memory mapped in core, we can simply drop this whole procedure and just
use the memory descriptors provided by the KIP.

After an initialization phase where core faults in its own image and the
KIP via the kernel, core drops its privileges by assigning a core-local
pager to all core threads. So any invalid access gets detected right
away. Browsing through the page table of core using the kernel debugger
is like visiting a desert. In contrast to the original version, it has
become easy to maintain an overview of the mappings within core.

The revocation of memory mappings used to rely on the in-kernel mapping
database. This won't work for core anymore because core does not
maintain mappings for the memory handed out to other processes. Instead,
core uses 'l4_task_unmap' to flush mappings from non-core processes as
needed. This is similar to how Genode works on OKL4. Still, non-core
processes may create further mappings, which are captured by the
in-kernel mapping database.
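
The bookkeeping this implies on core's side might look roughly like the
following sketch. It is a hypothetical model and not Genode's actual
data structure: 'Toy_mapping_registry', 'Flush_request', and the choice
of containers are assumptions. The model only computes which flushes
would be needed; in a real system, each returned entry would correspond
to one 'l4_task_unmap' call on the respective task.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Toy model of core-side bookkeeping. All names are hypothetical.
using Task_id = unsigned;
using Addr    = std::uint64_t;
constexpr Addr Page_size = 4096;

struct Flush_request
{
  Task_id task;  // task whose address space must be flushed
  Addr    virt;  // virtual page inside that task
};

class Toy_mapping_registry
{
  // physical frame -> places where core installed a mapping of it
  std::map<Addr, std::set<std::pair<Task_id, Addr>>> _installed;

public:
  // Remember that 'frame' was mapped to 'virt' in 'task'.
  void record(Addr frame, Task_id task, Addr virt)
  {
    assert(frame % Page_size == 0 && virt % Page_size == 0);
    _installed[frame].emplace(task, virt);
  }

  // Forget 'frame' and return the flushes needed to revoke it.
  std::vector<Flush_request> revoke(Addr frame)
  {
    std::vector<Flush_request> result;
    auto it = _installed.find(frame);
    if (it == _installed.end())
      return result;

    for (auto const &entry : it->second)
      result.push_back(Flush_request{entry.first, entry.second});

    _installed.erase(it);
    return result;
  }
};
```

Mappings that a non-core process derives from its own mapping need no
entry here, because the kernel's mapping database still tracks them.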

Current state and open questions
--------------------------------

With the current state of the implementation, the complete software
stack of Genode runs without sigma0. This includes L4Linux. So I am
pretty confident that the removal of sigma0 from the system does not
imply functional disadvantages.

That said, there are a few remaining questions that I'd like to discuss.

First, the sole use of 'l4_task_unmap' to remotely flush memory mappings
in other processes means that we must no longer provide the option of
granting memory mappings. Otherwise, a process that received a mapping
from roottask could "steal" the physical memory by granting it to
someone else. Roottask would not know about that, and an attempt to
flush the mappings in the original receiver of the mapping would just
target a hole in the address space. My question is:

  "Can we live without granting memory?"

  or the other way: "What is a known use case for granting memory?"
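
The "stealing" scenario can be reproduced in a small model. The sketch
below is an assumption-laden illustration, not kernel code: an address
space is reduced to a std::map from virtual page to physical frame, the
mapping database is left out on purpose, and 'toy_grant' and 'toy_unmap'
are hypothetical helpers that only mimic the move and flush semantics
described above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model: an address space is just virtual page -> physical frame.
// All names are hypothetical.
using Addr      = std::uint64_t;
using Toy_space = std::map<Addr, Addr>;

// Grant: the mapping moves from 'from' to 'to', so the sender loses it.
inline void toy_grant(Toy_space &from, Addr src, Toy_space &to, Addr dst)
{
  assert(&from != &to);
  auto it = from.find(src);
  if (it == from.end())
    return;  // nothing mapped at 'src', nothing to grant

  to[dst] = it->second;
  from.erase(it);
}

// Flush one page from a single address space, as core would do for the
// task it originally mapped the memory to. Returns true if a mapping
// was actually removed.
inline bool toy_unmap(Toy_space &space, Addr virt)
{
  return space.erase(virt) > 0;
}
```

If a client that received a frame from core at 0x1000 grants it to a
second task, a later 'toy_unmap(client, 0x1000)' removes nothing, and
the second task keeps the frame without core knowing about it.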

Second, the new situation triggers some code paths in the kernel that
were not used before. Apparently, the unmapping of memory from sigma0
was not considered. This is where I hit a few assertions in the kernel.
Right now, I have just worked around these assertions by commenting out
the offending code in 'kernel/fiasco/src/kern/map_util.cpp'. I
understand that this is just a stop-gap solution. Would you like to lend
a helping hand to find out...

  "How to support unmap from sigma0 in a clean way?"

I would like to keep Genode from diverging too much from the semantics
of the official Fiasco.OC kernel. Hence, I would appreciate your
consideration:

  "Would you like to follow a similar path with L4Re?"

That would be very nice. But even if this should not be the case, I
would find your rationale behind sticking with sigma0 very valuable to
know, e.g., for reconsidering my plan.

Regards
Norman

-- 
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth