Re: UX/RT - a Universally eXtensible Real-Time OS

29 Jan 2015

      More about UX/RT:

The most distinctive feature of UX/RT will be its nearly pure file-oriented
architecture. Under UX/RT, the only primitive APIs that won't operate on
files will be a few process-related ones like fork() and the threading API (of
course, direct access to L4 message passing will also be available, but no OS
services will use it outside the filesystem); even process memory like stack and
heap that is anonymous on other OSes will be in mapped memory-resident files on
UX/RT. The environment provided to applications will be a reasonably
standards-compliant Unix/POSIX environment with a few extensions (nearly all
the non-file-based "system calls" will be implemented as library functions that
access special files).

There will be no attempt to keep the root task "personality-neutral" (it will
implement a subset of the Unix API directly) and little support for coexisting
with other environments. There may be some limited support for running L4Linux
to support Linux-only drivers. However, at least on PCs, a lightweight hostless
Xen setup will be installed by default, and this will allow running Linux
driver domains on Xen, so UX/RT probably won't have to go out of its way to
support coexisting with L4Linux. There will however be support for
multiple "universes" that implement different variants of Unix-like environments
(kind of like some Unices of the late 80s and early 90s), although this will be
implemented using generic per-process binding of directories instead of any
special support in the root task. The default universe will be partially based
on BSD
commands and libraries (although some duplication of functionality in commands
will be removed and some commands will be split up). There will be a universe
that provides an environment that is binary compatible with Linux (which will be
easy enough to implement since UX/RT binaries will be built almost identically
to Linux ones, meaning that dynamically-linked binaries could be run by
providing a VDSO replacement that uses the UX/RT API, as opposed to trapping
Linux system calls). This would further reduce the need for co-hosting with
L4Linux. Also provided will be a universe that uses native libraries, but with
the default BSD-ish UX/RT base commands replaced with GNU ones.
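
To illustrate how universes could fall out of per-process directory bindings
alone, here is a minimal sketch in C. The binding table, the resolve_path()
function, and the /universe/... paths are all hypothetical names made up for
this example; the real lookup would live in the root task's VFS and may look
quite different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One per-process binding: paths under `from` are looked up under `to`.
 * (Hypothetical structure, for illustration only.) */
struct binding {
    const char *from;
    const char *to;
};

/* Rewrites `path` through the first binding whose `from` is a whole-component
 * prefix of it; paths that match no binding are copied unchanged.
 * Returns 0 on success, -1 if `out` is too small. */
int resolve_path(const struct binding *b, size_t nb,
                 const char *path, char *out, size_t outsz)
{
    int w;

    assert(out != NULL && outsz > 0);
    for (size_t i = 0; i < nb; i++) {
        size_t n = strlen(b[i].from);
        /* Match only at a component boundary, so "/bin" does not
         * capture "/binary". */
        if (strncmp(path, b[i].from, n) == 0 &&
            (path[n] == '/' || path[n] == '\0')) {
            w = snprintf(out, outsz, "%s%s", b[i].to, path + n);
            return (w >= 0 && (size_t)w < outsz) ? 0 : -1;
        }
    }
    w = snprintf(out, outsz, "%s", path);
    return (w >= 0 && (size_t)w < outsz) ? 0 : -1;
}
```

A process in the GNU universe would then carry a table like
{ "/bin", "/universe/gnu/bin" }, while a process in the default universe would
carry none, and neither case would need special handling in the root task.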

The root task will contain the process manager and VFS, as well as a few
built-in special filesystems (it will be similar in scope to QNX's process
manager, except that it will be a separate program, rather than colocated in
the microkernel, of course). The VFS will be somewhat simpler than that of a
typical Unix, and will function mostly as a name service and memory mapping
layer (it will have no concept of devices at all, and filesystems will be
mounted from "ports", which will be special files used as communication channels
between file servers and the root task, much like the ones in Plan 9; this
means that UX/RT will not have device numbers, on-disk device nodes, or
Hurd-style translators). File descriptors will be mapped directly onto sets of
L4 capabilities and IPC gates, and read() and write() will bypass the VFS layer
entirely in most cases, and will call L4 IPC APIs to interact directly with the
target server (or client) process. read() and write() will always preserve
message boundaries and will never combine or split calls (similarly to
SOCK_SEQPACKET sockets on Linux), although some servers (like the disk
filesystems and network stack) will rearrange or ignore message boundaries
internally. Basically, as far as the VFS layer is concerned, a file will be a
stream of packets rather than a stream of bytes that can be broken up in
arbitrary places.
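
The message-boundary behaviour described above can be observed today with a
SOCK_SEQPACKET socket pair on Linux. The sketch below is only an analogy using
the existing sockets API (seqpacket_demo() is a name made up for this example);
it says nothing about how UX/RT will implement read() and write() over L4 IPC.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes two messages to one end of a SOCK_SEQPACKET socket pair and reads
 * them back from the other end. On success returns 0 and stores the size of
 * each received message in sizes[0] and sizes[1]; returns -1 on error. */
int seqpacket_demo(ssize_t sizes[2])
{
    int fds[2];
    char buf[64];

    assert(sizes != NULL);
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds) < 0)
        return -1;

    /* Two separate write() calls produce two separate messages. */
    if (write(fds[0], "hello", 5) != 5 || write(fds[0], "world!", 6) != 6) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Each read() returns exactly one message, even though the buffer is
     * large enough to hold both of them. */
    sizes[0] = read(fds[1], buf, sizeof buf);
    sizes[1] = read(fds[1], buf, sizeof buf);

    close(fds[0]);
    close(fds[1]);
    return (sizes[0] < 0 || sizes[1] < 0) ? -1 : 0;
}
```

With a byte-stream transport (SOCK_STREAM or a pipe) the first read() could
return all 11 bytes at once; with SOCK_SEQPACKET the two reads return 5 and 6
bytes respectively.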

The other file system API functions (including mmap()) will call the root task.
There will be a tmpfs-like virtual memory filesystem built into the root task,
which will be used for both shared memory and for process-private memory
segments like the heap and stack (each process will get its own in-memory
filesystem in its /proc directory). Using mmap() on a file in an in-memory
filesystem with MAP_SHARED will map that file's pages directly into the
process's address space; mapping in-memory files with MAP_PRIVATE will create a
copy-on-write shadow file in the process's memory filesystem. Mapping
files that aren't in an in-memory filesystem will also create a shadow file that
is similar to the ones for private mappings of in-memory files, except that the
root task will use normal reads and writes to access the backing file.
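
For readers unfamiliar with the difference between the two mapping types, the
following sketch shows the standard POSIX semantics that the shadow-file scheme
is meant to provide. It uses an ordinary temporary file on an existing Unix,
not a UX/RT memory filesystem; mapping_demo() and struct mapping_result are
names invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

struct mapping_result {
    char file_after_private_write; /* file byte 0 after writing via MAP_PRIVATE */
    char file_after_shared_write;  /* file byte 0 after writing via MAP_SHARED */
    char private_view;             /* byte 0 as seen through the private mapping */
};

/* Maps one page of a scratch file twice (private and shared), writes through
 * each mapping, and records what the file itself contains afterwards.
 * Returns 0 on success, -1 on any error. */
int mapping_demo(struct mapping_result *out)
{
    char path[] = "/tmp/uxrt-map-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    char *priv = MAP_FAILED, *shared = MAP_FAILED;
    int fd, rc = -1;

    assert(out != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until the descriptor is closed */

    if (ftruncate(fd, page) < 0 || pwrite(fd, "a", 1, 0) != 1)
        goto done;
    priv = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    shared = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (priv == MAP_FAILED || shared == MAP_FAILED)
        goto done;

    /* A private write lands in a copy-on-write page; the file is untouched. */
    priv[0] = 'p';
    if (pread(fd, &out->file_after_private_write, 1, 0) != 1)
        goto done;

    /* A shared write modifies the file's own page. */
    shared[0] = 's';
    if (msync(shared, page, MS_SYNC) < 0)
        goto done;
    if (pread(fd, &out->file_after_shared_write, 1, 0) != 1)
        goto done;

    out->private_view = priv[0]; /* still the private copy */
    rc = 0;
done:
    if (priv != MAP_FAILED)
        munmap(priv, page);
    if (shared != MAP_FAILED)
        munmap(shared, page);
    close(fd);
    return rc;
}
```

In UX/RT terms, the copy-on-write page that receives the private write is what
would live in the shadow file in the process's memory filesystem.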

All of the low-level system call stuff will be in a library separate from libc,
which will translate basic Unix API functions into L4 messages. This library
will contain only low-level functions (read(), write(), and those implemented by
the root task and microkernel). The root task will use a permanent anonymous
file descriptor for APIs that don't operate on file descriptors. The default
libc will be forked from one of the BSDs (probably NetBSD); however, the Linux
compatibility environment will use glibc.

Similarly to Plan 9, there will be no ioctl() primitive that is implemented by
file servers, and out-of-band messages will be implemented with normal reads and
writes on separate files. However, there will be a pure library implementation
of ioctl() that will support some common types of device files, like terminals
(it will just use normal reads and writes; there will also be an ioctl_read()
function for servers that reads an ioctl request from an FD and splits it into
request type and payload and an ioctl_reply() function to send a reply).
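
As a rough sketch of how such a library-level ioctl() could travel over plain
reads and writes, the code below packs a request code and its payload into a
single message and splits it again on the server side. The wire format (a
32-bit request code in host byte order followed by the payload), the size
limit, and the sketch_ prefixed function names are assumptions made for this
example, not the planned UX/RT interface. It relies on the transport
delivering each write() as one message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define IOCTL_MSG_MAX 256 /* hypothetical upper bound on one request */

/* Client side: pack the request code and payload into one message and send
 * it with a single write(). Returns 0 on success, -1 on error. */
int sketch_ioctl_send(int fd, uint32_t request, const void *payload, size_t len)
{
    unsigned char msg[IOCTL_MSG_MAX];
    ssize_t n;

    if (len > sizeof msg - sizeof request)
        return -1;
    memcpy(msg, &request, sizeof request);
    if (len > 0)
        memcpy(msg + sizeof request, payload, len);
    n = write(fd, msg, sizeof request + len);
    return n == (ssize_t)(sizeof request + len) ? 0 : -1;
}

/* Server side: read one message and split it into request code and payload.
 * Returns the payload length, or -1 on error or if `cap` is too small. */
ssize_t sketch_ioctl_read(int fd, uint32_t *request, void *payload, size_t cap)
{
    unsigned char msg[IOCTL_MSG_MAX];
    ssize_t n;
    size_t len;

    assert(request != NULL);
    n = read(fd, msg, sizeof msg);
    if (n < (ssize_t)sizeof *request)
        return -1;
    len = (size_t)n - sizeof *request;
    if (len > cap)
        return -1;
    memcpy(request, msg, sizeof *request);
    if (len > 0)
        memcpy(payload, msg + sizeof *request, len);
    return (ssize_t)len;
}
```

A reply function would mirror sketch_ioctl_send(), writing a status code
followed by any result data back to the client.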

Security will be implemented by having a per-process list of "file capabilities"
that specify which files the process can access and what kind of access is
permitted (these "capabilities" won't get translated into kernel capabilities
until the file gets open()ed). It will be possible for a file capability to
apply to all files in a particular directory. It will also be possible for file
capabilities to use the normal Unix permission bits in the filesystem for a
particular file or directory rather than explicitly specifying permissions.
Checking of file capabilities and translation of them into kernel capabilities
will be handled by the VFS layer in the root task, and manipulation of them will
be done through special files in /proc (meaning it will be possible to control
which processes can manipulate each other's file capabilities). There will be
no privileged system calls, unlike in virtually all other Unix-like OSes; all
privileged APIs will be implemented using special files.
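
To make the file capability idea more concrete, here is a minimal sketch of
what a per-process capability list and the check done at open() time might
look like. The struct layout, the FCAP_ flags, and fcap_allows() are all
hypothetical; in particular, I am assuming here that a directory capability
covers everything beneath the directory, and that the access granted by the
Unix permission bits has already been computed by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { FCAP_READ = 1, FCAP_WRITE = 2, FCAP_EXEC = 4 };

/* One entry in a process's file capability list (hypothetical layout). */
struct file_cap {
    const char *path;          /* file or directory the capability names */
    bool        subtree;       /* true: also covers everything below path */
    bool        use_mode_bits; /* true: defer to the Unix permission bits */
    unsigned    access;        /* explicit FCAP_* mask, if not deferring */
};

/* Does this capability name `path`, either exactly or as a directory
 * that contains it? */
static bool fcap_matches(const struct file_cap *cap, const char *path)
{
    size_t n = strlen(cap->path);

    if (strncmp(cap->path, path, n) != 0)
        return false;
    if (path[n] == '\0')
        return true; /* exact match */
    /* Only match at a component boundary, so "/dev/tty1" does not
     * cover "/dev/tty10". */
    return cap->subtree &&
           (path[n] == '/' || (n > 0 && cap->path[n - 1] == '/'));
}

/* Returns true if some capability in the list grants all of `wanted` on
 * `path`. `mode_access` is the FCAP_* mask that the file's Unix permission
 * bits would grant to this process. */
bool fcap_allows(const struct file_cap *caps, size_t ncaps,
                 const char *path, unsigned wanted, unsigned mode_access)
{
    assert(path != NULL);
    for (size_t i = 0; i < ncaps; i++) {
        unsigned granted;

        if (!fcap_matches(&caps[i], path))
            continue;
        granted = caps[i].use_mode_bits ? mode_access : caps[i].access;
        if ((granted & wanted) == wanted)
            return true;
    }
    return false; /* no capability: the file is inaccessible */
}
```

Note that a path with no matching capability is simply denied, regardless of
its permission bits; only a successful check would lead to kernel capabilities
being handed out.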

I am planning to use a different boot process than that of L4Re or Genode.
It will be similar to QNX's boot process, but more flexible. All OS components
and configuration files required for booting, including the kernel, sigma0, and
proc (the root task) will be contained in a ROM filesystem image. There will be
an in-memory bootloader that reads a list of OS components from the image (the
image will contain some other files besides the ones in this list; only the
files required to get to the point where the image can actually be mounted will
be included in the list), sets up a modified version of a Multiboot table with
module addresses pointing at the modules in the boot image (this way, neither
the kernel nor the root task will have to be able to understand the filesystem
of the boot image), and sets up some initial page tables that map everything at
the correct virtual addresses. Read-only sections will be mapped in place, and
only writeable sections will ever be copied (although for the kernel, sigma0,
and the root task, which aren't going to be reused, it might make sense to map
everything in place except for the bss sections). I am planning to move the
option parsing, KIP initialization, and all the other L4-specific boot code into
the kernel so I can just use my own bootloader to boot it (my bootloader is
intended to be generic, although its variant of Multiboot has several
enhancements over the GRUB version and is designed for extensibility). Once the
root task is running, it will create a temporary filesystem from the Multiboot
module list and will run the init process (it will be like a cut-down System V
init). The init process will call the mount command to mount the real image
filesystem (init, mount, and the shared libraries they depend on will be
included in the boot module list), and after the image filesystem is mounted it
will run a set of more advanced service management daemons that manage the rest
of the boot process (probing for hardware, starting the required servers,
mounting the real disk- or network-based root filesystem, etc.).

The disk filesystem support and network stack will be based on the rump kernel
from NetBSD with some UX/RT-specific glue code added on top. For performance
reasons the disk drivers and disk filesystem drivers will run in one process
(there will probably be one process per host adapter); the network stack will
probably also run in a single process.

Initially I am just going to use Xorg (with a UX/RT-specific device file
transport) as the window system, but I will replace that with a lightweight
compositing window server, which will probably be partially based on Weston or
one of the other Wayland compositors, but with a completely different protocol
(which will use a server-allocated special file for each surface rather than
client-allocated anonymous memory) and with any server-side window management
ripped out (my window system will use a conventional reparenting window manager
like X does; the window server will only handle event delivery and
low-level graphics operations). Of course X will still be available
even after I have my own window server working (there will be a
rootless X server available). The default desktop environment will be
GNUstep-based.

What does everyone here think of my design? Would anyone here be interested in
contributing?

On 1/28/15, Nils Asmussen <nils@os.inf.tu-dresden.de> wrote:
...
> Hi,
> as you want to build a UNIX-like microkernel-based OS, it might be
> interesting for you to look at my OS, Escape [1]. It is some mixture
> of UNIX, Plan 9 and L4, I would say. It is not directly related
> though, but completely written from scratch and also UNIX-like at its
> heart in contrast to emulating that on top. However, since it does not
> consider real-time, you probably don't want to use it, but perhaps
> it's interesting to see how it works.

I downloaded an older version of your OS a while back. Didn't really
look all that closely at the source though. I'll have to look at it a
bit closer, although its architecture seems to be rather different
from how UX/RT will be.

Andrew Warkentin