L4Re Operating System Framework
Interface and Usage Documentation
All Data Structures Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Modules Pages
Kernel ABI

This section details the binary representation of the IPC interface of the kernel.

It accompanies the L4 Inter-Process Communication (IPC) section. The details presented here are usually not relevant when developing L4Re applications and can therefore be skipped by many readers.

Note
The kernel ABI is subject to change. Please use the API instead of relying on particular binary representations.

The following notation is used to indicate how particular data fields are used:

  • [in]: The kernel reads and interprets this field.
  • [out]: The kernel writes this field with information provided by the kernel.
  • [cpy]: The kernel copies this field from sender to receiver (without interpretation if [in] is not listed as well).

The above indications may be combined.

Capability selector and flags

See partner capability selector and IPC flags.

The kernel reads and interprets all the fields ([in]).

 MSB      12    11     10 [7]  4     3         2         1         0      bits
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│   [in]    │  [in]   │         │  [in]   │  [in]   │  [in]   │  [in]   │
│(see below)│ special │   SBZ   │  reply  │open wait│  recv   │  send   │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
┌───────────┬─────────┬─
│  cap idx  │    0    │  if special is 0
└───────────┴─────────┴─
┌───────────┬─────────┬─
│   1...1   │    1    │  if special is 1
└───────────┴─────────┴─

Label

See IPC label.

When IPC is sent via a thread capability, the label is copied to the receiver unchanged ([cpy]).

When IPC is sent via an IPC gate, the sent label is ignored and the kernel provides the bitwise OR (|) of the IPC gate label and the sender’s write and special permissions (see L4_CAP_FPAGE_W and L4_CAP_FPAGE_S) of the used capability ([out]):

 MSB               2         1                 0          bits
┌───────────────────┬─────────────────┬─────────────────┐
│       [out]       │      [out]      │      [out]      │
│       label       │ label | special │  label | write  │
│                   │          right  │          right  │
└───────────────────┴─────────────────┴─────────────────┘

Message tag

See IPC message tag. Note that, for a message tag returned by the kernel, if the error flag is set, all other contents of the message tag is undefined.

 MSB            16     15         14         13         12     11   [6]    6 5    [6]    0  bits
┌─────────────────┬──────────┬──────────┬──────────┬──────────┬─────────────┬─────────────┐
│      [cpy]      │  [out]   │          │ [in,cpy] │ [in,cpy] │  [in,cpy]   │  [in,cpy]   │
│     payload     │  error   │   SBZ    │ schedule │ transfer │    items    │    words    │
│                 │   flag   │          │   flag   │ FPU flag │             │             │
└─────────────────┴──────────┴──────────┴──────────┴──────────┴─────────────┴─────────────┘

Timeouts

See IPC timeouts and l4_timeout_t.

The kernel reads and interprets all the fields ([in]).

 31     [16]    16 15     [16]     0  bits
┌─────────────────┬─────────────────┐
│       [in]      │       [in]      │
│  send timeout   │ receive timeout │
└─────────────────┴─────────────────┘

A timeout has the following format. There are two special timeout values:

  • Zero timeout: Only bit 10 is set. See L4_IPC_TIMEOUT_0.
     15  [5]  11  10   9        [10]        0  bits
    ┌───────────┬─────┬──────────────────────┐
    │     0     │  1  │          0           │
    └───────────┴─────┴──────────────────────┘
    
  • Infinite timeout: All bits are unset. See L4_IPC_TIMEOUT_NEVER.
     15                [16]                 0  bits
    ┌────────────────────────────────────────┐
    │                   0                    │
    └────────────────────────────────────────┘
    
    Otherwise, the timeout is either relative or absolute, which is specified by bit 15.
  • Relative timeout: If bit 15 is unset, the timeout is mantissa * 2 ^ exponent micro seconds relative to the current time. The mantissa must not be zero:
      15   14  [5]  10 9        [10]        0  bits
    ┌─────┬───────────┬──────────────────────┐
    │  0  │ exponent  │     mantissa ≠ 0     │
    └─────┴───────────┴──────────────────────┘
    
  • Absolute timeout: If bit 15 is set, an absolute timeout is specified in the UTCB’s buffer registers starting at buf reg idx (the particular number of registers depends on the architecture; see l4_timeout_s):
      15   14       [9]       6 5    [6]    0  bits
    ┌─────┬────────────────────┬─────────────┐
    │  1  │         SBZ        │ buf reg idx │
    └─────┴────────────────────┴─────────────┘
    

User-level thread control block (UTCB)

See User-level thread control block (UTCB).

     l4_utcb_mr()                        l4_utcb_br()                    l4_utcb_tcr()
    l4_msg_regs_t                       l4_buf_regs_t                   l4_thread_regs_t
 ┌─────────────────┐ ┌────────┐ ┌───────────────────────────┐ ┌───────────────────────────────────┐
 0      [63]      62     63         64     65     [58]    122    123        124     125   [3]   128  words
┌───────────────────┬──────────┬──────────┬──────────────────┬──────────┬──────────┬───────────────┐
│                   │ [(out)]  │   [in]   │       [in]       │  [out]   │ [(out)]  │               │
│ message registers │   arch   │   BDR    │ buffer registers │  error   │   free   │ thread‐local  │
│       (MRs)       │ specific │          │      (BRs)       │   code   │  marker  │ storage (TLS) │
└───────────────────┴──────────┴──────────┴──────────────────┴──────────┴──────────┴───────────────┘
│                   └────────────────────┐
│0   [words]       [2 * items]         62│ words
┌───────────────┬───────────────┬────────┐
│     [cpy]     │   [in,out]    │        │
│   (untyped)   │    typed      │ unused │
│ message words │ message items │        │
└───────────────┴───────────────┴────────┘

Buffer descriptor register

See IPC Buffer Descriptor Register and l4_utcb_br()->bdr.

 MSB      25     24     23  [9]  15 14    [5]     10 9     [5]      5 4     [5]      0  bits
┌───────────┬──────────┬───────────┬────────────────┬────────────────┬────────────────┐
│    SBZ    │ inherit  │    SBZ    │ index of first │ index of first │ index of first │
│           │ FPU flag │           │ obj cap buffer │   io buffer    │ memory buffer  │
└───────────┴──────────┴───────────┴────────────────┴────────────────┴────────────────┘

Typed message items

The number of words in a typed message item varies depending on the particular kind of item. However, for the first word, the following properties are shared:

  • Void item: If all bits of the first word of a typed message item are zero, then it is a void item.
  • Non-void item: The first word of a non-void typed message item has the following binary layout:
     MSB                        4  3  2      0  bits
    ┌────────────────────────────┬───┬────────┐
    │                            │ t │        │
    └────────────────────────────┴───┴────────┘
    
    Bit 3 (t) is the type bit. If t is set, the item is a map item. Currently, map item is the only supported type. Hence, this bit must be set for all items except for void items.

There are three sub-types of typed message items: send items, receive items, and return items; see Message Items.

Many typed items make use of flexpages, therefore, these are described before the various kinds of typed items. Note that flexpages are also used outside of typed message items, e.g., for L4::Task::unmap().

Flexpages

A flexpage consists of a single word and, except for some special values, describes a range in an address space, see flex pages.

The general layout is defined as follows:

 MSB                               6 5  [2]  4 3    [4]    0  bits
┌───────────────────────────────────┬─────────┬─────────────┐
│                                   │   type  │             │
└───────────────────────────────────┴─────────┴─────────────┘

The type L4_FPAGE_SPECIAL only supports some selected values, which are only supported for selected interfaces; see L4_FPAGE_SPECIAL.

The other types share the same layout:

 MSB            12 11     [6]      6 5  [2]  4 3    [4]    0  bits
┌─────────────────┬─────────────────┬─────────┬─────────────┐
│      start      │      order      │   type  │ permissions │
└─────────────────┴─────────────────┴─────────┴─────────────┘

Also see l4_fpage() (memory space), l4_iofpage() (I/O port space) and l4_fpage_obj() (object space).

Send items

A send item consists of two words. The second word of a non-void send item is a flexpage. The type of the flexpage determines the interpretation of the attr bits in the first word (see below).

If not void, the layout of the first word is defined as follows:

                      first word                         second word
 MSB     12 11  8 7    4  3    2      1        0                        bits
┌──────────┬─────┬──────┬───┬─────┬───────┬──────────┐┌───────────────┐
│ hot_spot │ SBZ │ attr │ 1 │ SBZ │ grant │ compound ││ send flexpage │
└──────────┴─────┴──────┴───┴─────┴───────┴──────────┘└───────────────┘

SBZ means “should be zero”.

For details, see IPC Message registers.

Receive items

A non-void receive item consists of up to three words.

If not void, the general layout of the first word is defined as follows:

 MSB                      4  3      2       1      0    bits
┌──────────────────────────┬───┬────────┬───────┬─────┐
│                          │ 1 │ rcv_id │ small │ fwd │
└──────────────────────────┴───┴────────┴───────┴─────┘

The small and fwd bits determine the details of the layout of the whole message item.

If small is unset, then also rcv_id must be unset, and the most significant bits should be zero:

┌──────────────────────────┬───┬────────┬───────┬─────┐
│   SBZ (should be zero)   │ 1 │   0    │   0   │ fwd │
└──────────────────────────┴───┴────────┴───────┴─────┘

If small is set, the most significant bits are layouted as follows:

 MSB        12 11         4  3      2       1      0    bits
┌─────────────┬────────────┬───┬────────┬───────┬─────┐
│ rcv cap idx │    SBZ     │ 1 │ rcv_id │   1   │ fwd │
└─────────────┴────────────┴───┴────────┴───────┴─────┘

At most one of rcv_id and fwd may be set.

The number and meaning of the words in the whole item are determined by the small and fwd bits:

     first word         second word           third word
 rcv_id small  fwd
─┬─────┬─────┬─────┐┌───────────────────┐
 │  0  │  0  │  0  ││   rcv flexpage    │
─┴─────┴─────┴─────┘└───────────────────┘            12 11  0
─┬─────┬─────┬─────┐┌───────────────────┐┌─────────────┬─────┐
 │  0  │  0  │  1  ││   rcv flexpage    ││ fwd cap idx │ SBZ │
─┴─────┴─────┴─────┘└───────────────────┘└─────────────┴─────┘
─┬─────┬─────┬─────┐
 │ 0/1 │  1  │  0  │
─┴─────┴─────┴─────┘            12 11  0
─┬─────┬─────┬─────┐┌─────────────┬─────┐
 │  0  │  1  │  1  ││ fwd cap idx │ SBZ │
─┴─────┴─────┴─────┘└─────────────┴─────┘

The meaning of the bits in detail:

Return items

A return item always consists of two words. The general layout of a non-void return item is defined as follows:

                  first word                       second word
 MSB     12 11    6 5    4  3  2        1  0                        bits
┌──────────┬───────┬──────┬───┬──────────┬───┐┌───────────────────┐
│ hot_spot │ order │ type │ 1 │ rcv_type │ c ││      payload      │
└──────────┴───────┴──────┴───┴──────────┴───┘└───────────────────┘
└──────────┘              └───┘          └───┘ from send item’s first word
           └──────────────┘                    from send item’s flexpage
                              └──────────┘     initially zero

As indicated above, the hot_spot, 1, and c (compound) are copied from the sender’s send item’s first word, and order and type are copied from the sender’s send item’s flexpage. The rcv_type and payload are determined by what is actually transferred, which is also affected by the rcv_id bit in the receiver’s receive item. The rcv_type determines the content and layout of the payload.

There are four cases for rcv_type:

00: Used if at least one mapping was actually transferred for the corresponding send item. The payload is undefined: (also see L4::Ipc::Snd_fpage::cap_received()):

┌──────────┬───────┬──────┬───┬──────────┬───┐┌───────────────────┐
│ hot_spot │ order │ type │ 1 │    00    │ c ││     undefined     │
└──────────┴───────┴──────┴───┴──────────┴───┘└───────────────────┘

01: Used if transfer of mappings was attempted, but actually nothing was transferred, because nothing was mapped on the sender’s side for the corresponding send item. The payload is undefined:

┌──────────┬───────┬──────┬───┬──────────┬───┐┌───────────────────┐
│ hot_spot │ order │ type │ 1 │    01    │ c ││     undefined     │
└──────────┴───────┴──────┴───┴──────────┴───┘└───────────────────┘

10: Used if the receive item’s rcv_id bit was set and the conditions for transferring an IPC gate label were fulfilled. In that case, no mapping is done for this item and the payload consists of the bitwise OR (|) of the IPC gate label and the write and special permissions (see L4_CAP_FPAGE_W and L4_CAP_FPAGE_S) that would have been mapped (also see L4::Ipc::Snd_fpage::id_received()):

                                                        2 1      0  bits
┌──────────┬───────┬──────┬───┬──────────┬───┐┌──────────┬────────┐
│ hot_spot │ order │ type │ 1 │    10    │ c ││   label  │ rights │
└──────────┴───────┴──────┴───┴──────────┴───┘└──────────┴────────┘

11: Used if the receive item’s rcv_id bit was set and the conditions for transferring the sender’s flexpage word were fulfilled. In that case, no mapping is done for this item and the payload is a copy of the sender’s flexpage word (also see L4::Ipc::Snd_fpage::local_id_received()):

┌──────────┬───────┬──────┬───┬──────────┬───┐┌───────────────────┐
│ hot_spot │ order │ type │ 1 │    11    │ c ││   send flexpage   │
└──────────┴───────┴──────┴───┴──────────┴───┘└───────────────────┘