Performance of L4

Adam Lackorzynski adam at
Mon Dec 28 23:54:09 CET 2015


On Fri Dec 25, 2015 at 19:43:13 +0100, Patrick Staeblein wrote:
> I am still writing my bachelor's thesis on the performance of
> L4/Fiasco and wanted to ask whether you can give me some hints and
> ideas. My goal was to find out how the time costs are divided between
> IPC, memory access and scheduling and, resulting from this, what
> properties programs should have to execute as fast as possible.
> To measure the IPC time I used the ipc_example.c file in the
> ./src/l4/pkg/examples/sys directory and added rdtsc (Read Time Stamp
> Counter) instructions before and after the IPC call, then calculated
> the difference between the two values. This was executed in a loop,
> and the resulting values were divided into ten equal-sized range
> classes and depicted as a histogram. The results showed the IPC time
> to be pretty stable, with almost all values falling into one class.

That's ok to do. Please make sure that you build Fiasco with inlining,
and disable debugging and assertions.
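
A minimal sketch of the measurement loop and the ten-class histogram
described above, not the code from ipc_example.c. The names read_tsc,
measure, histogram and op_under_test are made up for illustration;
op_under_test stands in for the IPC call. The rdtsc path assumes an x86
target with GCC or Clang; on other targets the sketch falls back to a
wall-clock timer so that it still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
/* Read the x86 time stamp counter. rdtsc is not serializing, so the
 * CPU may reorder it slightly relative to neighbouring instructions. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
#else
#include <time.h>
/* Fallback for non-x86 builds: nanoseconds instead of cycles. */
static inline uint64_t read_tsc(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

enum { NUM_CLASSES = 10 };

/* Hypothetical placeholder for the operation to time (the IPC call). */
static void op_under_test(void)
{
}

/* Time n calls of op() and store each delta in samples[]. */
static void measure(void (*op)(void), uint64_t *samples, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = read_tsc();
        op();
        samples[i] = read_tsc() - start;
    }
}

/* Sort the samples into NUM_CLASSES equal-width classes that span the
 * range from the smallest to the largest sample. */
static void histogram(const uint64_t *samples, size_t n,
                      size_t counts[NUM_CLASSES])
{
    assert(n > 0);

    uint64_t min = samples[0], max = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < min) min = samples[i];
        if (samples[i] > max) max = samples[i];
    }

    memset(counts, 0, NUM_CLASSES * sizeof counts[0]);

    /* The +1 keeps the largest sample inside the last class. */
    uint64_t width = (max - min) / NUM_CLASSES + 1;
    for (size_t i = 0; i < n; i++)
        counts[(samples[i] - min) / width]++;
}
```

To use it, call measure(op_under_test, samples, n) and then
histogram(samples, n, counts). Deriving the class width from the
observed minimum and maximum means a single outlier stretches all
classes; fixed class bounds may be preferable when comparing runs.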

> Does anyone have an idea what kind of measurement one could do for
> scheduling? I found the function l4_scheduler_idle_time(...), but I
> don't have an idea how to use it in a concrete program or experiment
> setup.

That call just returns the time a CPU is idling, i.e. it does not tell
you how long scheduling itself takes. This is a bit difficult to
measure. Scheduling, i.e. choosing a thread to run, is triggered either
by an interrupt or by IPC, so you do not get it alone. You could take
timestamps in the kernel directly (e.g. in Context::schedule()) to get
this block on its own.

> I also thought about
> repeatedly measuring the time it takes for one context-switch, but I
> don't know in which place exactly I would have to insert the
> rdtsc-instructions. 

The context switch is in Context::switch_cpu(), or more broadly, it is
called from Context::switch_exec_locked().
What you could also do at user level is to have two threads with the
same priority and sample the TSC in each. When there's a larger gap
between consecutive TSC readouts (>5ms), you've hit a context switch.
Correlating the gap with the other thread's timestamps gives you a
time for the switch. But be aware that this also measures the
interrupt processing that initiated the context switch.
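
A rough sketch of how the recorded timestamps from the two threads
might be analysed afterwards. It assumes each thread has filled an
ascending array of TSC readouts and that both threads ran on the same
CPU. The names struct gap, find_gaps and switch_estimate are
hypothetical, and taking "first sample of the other thread minus last
sample before the gap" is just one possible reading of the correlation
step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One interruption in a thread's stream of TSC samples. */
struct gap {
    uint64_t before;  /* last sample before the thread lost the CPU */
    uint64_t after;   /* first sample after it got the CPU back */
};

/* Scan ascending timestamps ts[0..n-1] and record every pair of
 * consecutive samples that is more than threshold cycles apart.
 * Returns the number of gaps stored in out (at most max_out). */
size_t find_gaps(const uint64_t *ts, size_t n, uint64_t threshold,
                 struct gap *out, size_t max_out)
{
    assert(ts != NULL && out != NULL);

    size_t found = 0;
    for (size_t i = 1; i < n && found < max_out; i++) {
        if (ts[i] - ts[i - 1] > threshold) {
            out[found].before = ts[i - 1];
            out[found].after = ts[i];
            found++;
        }
    }
    return found;
}

/* Look for the first sample of the other thread that falls inside the
 * gap. If there is one, store the distance from the start of the gap
 * in *cycles and return 1; otherwise return 0. The value is an upper
 * bound on the switch cost and includes the interrupt handling. */
int switch_estimate(const struct gap *g, const uint64_t *other, size_t n,
                    uint64_t *cycles)
{
    assert(g != NULL && other != NULL && cycles != NULL);

    for (size_t i = 0; i < n; i++) {
        if (other[i] > g->before && other[i] < g->after) {
            *cycles = other[i] - g->before;
            return 1;
        }
    }
    return 0;
}
```

The threshold is given in TSC ticks, so the 5 ms from above has to be
converted with the TSC frequency of the machine. Assuming, for example,
a 2 GHz TSC, 5 ms corresponds to 0.005 * 2,000,000,000 = 10,000,000
ticks.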

> Does anyone know which exact C-file would have to
> be modified for that? 

The locations to look at are in the context*.cpp files.

Adam                 adam at
