According to Gernot's "Inside L4/MIPS", Version 2.19 of January 30, 2001: https://www.ertos.nicta.com.au/publications/papers/Heiser:IL4.abstract.pml
Section 2.1.2 (page 7):
Time-slice donation can happen in one of two ways:
• explicitly via the thread switch system call. This call donates the remainder of the caller’s current time [...]
• implicitly via IPC. IPC operations are often accompanied by a context switch from the sender to the receiver, in which case the sender’s current time slice is implicitly donated to the receiver.
and
Section 5.2.3: " The other point to note is that, while the sender thread is put into the busy list to allow it to be scheduled again, the context switch to the receiver is actually performed without any scheduling (lazy scheduling [Lie93]). The receiver simply continues in the remainder of the sender’s time slice. This is an instance of time-slice donation in L4."
I assume that the cases described above still hold for most, if not all, current L4 implementation(s). The IPC time-slice donation is a classic optimisation, and ThreadSwitch() is part of the L4 specification.
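For concreteness, the two donation paths quoted above can be sketched as a toy model (plain Python with invented names; illustrative only, not kernel code):

```python
# Toy model (invented names, not kernel code) of the two donation paths
# described in Section 2.1.2 of Inside L4/MIPS.

class Thread:
    def __init__(self, name, prio, timeslice):
        self.name = name
        self.prio = prio
        self.remaining = timeslice   # time left in the current slice

def thread_switch(current, dest):
    """Explicit donation: dest runs on the remainder of current's slice."""
    return dest, current.remaining

def ipc_send(sender, receiver):
    """Implicit donation: switch context to the receiver without invoking
    the scheduler; the receiver continues on the sender's slice."""
    return receiver, sender.remaining

a = Thread("A", prio=5, timeslice=10)
a.remaining = 4                      # A has already used 6 units
b = Thread("B", prio=3, timeslice=10)

running, budget = thread_switch(a, b)
assert (running.name, budget) == ("B", 4)   # B runs on A's leftover time
```

In both paths the donated budget is the caller's leftover time, not a fresh slice.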
I have a few Qs:
1) Besides the ones above, are there in the current L4 implementations other instances of:
   1.1) time-slice donation?
   1.2) lazy scheduling?
2) If yes, do they affect, or are they affected in special ways by, [kernel] interrupt threads? (I think not, but with implementations/optimisations you never know...)
Thanks
Sergio
Note that the Inside L4 document refers to the old MIPS kernel, which is one implementation of (and extended) V2 API.
It also, like many other implementations, does time-slice donation incorrectly (i.e. blindly). Time-slice donation should respect priorities. I believe that this is fixed in NICTA::Pistachio-embedded, but the code is the ultimate reference ;-)
Gernot
On Fri, 20 Jan 2006 20:21:32 +1100, Sergio Ruocco sergio.ruocco@nicta.com.au said:
SR> [...]
On Fri, 20 Jan 2006 23:18:28 +1100 (EST) Gernot Heiser (GH) wrote:
GH> Note that the Inside L4 document refers to the old MIPS kernel, which
GH> is one implementation of (and extended) V2 API.
GH>
GH> It also, like many other implementations, does time-slice donation
GH> incorrectly (i.e. blindly). Time-slice donation should respect
GH> priorities. I believe that this is fixed in NICTA::Pistachio-embedded,
GH> but the code is the ultimate reference ;-)
Also note that time-slice donation on L4 typically does not extend across preemptions. A client that performs an RPC to a server donates its time quantum and priority to the server - this is achieved by switching only the execution context during the IPC and leaving the scheduling parameters of the client "in place". However, if the RPC is preempted by a higher-priority thread and is later resumed, the server will continue executing using its own time and priority instead of using the client's time and priority.
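The behaviour described here can be illustrated with a small model (invented names; a sketch of the typical behaviour being discussed, not of any particular kernel):

```python
# Sketch: after a preemption the server resumes on its OWN scheduling
# parameters, not on the client's donated ones.

def effective_params(server, donated_from, preempted_since_donation):
    """Return the (priority, budget) pair the server runs with."""
    if donated_from is not None and not preempted_since_donation:
        # Donation in effect: server runs on the client's slice.
        return donated_from
    # Donation lost: server falls back to its own priority/slice.
    return server

client = (5, 4)    # (priority, remaining budget) of the client
server = (2, 10)

# Before any preemption, the server executes on the client's parameters:
assert effective_params(server, client, preempted_since_donation=False) == (5, 4)
# After being preempted and resumed, it is back on its own parameters:
assert effective_params(server, client, preempted_since_donation=True) == (2, 10)
```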
To my knowledge no existing L4 implementation correctly resumes the original donation after a preemption. A solution has been described in [1], but we at TUD have not implemented it yet.
- Udo
[1] http://doi.ieeecomputersociety.org/10.1109/ECRTS.2005.16
On Fri, 20 Jan 2006 14:04:03 +0100, "Udo A. Steinberg" us15@os.inf.tu-dresden.de said:
UAS> [...]
UAS> Also note that time-slice donation on L4 typically does not extend across
UAS> preemptions. A client that performs an RPC to a server donates its time
UAS> quantum and priority to the server - this is achieved by switching only the
UAS> execution context during the IPC and leaving the scheduling parameters
UAS> of the client "in place". However, if the RPC is preempted by a higher-
UAS> priority thread and is later resumed, the server will continue executing
UAS> using its own time and priority instead of using the client's time and
UAS> priority.
UAS> To my knowledge no existing L4 implementation correctly resumes the original
UAS> donation after a preemption. A solution has been described in [1], but we at
UAS> TUD have not implemented it yet.
You imply that donation across preemptions is the "correct" behaviour. This is at least debated. In fact, the original motivation is not at all to donate a time slice, but to do lazy scheduling, i.e. do not invoke the scheduler if it isn't required(*). This is not proper time-slice donation (even though my old Inside-L4 document might be seen to imply that it is).
Gernot
(*) scheduler invocation isn't required because (assuming prios are handled correctly, which in many implementations they aren't) dispatching the receiver is the correct (and, in cases of ambiguity, the most efficient) thing to do. The drawback is that it makes CPU time accounting inaccurate.
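The condition under which the lazy switch is safe, as this footnote describes it, can be sketched as follows (invented model, not kernel code):

```python
# Sketch of when dispatching the receiver directly - without invoking the
# scheduler - is the correct scheduling decision: no ready thread may have
# a strictly higher priority than the receiver.

def lazy_switch_ok(receiver_prio, ready_prios):
    """Direct dispatch is valid iff the receiver is among the
    (equal-)highest-priority ready threads."""
    return all(receiver_prio >= p for p in ready_prios)

assert lazy_switch_ok(5, [3, 5, 2])       # receiver ties for highest: OK
assert not lazy_switch_ok(3, [3, 5, 2])   # a prio-5 thread should run instead
```

When prios are handled blindly, the kernel performs the switch even when this condition is false; that is the "incorrect (i.e. blindly)" behaviour Gernot mentions.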
Gernot Heiser wrote:
On Fri, 20 Jan 2006 14:04:03 +0100, "Udo A. Steinberg" us15@os.inf.tu-dresden.de said:
You imply that donation across preemptions is the "correct" behaviour. This is at least debated. In fact, the original motivation is not at all to donate a time slice, but to do lazy scheduling,
A different slant on 'proper' timeslice donation.
Situation:
Thread A has a 'very long' timeslice (set by L4_Schedule()), and is currently running on the CPU.
Thread B has a standard timeslice and is active (ready) but not running - not because it is blocked on IPC, but simply because it has a lower priority than A.
Thread A
   ...
   L4_ThreadSwitch( Thread B );
   ...
According to these L4 specifications:
http://l4hq.org/docs/manuals/l4-x2-20051021.pdf http://www.ertos.nicta.com.au/software/kenge/pistachio/latest/refman.pdf
--------------------------------------------------------
THREADSWITCH [Systemcall]
The invoking thread releases the processor (non-preemptively) so that
another ready thread can be processed.

Input Parameter

1) dest == nilthread
   Processing switches to an undefined ready thread which is selected
   by the scheduler. (It might be the invoking thread.) Since this is
   “ordinary” scheduling, the thread gets a new timeslice.

2) dest != nilthread
   If dest is ready, processing switches to this thread. In this
   “extraordinary” scheduling, the invoking thread donates its
   remaining timeslice to the destination thread. (This one gets the
   donation in addition to its ordinarily scheduled timeslices, if any.)
   ...
--------------------------------------------------------
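The two cases of the spec excerpt can be modelled as a toy function (invented names; illustrative only):

```python
# Toy model of the two THREADSWITCH cases quoted above.

NILTHREAD = None

def thread_switch(caller_remaining, dest, scheduler_pick, fresh_slice):
    if dest is NILTHREAD:
        # "Ordinary" scheduling: the scheduler picks a ready thread,
        # which receives a new timeslice.
        return scheduler_pick, fresh_slice
    # "Extraordinary" scheduling: dest gets the caller's remaining
    # timeslice, in addition to its ordinarily scheduled timeslices.
    return dest, caller_remaining

assert thread_switch(7, "B", "C", 10) == ("B", 7)         # donation case
assert thread_switch(7, NILTHREAD, "C", 10) == ("C", 10)  # fresh slice
```

The question below is whether real kernels honour the donation case for the full remaining timeslice.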
Since the correctness of some software I am writing may end up relying on the behaviour _specified_ at point 2) above, I have the following Q:

Do L4 implementations _really_ honour case 2)? In other words, will Thread B run as long as prescribed by Thread A's timeslice?
Local experts believe that current L4 implementation(s) do not conform to specification, and Thread B will run only until the next timertick.
Comments ?
Sergio
On Tue, 24 Jan 2006 16:00:24 +1100 Sergio Ruocco (SR) wrote:
SR> THREADSWITCH [Systemcall]
SR> The invoking thread releases the processor (non-preemptively) so that
SR> another ready thread can be processed.
SR>
SR> 2) dest != nilthread
SR> If dest is ready, processing switches to this thread.
SR> In this “extraordinary” scheduling, the invoking thread donates its
SR> remaining timeslice to the destination thread. (This one gets the
SR> donation in addition to its ordinarily scheduled timeslices, if any.)
SR>
SR> Since the correctness of some software I am writing may end up relying on
SR> the behaviour _specified_ at point 2) above, I have the following Q:
SR>
SR> Do L4 implementations _really_ honour case 2)? In other words, will
SR> Thread B run as long as prescribed by Thread A's timeslice?
SR>
SR> Local experts believe that current L4 implementation(s) do not conform
SR> to specification, and Thread B will run only until the next timertick.
Your local experts are correct. In the L4 implementations that I know of B will only run on A's time slice until a higher-priority thread preempts B. I think this is what you meant by "next timer tick" (but note that the next timer tick may wake up a lower-priority thread which does not preempt B - therefore, preemption is what ultimately breaks the donation, not the timer tick per se).
Btw., this is the exact same issue as the IPC donation we discussed a few days ago - the kernel does not maintain enough information to restore the donation to B when A's time slice is selected by the scheduler again.
- Udo
Udo A. Steinberg wrote:
On Tue, 24 Jan 2006 16:00:24 +1100 Sergio Ruocco (SR) wrote:
SR> Local experts believe that current L4 implementation(s) do not conform
SR> to specification, and Thread B will run only until the next timertick.
Your local experts are correct. In the L4 implementations that I know of B will only run on A's time slice until a higher-priority thread preempts B. I think this is what you meant by "next timer tick" (but note that the next timer tick may wake up a lower-priority thread which does not preempt B - therefore, preemption is what ultimately breaks the donation, not the timer tick per se).
Local experts said 'timertick', but they were busy with other things, so we may want to take a close look at the source and/or ask what they meant exactly. I also found the answer 'timertick' a bit strange, as I would expect donation to be terminated only by:
1) natural end of augmented timeslice
2) preemption by higher-priority thread
Btw., this is the exact same issue as the IPC donation we discussed a few days ago - the kernel does not maintain enough information to restore the donation to B when A's time slice is selected by the scheduler again.

- Udo
On the one hand, remember that according to the specification the part of the timeslice that A donated to B via L4_ThreadSwitch() *adds* to B's natural timeslice, so the kernel should have provision for this donation-preserving behaviour. Again, a close look at the source seems necessary.
On the other hand, as Gernot pointed out, A's timeslice donation to B when A IPCs to B is simply a *side* *effect* of lazy scheduling, a speed optimisation that avoids running the scheduler in the IPC path. Since the timeslice donation is a mere side-effect, actively carrying it over across preemptions is probably not the required behaviour.
Sergio
On Wed, 25 Jan 2006 00:55:27 +1100 Sergio Ruocco (SR) wrote:
SR> Local experts said 'timertick', but they were busy with other things, so
SR> we may want to take a close look at the source and/or ask what they
SR> meant exactly. I also found the answer 'timertick' a bit strange, as I
SR> would expect donation to be terminated only by:
SR>
SR> 1) natural end of augmented timeslice
SR> 2) preemption by higher-priority thread
Correct.
SR> On the one hand, remember that according to the specification the part
SR> of the timeslice that A donated to B via L4_ThreadSwitch() *adds* to B's
SR> natural timeslice, so the kernel should have provision for this
SR> donation-preserving behaviour. Again, a close look at the source seems
SR> necessary.
The specification is not very precise in defining the behavior. Imagine A having a high priority and B having a low one. Simply adding A's time quantum to that of B means you downgrade A's high-priority time to the low priority of B, which has different semantics than attaching A's scheduling parameters (e.g., time quantum coupled with priority) to B. In the latter case B can run with A's high priority until A's time quantum is depleted.
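The two readings Udo distinguishes can be contrasted in a small sketch (invented names; an illustration of the two semantics, not of any kernel's code):

```python
# Two possible semantics for "A donates its time to B":
# (1) add A's time quantum to B's slice -> that time runs at B's (low)
#     priority;
# (2) attach A's whole scheduling context (quantum coupled with priority)
#     to B -> B runs at A's priority until A's quantum is depleted.

def add_time(a, b):
    a_prio, a_quantum = a
    b_prio, b_quantum = b
    return (b_prio, b_quantum + a_quantum)   # A's time downgraded to B's prio

def attach_context(a, b):
    return a                                 # B runs on A's (prio, quantum)

A = (5, 4)    # high priority, 4 units left
B = (2, 10)   # low priority, 10 units

assert add_time(A, B) == (2, 14)        # 14 units, but all at priority 2
assert attach_context(A, B) == (5, 4)   # 4 units at priority 5, then B's own
```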
SR> On the other hand, as Gernot pointed out, A's timeslice donation to B
SR> when A IPCs to B is simply a *side* *effect* of lazy scheduling, a speed
SR> optimisation that avoids running the scheduler in the IPC path.
I think Jochen's idea of lazy scheduling was primarily focused on not having to manipulate the ready queue during IPC. What Gernot probably meant is that switching from A to B without also switching to B's scheduling parameters means you have to load less state from B, which improves performance.
SR> Since the timeslice donation is a mere side-effect, actively carrying
SR> it over across preemptions is probably not the required behaviour.
The "side effect" of L4 IPC only implements desired donation behavior in the case when a high-priority thread A calls a low-priority thread B. In that case B will be boosted to A's priority and priority inversion is avoided. However, if A has the lower priority and calls a high-priority thread B, then A effectively drags B down to its priority level, which is imho not desired. In that case the IPC implementation should switch to B's scheduling parameters or boost the priority of A's time quantum to the priority of B for the duration of the IPC.
- Udo
[Udo A Steinberg]
The "side effect" of L4 IPC only implements desired donation behavior in the case when a high-priority thread A calls a low-priority thread B. In that case B will be boosted to A's priority and priority inversion is avoided. However, if A has the lower priority and calls a high-priority thread B, then A effectively drags B down to its priority level, which is imho not desired. In that case the IPC implementation should switch to B's scheduling parameters or boost the priority of A's time quantum to the priority of B for the duration of the IPC.
This can be argued about. If high-priority thread B is in no way penalized by running on a donated lower-priority timeslice for a while, then it is not necessarily a bad thing. Once the timeslice expires or is preempted, and if thread B is now eligible to be scheduled, then it will continue to run on its own timeslice with its own priority.
Now for how things are implemented in Pistachio:
o If the send phase finished and there is *no receive phase*, the kernel will donate the timeslice to the destination thread *if* the destination runs on a higher priority. If the destination does not run on a higher priority, the current thread will continue executing its own timeslice.
o If the receive partner is not ready and there was a send phase (remember, send and receive partners may be different), we block and donate the timeslice to the send partner.
o When the receive IPC copy operation occurs we switch to the receive partner (the IPC copy operation always happens in the context of the sender) if it has a higher priority than the send partner. Otherwise we switch to the send partner. Again, timeslice donation occurs.
o After the receive phase has finished the scheduling is decided upon by the party doing the last send operation (see the two first bullets above).
Hope this clears up a few things regarding the Pistachio implementation. Note that this is pretty much just off the top of my head, though. There might be some slight differences in the actual implementation. Also note that other things such as delayed preemption also complicate these scheduling decisions.
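eSk's from-memory description of the Pistachio switching rules can be condensed into a sketch (invented function names; simplified, and subject to the same caveat that the actual implementation may differ):

```python
# Sketch of the Pistachio IPC switching rules described above.

def after_send_only(current_prio, dest_prio):
    """Send phase done, no receive phase: donate only if the destination
    runs on a higher priority."""
    return "dest" if dest_prio > current_prio else "current"

def after_send_receive_not_ready():
    """Receive partner not ready and there was a send phase: block and
    donate the timeslice to the send partner."""
    return "send_partner"

def at_receive_copy(send_partner_prio, receive_partner_prio):
    """The IPC copy runs in the sender's context; switch to whichever
    partner has the higher priority (donating the timeslice)."""
    return ("receive_partner" if receive_partner_prio > send_partner_prio
            else "send_partner")

assert after_send_only(3, 5) == "dest"        # higher-prio dest: donate
assert after_send_only(5, 3) == "current"     # else keep own timeslice
assert at_receive_copy(3, 5) == "receive_partner"
```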
eSk
On Tue, 24 Jan 2006 20:59:21 +0100 Espen Skoglund (ES) wrote:
ES> This can be argued about. If high-priority thread B is in no way
ES> penalized by running on a donated lower-priority timeslice for a
ES> while, then it is not necessarily a bad thing.
I consider B to be penalized if it can be preempted by a thread with a lower priority than that of B.
ES> Once the timeslice
ES> expires or is preempted, and if thread B is now eligible to be
ES> scheduled then it will continue to run on its own timeslice with its
ES> own priority.
That way B is not penalized (according to the definition above), but there is no clean accounting. Clearly B is handling a request on behalf of A - and initially the consumed time is accounted to A. After a preemption B continues to run with its own priority and consumes its own time. Thus, some or all of the time required to handle the request will be accounted to the client, some or none to the server. Also you can incur an unnecessary preemption of B (while it's running on A's priority) if a thread between A's and B's priority is released (admittedly, this is not a common case).
Imho you want to either account the consumed time completely to the client or completely to the server.
ES> Now for how things are implemented in Pistachio:
ES>
ES> o If the send phase finished and there is *no receive phase*, the kernel
ES>   will donate the timeslice to the destination thread *if* the
ES>   destination runs on a higher priority. If the destination does not
ES>   run on a higher priority, the current thread will continue executing
ES>   its own timeslice.
What is the motivation for donating to the destination thread of a send-only IPC if the destination has a higher priority? A pure send operation does not establish a dependency between both parties, i.e. the caller does not wait for the callee to reply.
ES> o If the receive partner is not ready and there was a send phase
ES>   (remember, send and receive partners may be different), we block and
ES>   donate the timeslice to the send partner.
The receive partner is the party to which you send or from which you receive? This is a bit confusing because during request and reply the sender and receiver roles of both parties switch.
- Udo
[Udo A Steinberg]
On Tue, 24 Jan 2006 20:59:21 +0100 Espen Skoglund (ES) wrote:
ES> This can be argued about. If high-priority thread B is in no way
ES> penalized by running on a donated lower-priority timeslice for a
ES> while, then it is not necessarily a bad thing.
I consider B to be penalized if it can be preempted by a thread with a lower priority than that of B.
Right. But if the timeslice donated by A is preempted by a thread with priority between A and B, then thread B will start its own timeslice, because it has the higher priority.
ES> Once the timeslice expires or is preempted, and if thread B is now
ES> eligible to be scheduled then it will continue to run on its own
ES> timeslice with its own priority.
That way B is not penalized (according to the definition above), but there is no clean accounting. Clearly B is handling a request on behalf of A - and initially the consumed time is accounted to A. After a preemption B continues to run with its own priority and consumes its own time. Thus, some or all of the time required to handle the request will be accounted to the client, some or none to the server. Also you can incur an unnecessary preemption of B (while it's running on A's priority) if a thread between A's and B's priority is released (admittedly, this is not a common case).
Imho you want to either account the consumed time completely to the client or completely to the server.
You're assuming that B is only performing work on behalf of A. This is not necessarily true. You're also moving into an area where you're not simply talking about simple timeslice donation anymore. This is all fine, but does require a complete redesign of how we do scheduling in L4.
ES> Now for how things are implemented in Pistachio:
ES>
ES> o If the send phase finished and there is *no receive phase*, the
ES>   kernel will donate the timeslice to the destination thread *if*
ES>   the destination runs on a higher priority. If the destination does
ES>   not run on a higher priority, the current thread will continue
ES>   executing its own timeslice.
What is the motivation for donating to the destination thread of a send-only IPC if the destination has a higher priority? A pure send operation does not establish a dependency between both parties, i.e. the caller does not wait for the callee to reply.
Because both the source thread and the destination thread are now runnable, and the destination thread should run because it has the higher priority. Sure, you may have wanted to skip the timeslice donation here, but we don't do that for reasons of efficiency. Also, if the destination does perform work on behalf of the sender, then timeslice donation is probably what you want anyway.
ES> o If the receive partner is not ready and there was a send phase
ES>   (remember, send and receive partners may be different), we block and
ES>   donate the timeslice to the send partner.
The receive partner is the party to which you send or from which you receive? This is a bit confusing because during request and reply the sender and receiver roles of both parties switch.
The receive partner is the thread which you receive from.
eSk
Udo A. Steinberg wrote:
The specification is not very precise in defining the behavior. Imagine A having a high priority and B having a low one. Simply adding A's time quantum to that of B means you downgrade A's high-priority time to the low priority of B, which has different semantics than attaching A's scheduling parameters (e.g., time quantum coupled with priority) to B. In the latter case B can run with A's high priority until A's time quantum is depleted.
Beware that Quantum and Timeslice are not the same thing. Quantum is the amount of execution time after which the thread scheduler receives an RPC. After the RPC the Quantum is re-initialised to the value of the thread's TimeSlice, but the thread scheduler is free to set it to any value, including ∞, independently of the timeslice.
So, to avoid confusion, let's keep Quantum outside of this discussion.
SR> Being the SR> timeslice donation a mere side-effect, actively carrying it over across SR> preemptions may probably not be the required behaviour.
The "side effect" of L4 IPC only implements desired donation behavior in the case when a high-priority thread A calls a low-priority thread B. In that case B will be boosted to A's priority and priority inversion is avoided.
No. I think that with IPC lazy scheduling B does not get a priority boost, but only gets to run for the remaining part of A's timeslice. This is not the same as B running with A's priority.
In fact, as soon as a thread C of intermediate priority between A and B is ready to run, it immediately preempts B, because B was running on A's timeslice, not on A's priority.
However, if A has the lower priority and calls a high-priority thread B, then A effectively drags B down to its priority level, which is imho not
Again, I think that priority is really not involved here.
desired. In that case the IPC implementation should switch to B's scheduling parameters or boost the priority of A's time quantum to the priority of B for
What do you mean by the "priority of A's time quantum"?
the duration of the IPC.
- Udo
If you mean to run B on A's scheduling parameters for the entire duration of the RPC call, I think that would go waaay past the intentions of the original IPC lazy scheduling, and seriously confuse scheduling.
Sergio
On Wed, 25 Jan 2006 12:01:22 +1100 Sergio Ruocco (SR) wrote:
SR> Beware that Quantum and Timeslice are not the same thing. Quantum is the
SR> amount of execution time after which the thread scheduler receives an
SR> RPC. After the RPC the Quantum is re-initialised to the value of the
SR> thread's TimeSlice, but the thread scheduler is free to set it to any
SR> value, including ∞, independently of the timeslice.
SR>
SR> So, to avoid confusion, let's keep Quantum outside of this discussion.
We are using conflicting terminology here. What I refer to as a time slice or scheduling context is a time quantum Q coupled with a priority P. A thread with such a time slice can run with priority P for Q units of time.
SR> No. I think that with IPC lazy scheduling B does not get a priority
SR> boost, but only gets to run for the remaining part of A's timeslice.
SR> This is not the same as B running with A's priority.
I guess it is implementation-dependent. In our Fiasco kernel there is a strict distinction between time slices and execution contexts. During an IPC with lazy scheduling the kernel switches execution contexts but not time slices. This means that both A's priority and time quantum remain in effect - and thus B runs with A's priority and consumes A's time quantum until the time quantum is depleted and a new time slice is selected by the scheduler.
SR> In fact, as soon as a thread C of intermediate priority between A and B
SR> is ready to run, it immediately preempts B, because B was running on A's
SR> timeslice, not on A's priority.
Not in our kernel. An intermediate thread C can be woken up but it cannot preempt B as long as A's time slice hasn't run out.
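The Fiasco behaviour Udo describes can be sketched as follows (invented model; the point is that preemption is decided against the priority of the current time slice, not of the thread executing on it):

```python
# Sketch: in the model described above, a waking thread preempts only if
# its priority exceeds the priority of the CURRENTLY ACTIVE TIME SLICE
# (here A's), regardless of which thread is executing on that slice.

def preempts(waking_prio, current_slice_prio):
    return waking_prio > current_slice_prio

A_PRIO, B_PRIO, C_PRIO = 8, 2, 5    # C's priority lies between A's and B's

# B executes on A's time slice; C wakes up:
assert not preempts(C_PRIO, A_PRIO)  # C cannot preempt B on A's slice
# Once A's slice is depleted and B runs on its own slice:
assert preempts(C_PRIO, B_PRIO)      # now C does preempt B
```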
SR> What do you mean by the "priority of A's time quantum"?
In our model every time quantum is coupled with a priority. If it doesn't become clear from the description above I can elaborate some more.
- Udo
Udo A. Steinberg wrote:
On Wed, 25 Jan 2006 12:01:22 +1100 Sergio Ruocco (SR) wrote:
SR> Beware that Quantum and Timeslice are not the same thing. Quantum is the
SR> amount of execution time after which the thread scheduler receives an
SR> RPC. After the RPC the Quantum is re-initialised to the value of the
SR> thread's TimeSlice, but the thread scheduler is free to set it to any
SR> value, including ∞, independently of the timeslice.
SR>
SR> So, to avoid confusion, let's keep Quantum outside of this discussion.
We are using conflicting terminology here. What I refer to as a time slice or scheduling context is a time quantum Q coupled with a priority P. A thread with such a time slice can run with priority P for Q units of time.
I was referring to the definitions of Timeslice and Quantum given in these L4::Pistachio and N1 specifications [1]. From your description it seems that what you call quantum is what these specifications call timeslice.
I guess it is implementation-dependent. In our Fiasco kernel there is a strict distinction between time slices and execution contexts. During an IPC with lazy scheduling the kernel switches execution contexts but not time slices.
...nor priorities (or what follows is incorrect).
This means that both A's priority and time quantum remain in effect - and thus B runs with A's priority and consumes A's time quantum
[2]
until the time quantum is depleted and a new time slice is selected by the scheduler.
Here we enter a tricky area. What is the difference between quantum and timeslice from your perspective?
SR> In fact, as soon as a thread C of intermediate priority between A and B
SR> is ready to run, it immediately preempts B, because B was running on A's
SR> timeslice, not on A's priority.
Not in our kernel. An intermediate thread C can be woken up but it cannot preempt B as long as A's time slice hasn't run out.
...as long as A's time slice... or quantum? To me they are not synonyms. Moreover, writing 'timeslice' contradicts [2] above, unless Fiasco conflates the notions of quantum and timeslice.
I keep asking because from the L4::Pistachio specification [1] they seem to be quite different concepts. The end of a timeslice causes a preemption. The end of a quantum causes an IPC to the thread scheduler.
SR> What do you mean by the "priority of A's time quantum"?
In our model every time quantum is coupled with a priority. If it doesn't become clear from the description above I can elaborate some more.
- Udo
OK, what I understood from this discussion is that Fiasco follows a specification that differs from [1], not only in substance but also in key terminology, and this can easily lead to confusion. I will read more about Fiasco, and take it into consideration in the following discussion.
Sergio
[1] http://l4hq.org/docs/manuals/l4-x2-20051021.pdf http://www.ertos.nicta.com.au/software/kenge/pistachio/latest/refman.pdf
[Sergio Ruocco]
- dest != nilthread
If dest is ready, processing switches to this thread. In this “extraordinary” scheduling, the invoking thread donates its remaining timeslice to the destination thread. (This one gets the donation in addition to its ordinarily scheduled timeslices, if any.) ...
Since the correctness of some software I am writing may end relying on the behaviour _specified_ at point 2) above, I have the following Q:
Do L4 implementations _really_ honour case 2) ? In other words, will Thread B run as long as prescribed by Thread A timeslice ?
Local experts believe that current L4 implementation(s) do not conform to specification, and Thread B will run only until the next timertick.
Comments ?
Speaking for the Pistachio implementation: the time is accounted to the thread that currently "owns" the timeslice (thread A in this case). Each time a timer tick occurs, and there are no preemptions by higher-priority threads, thread B will run on the time donated by thread A. This is according to the spec. Once the timeslice expires, a new scheduling decision is made. This may or may not cause thread B to continue running (depending on priorities, scheduling queues, etc.). The "in addition to its ordinarily scheduled timeslices" in the spec does not imply that it will also be the next thread scheduled. It merely means that the time it was donated is not accounted to its own timeslices.
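The accounting rule eSk describes - consumed time is charged to the owner of the current timeslice, not to the thread executing on it - can be sketched as (invented model):

```python
# Sketch: per-timer-tick accounting charges the OWNER of the active
# timeslice, regardless of which thread is actually executing.

consumed = {"A": 0, "B": 0}

def tick(slice_owner):
    consumed[slice_owner] += 1

# B runs for 3 ticks on A's donated timeslice; A is charged:
for _ in range(3):
    tick("A")
assert consumed == {"A": 3, "B": 0}
```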
eSk
On Tue, 24 Jan 2006 19:54:29 +0100 Espen Skoglund (ES) wrote:
ES> Speaking for the Pistachio implementation: The time is accounted to
ES> the thread who currently "owns" the timeslice (thread A in this case).
ES> Each time a timer tick occurs, and there are no preemptions by higher
ES> priority threads, thread B will run on the time donated by thread A.
ES> This is according to the spec.
So far all is clear.
ES> Once the timeslice expires a new
ES> scheduling decision is made. This may or may not cause thread B to
ES> continue running (depending on priorities, scheduling queues, etc.).
This is also clear.
The tricky case is as follows:
1) A performs a thread_switch syscall, donating its time slice to B
2) B executes and consumes part of A's time (but not all)
3) B is preempted by a higher-priority thread H
4) H blocks
How does B get to consume the remainder of A's time slice, i.e. the part that it did not use up in 2) ?
- Udo
[Udo A Steinberg]
On Tue, 24 Jan 2006 19:54:29 +0100 Espen Skoglund (ES) wrote:
ES> Speaking for the Pistachio implementation: The time is accounted to
ES> the thread who currently "owns" the timeslice (thread A in this case).
ES> Each time a timer tick occurs, and there are no preemptions by higher
ES> priority threads, thread B will run on the time donated by thread A.
ES> This is according to the spec.
So far all is clear.
ES> Once the timeslice expires a new ES> scheduling decision is made. This may or may not cause thread B to ES> continue running (depending on priorities, scheduling queues, etc.).
This is also clear.
The tricky case is as follows:
- A performs a thread_switch syscall, donating its time slice to B
- B executes and consumes part of A's time (but not all)
- B is preempted by a higher-priority thread H
- H blocks
How does B get to consume the remainder of A's time slice, i.e. the part that it did not use up in 2) ?
Sorry for not being precise enough here. True, if preempted while running on the donated timeslice, the remainder of that timeslice will no longer be donated. One could think of a scheme that maintained the timeslice-donation state across such preemptions. One might even come up with a solution that does not require any modifications on the fast IPC path. It would complicate the scheduling code a little, though. How this would affect scheduler performance I cannot tell.
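One conceivable shape for such a scheme (purely a sketch under my own assumptions, not something any L4 kernel implements): stash the donation state when a higher-priority thread preempts the donee, and restore it when the preemptor blocks.

```c
#include <assert.h>

/* Hypothetical donation record saved across a preemption.
 * Names and layout are illustrative only. */
struct donation {
    int owner_id;    /* thread whose timeslice was being consumed (A) */
    int runner_id;   /* thread running on the donated time (B) */
    int ticks_left;  /* unconsumed remainder of A's timeslice */
    int valid;
};

static struct donation saved;

/* H preempts B: stash the donation instead of discarding it. */
void preempt(int owner, int runner, int remaining)
{
    saved = (struct donation){ owner, runner, remaining, 1 };
}

/* When the preemptor blocks: resume B on the remainder, if any exists.
 * Returns 1 and fills in runner/ticks on success, 0 otherwise. */
int resume_donation(int *runner, int *ticks)
{
    if (!saved.valid || saved.ticks_left <= 0)
        return 0;            /* nothing saved: normal scheduling decision */
    *runner = saved.runner_id;
    *ticks  = saved.ticks_left;
    saved.valid = 0;
    return 1;
}
```

The cost Espen alludes to shows up here: the scheduler's idle/unblock path now has to consult the saved record before making its usual decision, even though the IPC fast path itself is untouched.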
eSk
Arrgghhh, here we go again.... I don't have enough time in my life for yet another scheduling debate, so I'll summarize my view in the hope that it is helpful (note I'm only roughly following the thread, so apologies in advance if I missed the point).
There are roughly three approaches to scheduling in practice (plus many more that involve substantial changes to the typical multi-level round-robin scheduler in place):
* System-level best effort: chasing outright performance using the lazy-scheduling technique, which is priority-ambivalent; historically what L4 has done.
* System-level priority ceiling protocol (static assignment of priorities) with priority-preserving lazy scheduling, i.e. a lazy switch only happens if the destination is the (equal-)highest-priority thread.
- I believe we have something that approximates this at NICTA; maybe Dresden has this also.
* System-level priority inheritance (or call it persistent time-slice donation if you wish); this is what I believe Dresden is talking about, has proposed, and I gather has not implemented. It requires tracking IPC dependencies in some manner.
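The difference between the first and second models boils down to one check on the IPC/switch path. A sketch, under my own naming assumptions (no L4 kernel exposes such a function):

```c
#include <assert.h>

/* Maximum priority currently present in the ready queues
 * (illustrative global; a real kernel derives this from its run queues). */
int highest_ready_prio;

/* Priority-preserving lazy switch (model 2): donate the timeslice only
 * if the destination is at least as high-priority as anything runnable.
 * Blind lazy scheduling (model 1) would unconditionally return 1 here. */
int lazy_switch_ok(int dest_prio)
{
    return dest_prio >= highest_ready_prio;
}
```

When the check fails, the kernel falls back to a full scheduling decision instead of the lazy context switch, which is what makes the scheme priority-preserving.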
My point: everybody is "correct" depending on what you are doing/expecting at the system level.
So rather than argue over details of how lazy-scheduling "should" or does behave, one should have a clear idea of what is required at the system level, and what model of the three above is required (or maybe an alternative), or maybe an existing model with restrictions on what one can do.
And no matter which model you require (except for maybe the first), you're likely to have to audit the code to confirm your expected behaviour if you're relying on it.
cheers
- Kevin
On Wed, 25 Jan 2006 11:09:09 +1100 Kevin Elphinstone (KE) wrote:
KE> * System-level priority ceiling protocol (static assignment of
KE> priorities) with priority preserving lazy scheduling, i.e. lazy switch
KE> only happens if destination is (equal) highest priority thread.
KE>
KE> - I beleive we have something that approximates this at NICTA, maybe
KE> dresden has this also.
I don't believe we have something like this in Dresden, or else I'm not understanding why, in this case, lazy switching should be done only if the destination is the equal-highest-priority thread.
KE> * system-level priority inheritance (or call it persistent time-slice
KE> donation if you wish), this is what I believe dresden is talking about,
KE> has proposed, and I gather have not implemented. It requires tracking of
KE> dependencies of IPC in some manner.
The proposed mechanism can do priority inheritance and stack-based priority ceiling on top of L4, with some tracking added to the IPC path.
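The core of what such IPC-path tracking buys you can be sketched in a few lines (hypothetical names; this is not an existing L4 mechanism): a server's effective priority under inheritance is the maximum of its own priority and those of the clients blocked on it in IPC.

```c
#include <assert.h>

/* Illustrative thread descriptor; only the priority matters here. */
struct thread {
    int base_prio;
};

/* Effective priority of a server under priority inheritance:
 * the max of its own base priority and those of all clients
 * currently blocked on it (tracked on the IPC path). */
int effective_prio(const struct thread *server,
                   const struct thread *const waiters[], int n)
{
    int p = server->base_prio;
    for (int i = 0; i < n; i++)
        if (waiters[i]->base_prio > p)
            p = waiters[i]->base_prio;
    return p;
}
```

Stack-based priority ceiling falls out of the same bookkeeping by raising the server to a statically computed ceiling instead of the dynamic maximum of its waiters.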
KE> My point, everybody is "correct" depending on what you are
KE> doing/expecting at the system level.
Sure.
KE> So rather than argue over details of how lazy-scheduling "should" or
KE> does behave, one should have a clear idea of what is required at the
KE> system level, and what model of the three above is required (or maybe an
KE> alternative), or maybe an existing model with restrictions on what one
KE> can do.
There has been no argument, but rather a discussion of issues that exist in the current L4 implementations. Whether these issues matter in a particular system design depends entirely on what your real-time requirements are or, rather, on whether you have any.
I know these issues have been debated to death in the past and to some of us the implications are pretty clear - but I didn't start this thread ;)
- Udo
Kevin Elphinstone wrote:
Arrgghhh, here we go again.... I don't have enough time in my life for yet another scheduling debate, so I'll summarize my view in the hope
Probably the second classic debate after Emacs vs. vi!
So rather than argue over details of how lazy-scheduling "should" or
I am just trying to understand what lies in the gulf between specifications and implementations of L4 scheduling, an area which looks subtly underspecified and/or mis-implemented. The more debate and points of view we get on it, the better.
And no matter which model you require (except for maybe the first), your
likely to have to audit the code to confirm your expected behaviour if
your relying on it.
My specific problem WRT scheduling is that I want to be able to rely on behaviour "as in the spec", and not worry about details and optimisations of the various implementations, but probably I am daydreaming... :-)
Sergio
l4-hackers@os.inf.tu-dresden.de