On Sun, 14 Aug 2005 22:16:09 +0200 Rene Wittmann (RW) wrote:
RW> I have 2 questions on the RT part in DROPS:
RW>
RW> 1.) I have 2 RT working threads, each with 3 timeslices:
RW> Thread 1: period=10ms
RW>   ts1=30ms prio1=50
RW>   ts2=10ms prio2=30
RW>   ts3=20ms prio3=50
RW>
RW> Thread 2: period=10ms
RW>   ts1=20ms prio1=40
RW>   ts2=5ms  prio2=30
RW>   ts3=10ms prio3=40
RW>
RW> Assume both threads are ready at the same time. So thread 1 starts
RW> and does its work until it releases the CPU voluntarily or it's
RW> preempted by its preempter-thread. So say it releases voluntarily
RW> after 20ms.
When you say "it's preempted by its preempter thread", do you mean the situation when a preempter with a higher priority than the current thread receives a preemption IPC from the kernel and thus preempts the current thread?
RW> Who will get the left 10ms? I guess nobody.
When a thread is preempted (that means it gets involuntarily descheduled), the remaining time quantum is saved and later restored when the thread is scheduled again. If the thread yields its scheduling context (that means it voluntarily gives it away), the time quantum is no longer available to the thread. A yield can happen in two ways depending on the target thread id specified:
1) if the target ID is the NIL_ID, then the thread yields its active scheduling context to no one - the time is effectively gone. This is what next_reservation does. Note that the kernel checks the user-specified ID to guard against the case where a thread wants to yield its active scheduling context and the time quantum on that scheduling context expires simultaneously.
2) if the target ID is a valid thread ID in the system, then the action depends on whether the specified thread is ready to run: a) if it is, then the current thread donates the current scheduling context to the specified thread and that thread then becomes the current thread. This is similar to what happens during a donating IPC.
b) if the specified thread is not ready, then the current scheduling context is gone, similar to 1)
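For illustration, here is a minimal sketch of the two yield variants, built on the L4 V.2 l4_thread_switch syscall from <l4/sys/syscalls.h>; the "partner" thread id and both function names are hypothetical:

  #include <l4/sys/types.h>
  #include <l4/sys/syscalls.h>    /* l4_thread_switch(), L4_NIL_ID */

  /* Hypothetical: some other thread we might want to donate time to. */
  extern l4_threadid_t partner;

  void yield_to_no_one(void)
  {
      /* Case 1): the remaining time quantum of the active scheduling
       * context is simply gone. */
      l4_thread_switch(L4_NIL_ID);
  }

  void yield_to_partner(void)
  {
      /* Case 2a): if "partner" is ready, it inherits the current
       * scheduling context and runs on the remaining time quantum;
       * case 2b): if it is not ready, the time is gone as in 1). */
      l4_thread_switch(partner);
  }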
RW> Or can I assign my left time to a specific thread, say ts2 of
RW> thread 1, so that it has 20ms after recognizing that ts1 finished
RW> 10ms earlier? I could do that with l4_rt_change_timeslice(), but
RW> this would probably not work for the current period! (or yes?)
A thread can yield the current scheduling context to another thread as described in 2a) above. Note that the thread will yield the current scheduling context (which may not necessarily be the thread's own active scheduling context). As an example consider: A sends a donating IPC to B, now B runs on A's scheduling context. If B yields to C, then C will run on A's time and B will not donate its own scheduling context. If the current scheduling context is in fact the active scheduling context of the donating thread, then realize that the donating thread will likely encounter a timeslice overrun soon after the yield (the donatee will consume the time quantum until it runs out). If you'd rather avoid that, then the donatee will have to yield the scheduling context back to the owner (the original donator), who then has to yield it to no one.
Changes to the time quantum of a scheduling context are only visible the next time that scheduling context becomes the current scheduling context. That means if you change the current scheduling context, then the change will not be visible immediately.
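To make the visibility rule concrete, a hedged sketch - the exact l4_rt_change_timeslice signature may differ in your tree, so treat the argument order and units here as assumptions:

  /* Illustrative only - argument order/units are an assumption, not
   * the authoritative DROPS API. */
  #include <l4/sys/rt_sched.h>

  void grow_timeslice_2(l4_threadid_t thread)
  {
      /* Ask for priority 30 and 20ms for timeslice 2 of "thread".
       * If timeslice 2 is the current scheduling context right now,
       * the new quantum only becomes visible the next time timeslice 2
       * becomes current - i.e., not in the running period. */
      l4_rt_change_timeslice(thread, 2, 30, 20000 /* us, assumed */);
  }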
RW> 2.) Consider we have a deadline miss. And we say
RW> l4_rt_next_period(): would it wait for the beginning of the next
RW> period? (which means:
RW>   period 1: deadline miss
RW>   period 2: call next period (because we think we're still in
RW>             period 1) + wait to end of period
RW>   period 3: normal work)
RW> Or do I have to care that we do not call l4_rt_next_period() in
RW> case of a deadline miss?
If your thread misses its deadline right before it wanted to call l4_rt_next_period, then l4_rt_next_period blocks the thread until the end of the new period started due to the deadline miss. Since this is likely not what the thread intended to do, it's the preempter's job to handle that situation, e.g. by ex_regs'ing the thread out of its l4_rt_next_period call.
RW> BTW: I compiled the fiasco-kernel with apic+one shot RW> mode. But this should not be relevant for my question.
It is indeed not relevant.
-Udo.
RW> I have 2 questions on the RT part in DROPS:
RW>
RW> 1.) I have 2 RT working threads, each with 3 timeslices:
RW> Thread 1: period=10ms
RW>   ts1=30ms prio1=50
RW>   ts2=10ms prio2=30
RW>   ts3=20ms prio3=50
RW>
RW> Thread 2: period=10ms
RW>   ts1=20ms prio1=40
RW>   ts2=5ms  prio2=30
RW>   ts3=10ms prio3=40
RW>
RW> Assume both threads are ready at the same time. So thread 1 starts
RW> and does its work until it releases the CPU voluntarily or it's
RW> preempted by its preempter-thread. So say it releases voluntarily
RW> after 20ms.
When you say "it's preempted by its preempter thread", do you mean the situation when a preempter with a higher priority than the current thread receives a preemption IPC from the kernel and thus preempts the current thread?
That's what I meant. The preempter has a higher priority and receives the IPC and thus "tells" (via IPC, or via a common variable if in the same address space) the thread to stop the work.
RW> Who will get the left 10ms? I guess nobody.
When a thread is preempted (that means it gets involuntarily descheduled), the remaining time quantum is saved and later restored when the thread is scheduled again. If the thread yields its scheduling context (that means it voluntarily gives it away), the time quantum is no longer available to the thread. A yield can happen in two ways depending on the target thread id specified:
1) if the target ID is the NIL_ID, then the thread yields its active scheduling context to no one - the time is effectively gone. This is what next_reservation does. Note that the kernel checks the user-specified ID to guard against the case where a thread wants to yield its active scheduling context and the time quantum on that scheduling context expires simultaneously.

2) if the target ID is a valid thread ID in the system, then the action depends on whether the specified thread is ready to run: a) if it is, then the current thread donates the current scheduling context to the specified thread and that thread then becomes the current thread. This is similar to what happens during a donating IPC.
b) if the specified thread is not ready, then the current scheduling context is gone, similar to 1)
How can I specify the target ID? When creating the thread? But how?
RW> Or can I assign my left time to a specific thread, say ts2 of
RW> thread 1, so that it has 20ms after recognizing that ts1 finished
RW> 10ms earlier? I could do that with l4_rt_change_timeslice(), but
RW> this would probably not work for the current period! (or yes?)
A thread can yield the current scheduling context to another thread as described in 2a) above. Note that the thread will yield the current scheduling context (which may not necessarily be the thread's own active scheduling context). As an example consider: A sends a donating IPC to B, now B runs on A's scheduling context. If B yields to C, then C will run on A's time and B will not donate its own scheduling context. If the current scheduling context is in fact the active scheduling context of the donating thread, then realize that the donating thread will likely encounter a timeslice overrun soon after the yield (the donatee will consume the time quantum until it runs out). If you'd rather avoid that, then the donatee will have to yield the scheduling context back to the owner (the original donator), who then has to yield it to no one.
Changes to the time quantum of a scheduling context are only visible the next time that scheduling context becomes the current scheduling context. That means if you change the current scheduling context, then the change will not be visible immediately.
OK, that's clear now.
RW> 2.) Consider we have a deadline miss. And we say
RW> l4_rt_next_period(): would it wait for the beginning of the next
RW> period? (which means:
RW>   period 1: deadline miss
RW>   period 2: call next period (because we think we're still in
RW>             period 1) + wait to end of period
RW>   period 3: normal work)
RW> Or do I have to care that we do not call l4_rt_next_period() in
RW> case of a deadline miss?
If your thread misses its deadline right before it wanted to call l4_rt_next_period, then l4_rt_next_period blocks the thread until the end of the new period started due to the deadline miss. Since this is likely not what the thread intended to do, it's the preempter's job to handle that situation, e.g. by ex_regs'ing the thread out of its l4_rt_next_period call.
OK, clear.
Thanks, Rene
A late addition to an old question.
RW> 2.) Consider we have a deadline miss. And we say
RW> l4_rt_next_period(): would it wait for the beginning of the next
RW> period? (which means:
RW>   period 1: deadline miss
RW>   period 2: call next period (because we think we're still in
RW>             period 1) + wait to end of period
RW>   period 3: normal work)
RW> Or do I have to care that we do not call l4_rt_next_period() in
RW> case of a deadline miss?
If your thread misses its deadline right before it wanted to call l4_rt_next_period, then l4_rt_next_period blocks the thread until the end of the new period started due to the deadline miss. Since this is likely not what the thread intended to do, it's the preempter's job to handle that situation, e.g. by ex_regs'ing the thread out of its l4_rt_next_period call.
You're right, it is not intended to wait for the next period. So I introduced some additional checking.
Unfortunately I cannot recognize deadline misses early enough. This is my configuration: my preempter thread runs at a priority of 255, my rt thread is a periodic thread with period length 40ms. The latter has three timeslices:
  1. 10ms, prio=60
  2. 18ms, prio=50
  3. 10ms, prio=60
All other threads in the system should have a lower priority by default (dm_phys, log, names, simple_ts). I have no logging output or anything else inside my rt execution.
When I check for timeslice overruns with global variables (set by preempter and read by rt thread) everything seems fine. Whereas when I check for deadline misses via global variables, it doesn't work.
My "real" code looks like this:
extern char deadline_miss;
l4_kernel_clock_t left;

set_up_preempter_etc();
do_reservation_and_set_up();
deadline_miss = 0;
while (1) {
    if (deadline_miss == 0) {
        l4_rt_next_period();
    }
    /* if I check here for deadline_miss != 0 it's true!! */
    deadline_miss = 0;
    work1();
    l4_rt_next_reservation(1, &left);
    work2();
    l4_rt_next_reservation(2, &left);
    work3();
    l4_rt_next_reservation(3, &left);
}
THE PREEMPTER:

extern char deadline_miss;
extern char overrun;
l4_rt_preemption_t _dw;
l4_msgdope_t _result;

while (1) {
    if (l4_ipc_receive(l4_preemption_id(main_thread_id),
                       L4_IPC_SHORT_MSG, &_dw.lh.low, &_dw.lh.high,
                       L4_IPC_NEVER, &_result) == 0) {
        if (_dw.p.type == L4_RT_PREEMPT_TIMESLICE) {
            overrun = 1; /* in reality distinguished by id */
        } else if (_dw.p.type == L4_RT_PREEMPT_DEADLINE) {
            deadline_miss = 1;
        } else {
            exit(-1);
        }
    }
}
Summary: when I check for deadline misses right before the next_period call, I have none. So I call next_period. If I check _again_ after the call, I have one!
How could I prevent this? Waiting for the next period is not intended, as Udo correctly stated! Maybe global variables aren't so suitable for this task, but what should I do differently? "Ex_regs'ing" would reduce the critical time window, I guess, but not totally prevent it?
Maybe my kernel settings are also of note here: APIC with one shot mode
Regards, Rene
On Wed, 21 Sep 2005 17:32:40 +0200 Rene Wittmann (RW) wrote:
RW> When I check for timeslice overruns with global variables (set by
RW> preempter and read by rt thread) everything seems fine.
RW> Whereas when I check for deadline misses via global variables, it
RW> doesn't work.
From the example code you posted it doesn't look like you made the variable that is shared between the two threads "volatile". Only when that variable is volatile will the compiler read it from memory every time. Otherwise it is free to cache the variable in a register and may not see the write from the other thread.
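Concretely, that would mean declaring the shared flags along these lines (a sketch only; see the follow-up mails for whether volatile alone is the right tool):

  /* Shared between rt thread and preempter. "volatile" forces the
   * compiler to re-read the flags from memory on every access instead
   * of caching them in registers across loop iterations. */
  volatile char deadline_miss = 0;
  volatile char overrun = 0;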
But that's not the real problem here.
RW> My "real" code looks like this: RW> RW> extern char deadline_miss; RW> l4_kernel_clock_t left; RW> RW> set_up_preempter_etc(); RW> do_reservation_and_set_up(); RW> deadline_miss = 0; RW> while(1){ RW> if (deadline_miss == 0){ RW> l4_rt_next_period(); RW> }
The above part in pseudo-code:
1) read deadline miss variable
2) compare deadline miss variable against 0
3) if equal, call next_period and go to sleep until next period begins
If the deadline miss occurs somewhere after 2) and before 3) is done, then you have the situation you describe:
RW>     /* if I check here for deadline_miss != 0 it's true!! */
RW>     deadline_miss = 0;
RW> Summary: when I check for deadline misses right before the
RW> next_period call, I have none. So I call next_period. If I check
RW> _again_ after the call, I have one!
RW>
RW> How could I prevent this? Waiting for the next period is not
RW> intended, as Udo correctly stated!
What you want is an atomic way of checking for a deadline miss and acting upon the result of that check. But such functionality does not exist. There is currently no way you can prevent the above scenario where the deadline miss occurs after the check but before (or while) you go to sleep via next_period.
So what you have to do instead is: upon detecting a deadline miss, the preempter thread should ex_regs the thread with the deadline miss (thereby cancelling the l4_rt_next_period/IPC operation). When you return from next_period, you can check if you've been ex_regs'd out of it or whether the call returned regularly at the beginning of your new period, e.g., by checking the global variable that you already have.
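In outline, the rt thread's side of this could look as follows - a sketch only, assuming your existing deadline_miss flag is set by the preempter before it issues the ex_regs, and with work() standing in for your timeslice code:

  while (1) {
      l4_rt_next_period();
      /* Two ways to get here: the call returned regularly at the start
       * of the next period, or the preempter ex_regs'd us out of the
       * blocked IPC after a deadline miss. The flag tells them apart. */
      if (deadline_miss) {
          deadline_miss = 0;
          /* Already in the period following the miss - start working
           * immediately instead of blocking again. */
      }
      work();
  }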
RW> Maybe my kernel settings are also of note here: APIC with one shot mode
That's a good choice.
-Udo.
At Wed, 21 Sep 2005 18:09:06 +0200, "Udo A. Steinberg" us15@os.inf.tu-dresden.de wrote:
From the example code you posted it doesn't look like you made the variable that is shared between the two threads "volatile". Only when that variable is volatile will the compiler read it from memory every time. Otherwise it is free to cache the variable in a register and may not see the write from the other thread.
I hate to distract from the real issue, but I should note that volatile does not do what you describe here. If you need to make sure that you see the write of another thread, you must use a memory barrier or another proper synchronization primitive. "volatile" is not the answer.
For an extensive analysis, please see for example sec. 5 in "C++ and the Perils of Double-Checked Locking":
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
Thanks, Marcus
On Wed, 21 Sep 2005 18:29:03 +0200 Marcus Brinkmann (MB) wrote:
MB> I hate to distract from the real issue, but I should note that
MB> volatile does not do what you describe here. If you need to make sure
MB> that you see the write of another thread, you must use a memory
MB> barrier or another proper synchronization primitive. "volatile" is
MB> not the answer.
I argue that you do NOT need any memory barrier in this case. Cache coherency ensures that as soon as CPU1 writes the relevant cache line with deadline_miss = 1, it goes invalid on CPU0 and the next read from CPU0 for deadline_miss will fetch the cache line from CPU1 and both CPUs go to shared.
You need a memory barrier to enforce program order between two loads, two stores or a load and a store (on a particular CPU) on architectures that can reorder such memory operations.
The example code does not rely on the order of reads or writes on a particular CPU. It only does:
deadline_miss = 0;
CPU0:                      CPU1:
-----                      -----
if (deadline_miss)         deadline_miss = 1;
    do_something_about_it
MB> For an extensive analysis, please see for example sec. 5 in
MB> "C++ and the Perils of Double-Checked Locking":
MB>
MB> http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
This is an example where ordering must be enforced. What is described in this paper is similar to:
foo = inited = 0;
CPU0:              CPU1:
-----              -----
foo = 1;           while (!inited);
//wmb              //rmb
inited = 1;        use foo;
Here you must prevent CPU0 from reordering the writes and CPU1 from reordering the reads.
-Udo.
At Tue, 22 Nov 2005 20:14:34 +0100, "Udo A. Steinberg" us15@os.inf.tu-dresden.de wrote:
On Wed, 21 Sep 2005 18:29:03 +0200 Marcus Brinkmann (MB) wrote:
MB> I hate to distract from the real issue, but I should note that
MB> volatile does not do what you describe here. If you need to make sure
MB> that you see the write of another thread, you must use a memory
MB> barrier or another proper synchronization primitive. "volatile" is
MB> not the answer.
I argue that you do NOT need any memory barrier in this case. Cache coherency ensures that as soon as CPU1 writes the relevant cache line with deadline_miss = 1, it goes invalid on CPU0 and the next read from CPU0 for deadline_miss will fetch the cache line from CPU1 and both CPUs go to shared.
You are right, I was having a knee-jerk reaction. Thanks for the correction!
Marcus
From the example code you posted it doesn't look like you made the variable that is shared between the two threads "volatile". Only when that variable is volatile will the compiler read it from memory every time. Otherwise it is free to cache the variable in a register and may not see the write from the other thread.
But that's not the real problem here.
OK, I changed my variables to volatile, but as you said: it's not really the problem.
What you want is an atomic way of checking for a deadline miss and acting upon the result of that check. But such functionality does not exist. There is currently no way you can prevent the above scenario where the deadline miss occurs after the check but before (or while) you go to sleep via next_period.
I think for l4_rt_next_reservation there is such a "check" in the kernel: you pass the timeslice number you think you're on as an argument to this call. Maybe one could include such functionality for l4_rt_next_period by passing the period you're waiting for as an argument. This would be a quite useful improvement :-)
So what you have to do instead is: upon detecting a deadline miss, the preempter thread should ex_regs the thread with the deadline miss (thereby cancelling the l4_rt_next_period/IPC operation). When you return from next_period, you can check if you've been ex_regs'd out of it or whether the call returned regularly at the beginning of your new period, e.g., by checking the global variable that you already have.
But if I detect it right after returning from l4_rt_next_period, it's already too late. One could say: skip the next call, but this would in the worst case lead to an execution time 50% longer than the expected one (skip every second l4_rt_next_period). That's a great pity!
Regards, Rene
On Fri, 23 Sep 2005 17:48:51 +0200 Rene Wittmann (RW) wrote:
RW> > What you want is an atomic way of checking for a deadline
RW> > miss and acting upon the result of that check. But such
RW> > functionality does not exist. There is currently no way you
RW> > can prevent the above scenario where the deadline miss occurs
RW> > after the check but before (or while) you go to sleep via next_period.
RW> >
RW> I think for l4_rt_next_reservation there is such a "check" in the
RW> kernel by passing the timeslice number you think you're on as argument
RW> to this call. Maybe one could include such functionality for
RW> l4_rt_next_period by passing the period you're waiting for as argument.
RW> This would give a quite useful improvement :-)
Note that l4_rt_next_reservation is built on top of l4_thread_switch, which provides ample register space in the L4 ABI for passing parameters like the timeslice number. However, l4_rt_next_period is built on top of l4_ipc and register space for argument passing is exhausted. To implement your suggestion, we would need a 64-bit register (or two 32-bit registers) available in order to pass the point in time of the next period to the kernel.
There are a number of ways to deal with this issue:

1) pass the additional parameter on the stack - this opens a can of worms
   wrt. the stack not being mapped and consequential page-faults. Yuck.
2) redefine the IPC registers such that the time parameter can be passed
   to the kernel - then we lose the ability to combine next_period with
   an IPC call, which is quite undesirable.
3) move to an ABI with more register space, e.g., L4 X.2.
Now I don't see 3) happen here anytime soon, so I don't know how to best fix this in a nice way.
RW> > So what you have to do instead is: upon detecting a deadline
RW> > miss, the preempter thread should ex_regs the thread with the
RW> > deadline miss (thereby cancelling the l4_rt_next_period/IPC
RW> > operation). When you return from next_period, you can check
RW> > if you've been ex_regs'd out of it or whether the call
RW> > returned regularly at the beginning of your new period, e.g.,
RW> > by checking the global variable that you already have.
RW> >
RW> But if I detect it right after returning from l4_rt_next_period, it's
RW> already too late. One could say: skip the next call, but this would in
RW> the worst case lead to an execution time 50% longer than the expected
RW> one (skip every second l4_rt_next_period). That's a great pity!
When you cancel an ongoing IPC operation using l4_ex_regs, the operation will return with L4_IPC_RECANCELED or L4_IPC_SECANCELED, depending on whether the ex_regs happened during the send or receive phase. If the IPC waits for the beginning of the next period, then that "wait" should be cancelled as well. Therefore, your code could look like this:
error = l4_rt_next_period (...);

if (error == L4_IPC_RECANCELED || error == L4_IPC_SECANCELED) {
    /*
     * Preempter aborted the next_period call upon deadline miss and I'm
     * now already in my next period (the one I originally wanted to wait
     * for using this next_period call).
     */
} else {
    /*
     * next_period succeeded. I waited until the next period and it has
     * now begun. The preempter didn't cancel the call so obviously no
     * deadline miss has occurred.
     */
}
This isn't as nice as the solution described above, but it does work with the L4 V.2 API, which Fiasco currently uses.
-Udo.
When you cancel an ongoing IPC operation using l4_ex_regs, the operation will return with L4_IPC_RECANCELED or L4_IPC_SECANCELED, depending on whether the ex_regs happened during the send or receive phase. If the IPC waits for the beginning of the next period, then that "wait" should be cancelled as well. Therefore, your code could look like this:
error = l4_rt_next_period (...);

if (error == L4_IPC_RECANCELED || error == L4_IPC_SECANCELED) {
    /*
     * Preempter aborted the next_period call upon deadline miss and I'm
     * now already in my next period (the one I originally wanted to wait
     * for using this next_period call).
     */
} else {
    /*
     * next_period succeeded. I waited until the next period and it has
     * now begun. The preempter didn't cancel the call so obviously no
     * deadline miss has occurred.
     */
}
So a call to l4_thread_ex_regs is sufficient, and I would cancel the next_period-IPC with:

  l4_thread_ex_regs(main_thread_id, 0xFFFFFFFF, 0xFFFFFFFF,
                    &id1, &id2, &w1, &w2, &w3)??

(l4_threadid_t id1 = id2 = L4_INVALID_ID, l4_umword_t w1, w2, w3).
If it's that simple, it's fine!
Regards, Rene
On Mon, 26 Sep 2005 14:32:32 +0200 Rene Wittmann (RW) wrote:
RW> So a call to l4_thread_ex_regs is sufficient, and I would cancel the
RW> next_period-IPC with:
RW>   l4_thread_ex_regs(main_thread_id, 0xFFFFFFFF, 0xFFFFFFFF,
RW>                     &id1, &id2, &w1, &w2, &w3)??
RW> (l4_threadid_t id1 = id2 = L4_INVALID_ID, l4_umword_t w1, w2, w3).
RW>
RW> If it's that simple, it's fine!
Right. Except that in the current Fiasco implementation an IPC will be cancelled only if you explicitly set EIP to something other than 0xffffffff.
This seems like an unneeded restriction to me and I'm currently discussing with our group if this restriction can be removed. I'll send you a patch and commit the change to CVS in that case. Meanwhile you can mimic the desired behavior by setting the EIP of ex_regs to the instruction following the int $0x30 of the l4_next_period call, which is of course quite suboptimal.
-Udo.
On Mon, 26 Sep 2005 15:15:52 +0200 Udo A. Steinberg (UAS) wrote:
UAS> On Mon, 26 Sep 2005 14:32:32 +0200 Rene Wittmann (RW) wrote:
UAS>
UAS> RW> So a call to l4_thread_ex_regs is sufficient, and I would cancel
UAS> RW> the next_period-IPC with:
UAS> RW>   l4_thread_ex_regs(main_thread_id, 0xFFFFFFFF, 0xFFFFFFFF,
UAS> RW>                     &id1, &id2, &w1, &w2, &w3)??
UAS> RW> (l4_threadid_t id1 = id2 = L4_INVALID_ID, l4_umword_t w1, w2, w3).
UAS> RW>
UAS> RW> If it's that simple, it's fine!
UAS>
UAS> Right. Except that in the current Fiasco implementation an IPC will be
UAS> cancelled only if you explicitly set EIP to something other than
UAS> 0xffffffff.
Looks like we cannot change this restriction without breaking legacy L4 software. The original L4 behavior has always been:
read old EIP
if (new_EIP != 0xffffffff) {
    set new_EIP
    cancel IPC
}
Therefore, such a change would break existing software that sets EIP to 0xffffffff in order to only read the old EIP and expects IPC to continue.
UAS> This seems like an unneeded restriction to me and I'm currently
UAS> discussing with our group if this restriction can be removed. I'll
UAS> send you a patch and commit the change to CVS in that case. Meanwhile
UAS> you can mimic the desired behavior by setting the EIP of ex_regs to
UAS> the instruction following the int $0x30 of the l4_next_period call,
UAS> which is of course quite suboptimal.
It also occurred to me that the ex_regs modification discussed above does not really prevent the race condition:
main thread:                          preempter thread:

                                      deadline miss detected
         <--------------------------- l4_ex_regs (abort IPC)
l4_next_period();
The preempter can't tell whether the main thread is already blocking in its l4_next_period call. If the ex_regs happened too early (i.e., as shown above) it wouldn't have any effect.
So how about the following:
main thread:                          preempter thread:

going_to_wait_for_next_period = 1;    if (going_to_wait_for_next_period)
l4_next_period();                         l4_ex_regs (main_thread, label);
label:                                else
going_to_wait_for_next_period = 0;        main thread is obviously elsewhere
do stuff for new period
The main thread flags via a shared variable to the preempter that it plans to go to sleep via next_period. If the preempter sees upon a deadline miss that the main thread is about to go to sleep, it l4_ex_regs'es it to a new EIP, namely the instruction following the l4_next_period call: if the main thread is already blocked in the IPC, then ex_regs will abort the IPC and set the new EIP to "label:". If not, the new EIP will be set to "label:" as well and the whole block of code handling the wait until the next period will simply be skipped (make sure to keep your stack in shape).
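Spelled out in C, the main-thread side of this protocol might look roughly like the sketch below. It assumes GCC's labels-as-values extension (&&label) as one way to export the resume EIP to the preempter; next_period_resume_eip, period_loop and do_work are hypothetical names.

  volatile char going_to_wait_for_next_period = 0;
  void *next_period_resume_eip;   /* EIP the preempter passes to ex_regs */

  void period_loop(void)
  {
      next_period_resume_eip = &&after_wait;   /* GCC extension */
      while (1) {
          going_to_wait_for_next_period = 1;
          l4_rt_next_period();  /* blocks until the next period - or is
                                 * aborted/skipped by the preempter's
                                 * l4_ex_regs to after_wait */
      after_wait:
          going_to_wait_for_next_period = 0;
          do_work();            /* do stuff for the new period */
      }
  }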
-Udo.