Hi All,
I am testing the response time in Fiasco, especially for the worse case. The results are strange: the worse case results are always around 2us. I have filtered out the timer interrupt, so I cannot understand what happened. I am using an Intel IA32 multi-core CPU. Do you have any idea what could cause the 2us delay?
Thank you so much. Yuxin
On Tue Aug 05, 2014 at 13:03:51 -0400, Yuxin Ren wrote:
Now I am testing the response time in Fiasco, especially for the worse case. The results are strange. The worse case result are always around 2us. I have filtered out the timer interrupt, so I cannot understand what happened. I am using intel IA32 multi core cpu. Do you have any idea about what can cause the 2us delay?
For worst case, 2µs does not seem bad to me for this type of CPU. How did you try to get the worst case (or do you mean something different with 'worse case'?)?
Adam
Hi,
Sorry, it's my typo. The worst case is 200us. In my test, before the client sends the IPC, I record the time using rdtsc. When the server receives this IPC, I also do rdtsc. The difference between the two is the response time. I run this for many iterations, and the maximum value is the worst case result. I filter out the timer interrupt, but do not flush the cache and TLB.
Thank you very much, Yuxin
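The measurement loop described above (one rdtsc delta per IPC, maximum taken over all iterations) can be summarized as in the following minimal sketch. The function name `summarize` and the sample values are hypothetical and are not measured data. Recording the minimum, the average and the index of the worst iteration alongside the maximum makes it easier to tell a single outlier from a generally slow path. Note that an observed maximum only bounds the true worst case from below.

```python
from statistics import mean

def summarize(deltas):
    """Summarize per-iteration TSC deltas (server rdtsc minus client rdtsc)."""
    if not deltas:
        raise ValueError("need at least one sample")
    # Index of the slowest iteration, so it can be inspected separately.
    worst = max(range(len(deltas)), key=deltas.__getitem__)
    return {
        "min": min(deltas),
        "avg": mean(deltas),
        "max": deltas[worst],
        "worst_iteration": worst,
    }

# Hypothetical cycle counts for illustration only.
deltas = [4800, 5100, 4950, 560000, 5200]
stats = summarize(deltas)
print(stats)
```

In this made-up data set one outlier pulls the average far above the typical value, which is why the minimum is worth reporting as well.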
On Tue, Aug 5, 2014 at 4:41 PM, Adam Lackorzynski <adam@os.inf.tu-dresden.de> wrote:
For worst case, 2µs does not seem bad to me for this type of CPU. How did you try to get the worst case (or do you mean something different with 'worse case'?)?
Adam
Adam Lackorzynski adam@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/~adam/
l4-hackers mailing list l4-hackers@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
On 6 Aug 2014, at 7:19 , Yuxin Ren ryx@gwmail.gwu.edu wrote:
Hi,
Sorry, it's my typo. The worse case is 200us. In my test, before the client sends IPC, I record the time using rdtsc. When the server receives this IPC, I also do rdtsc. The difference between this is the response time. I run this many iterations and the maximum value is the worse case result. I filter out the timer interrupt, but do not flush the cache and tlb.
A worst case of 200µs isn’t bad, if it’s really the worst case. The problem is that you can never be sure you caught the worst case, and most likely you didn’t. When we started analysing seL4, we found pathological cases where the worst case (on an otherwise very fast kernel) was about a second. These bad cases typically happen when dismantling complex data structures, e.g. revoking derived capabilities. You probably didn’t test that.
For seL4 we found we could get the safe upper bound of the WCET to around 300µs (on ARM11); the real worst case is probably around half that (but you can’t be sure).
I think it should be possible to get it down to 10–20µs, but that would complicate the kernel code significantly (and, in seL4’s case, would be tough to verify).
Gernot
What do your MP numbers look like for WCET?
On 6 Aug 2014, at 9:45 am, "Gernot Heiser" gernot@cse.unsw.edu.au wrote:
A worst case of 200µs isn’t bad, if it’s really the worst case. The problem is that you can never be sure you caught the worst case, and most likely you didn’t. When we started analysing seL4, we found pathological cases where the worst case (on an otherwise very fast kernel) was about a second. These bad cases typically happen when dismantling complex data structures, eg revoking derived capabilities. You probably didn’t test that.
For seL4 we found we could get the safe upper bound of the WCET to around 300µs (on ARM11), the real worst case is probably around half that (but you can’t be sure).
I think it should be possible to get it down to 10–20µs, but that would complicate the kernel code significantly (and, in seL4’s case, would be tough to verify).
Gernot
MP as in multiprocessor seL4? There’s no released multiprocessor version of seL4.
Gernot
On 6 Aug 2014, at 10:18 , Daniel Potts danielp@ok-labs.com wrote:
What do your MP numbers look like for WCET?
On Tue Aug 05, 2014 at 17:19:09 -0400, Yuxin Ren wrote:
Sorry, it's my typo. The worse case is 200us.
What is min and avg?
In my test, before the client sends IPC, I record the time using rdtsc. When the server receives this IPC, I also do rdtsc. The difference between this is the response time. I run this many iterations and the maximum value is the worse case result. I filter out the timer interrupt, but do not flush the cache and tlb.
Are server and client running on the same or different cores?
Adam
I did not measure the min value. The avg is around 5000 cycles, and my machine runs at 2.8GHz.
Client and server run on different cores.
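To put the cycle counts and the microsecond figures from this thread on the same scale, the conversion can be sketched as below. The helper names are hypothetical, and the sketch assumes the TSC ticks at the nominal 2.8GHz clock rate, which is not guaranteed on every CPU.

```python
def cycles_to_us(cycles, cpu_hz):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / cpu_hz * 1e6

def us_to_cycles(us, cpu_hz):
    """Convert microseconds to a cycle count at the given clock rate."""
    return us * 1e-6 * cpu_hz

CPU_HZ = 2.8e9  # clock rate from the thread; assumes the TSC ticks at this rate

avg_us = cycles_to_us(5000, CPU_HZ)       # reported average
worst_cycles = us_to_cycles(200, CPU_HZ)  # reported worst case

print(f"avg: {avg_us:.2f} us")
print(f"worst: {worst_cycles:.0f} cycles, {worst_cycles / 5000:.0f}x the average")
```

Under that assumption the average of 5000 cycles is about 1.79us, and the 200us worst case is 560000 cycles, roughly 112 times the average.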
On Wed, Aug 6, 2014 at 5:42 PM, Adam Lackorzynski <adam@os.inf.tu-dresden.de> wrote:
What is min and avg?
Are server and client running on the same or different cores?
Adam
On 06.08.2014 23:51, Yuxin Ren wrote:
I did not measure min value. The avg is around 5000 cycs, and my machine is 2.8GHz.
Client and server are in different core.
- And are these cores on the same socket or on different ones?
- Are the timestamp counters between these cores synchronized? => To be safe you might want to measure the roundtrip time from client to server and back as then you use the same hardware TSC.
- Did you disable the usual suspects for System Management Mode interruptions (for instance Legacy USB Support, which is often provided as a feature by your BIOS)?
Bjoern
--
Dipl.-Inf. Bjoern Doebel      Mail:  doebel@tudos.org
TU Dresden, OS Chair          Phone: +49 351 463 38 799
Noethnitzer Str. 46           Fax:   +49 351 463 38 284
01187 Dresden, Germany        WWW:   http://www.tudos.org/~doebel
--
"When the seagulls follow the trawler, it's because they think sardines will be thrown into the sea." (Eric Cantona)
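The point about unsynchronized timestamp counters can be illustrated with a small model. This is only a sketch: the offset, the latency values and the function names are hypothetical, and halving the round trip assumes both directions take the same time.

```python
def one_way(send_client_tsc, recv_server_tsc):
    """One-way latency computed from two different cores' counters."""
    return recv_server_tsc - send_client_tsc

def round_trip(send_client_tsc, reply_client_tsc):
    """Round-trip latency computed from the client's counter only."""
    return reply_client_tsc - send_client_tsc

# Hypothetical model: the server's TSC runs OFFSET cycles ahead of the client's.
OFFSET = 3000
TRUE_ONE_WAY = 5000  # assumed true latency per direction, in cycles

t_send = 1_000_000                              # client TSC at send
t_recv_server = t_send + TRUE_ONE_WAY + OFFSET  # server TSC at receive
t_reply = t_send + 2 * TRUE_ONE_WAY             # client TSC when the reply arrives

skewed = one_way(t_send, t_recv_server)         # includes the counter offset
estimate = round_trip(t_send, t_reply) / 2      # offset cancels out

print(skewed, estimate)
```

In this model the one-way figure is off by exactly the counter offset, while the round-trip figure never touches the server's counter and so is unaffected by it.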