Well, I have done that (read the papers). Cache utilization is very easy to fix. In fact, when I overhauled the Mach VM (a couple of years ago), it took me maybe one day to implement coloring.
Maybe I am missing something, but what does cache coloring have to do with cache utilization? Or do you mean you could solve the cache problems of Mach by simply reserving a fraction of the cache for the kernel?
That's not what I tried to say. My English must be bad. Sorry. I tried to say that carefully selecting the pages to be placed on the queue gave Mach a serious performance boost.
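To make the coloring idea concrete, here is a minimal sketch. It is not Mach's actual code: the cache geometry, the per-color free queues, and names such as `ColoredFreeList` are assumptions chosen for illustration. Free physical pages are kept in one queue per cache color, and the allocator prefers a page whose color matches that of the faulting virtual page, so that consecutive virtual pages do not pile up on the same cache lines.

```python
from collections import deque

# Assumed cache geometry (illustrative only, not taken from the thread).
PAGE_SIZE = 4096
CACHE_SIZE = 256 * 1024      # physically indexed cache
ASSOCIATIVITY = 1            # direct-mapped
NUM_COLORS = CACHE_SIZE // (PAGE_SIZE * ASSOCIATIVITY)


def color_of(page_number):
    """Cache color of a (virtual or physical) page number."""
    return page_number % NUM_COLORS


class ColoredFreeList:
    """Hypothetical free-page list with one FIFO queue per cache color."""

    def __init__(self, free_pfns):
        self.queues = [deque() for _ in range(NUM_COLORS)]
        for pfn in free_pfns:
            self.free(pfn)

    def free(self, pfn):
        # Place the page on the queue that matches its color.
        self.queues[color_of(pfn)].append(pfn)

    def alloc(self, vpn):
        # Prefer a physical page with the same color as the virtual page;
        # fall back to the next colors if that queue is empty.
        want = color_of(vpn)
        for offset in range(NUM_COLORS):
            queue = self.queues[(want + offset) % NUM_COLORS]
            if queue:
                return queue.popleft()
        raise MemoryError("no free pages")
```

With this layout, a fault on virtual page `vpn` calls `alloc(vpn)` and usually gets a frame of the matching color; the fallback loop only matters once a color runs dry.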
The IPC issue was not discovered by the L3/L4 team; it had been a known issue for a very long time. There are much older Utah and UW papers that address it.
The issue was known before. That was the reason for papers like "The increasing irrelevance of ipc performance" by Brian Bershad, which essentially claimed: since IPC performance is so bad, applications found other ways to communicate with each other, and therefore there is no reason to try to achieve better IPC performance.
I know about Brian's paper, but there are other papers that addressed that. For instance, there was a UW paper in which researchers successfully tried replacing Mach RPC with LRPC in the case of colocated objects (the work was not finished then, but it could be extended even today).
But did anyone provide a solution? Surely you can point us to the Utah and UW papers which describe IPC performance comparable to L3/L4 IPC performance and were published before 1993. The program committee of the '93 SOSP surely thought otherwise, and the audience was "shocked" when Jochen presented his 10-fold increase in IPC performance (L3 compared to Mach). The first thing Brian Ford did was to ask Jochen for a copy of L3 to be able to verify the results. (It was quite funny, since L3 had a really strange user land and was mostly in German, so Jochen had to coach Brian through the installation process.)
Well, the obvious solution to Mach's problems is critical-path optimization.
that since L4 took almost everything out of the kernel, memory management could have been evicted as well (like the Eros microkernel does), then you may need to get offended.
You don't mean that Eros allows user level processes to directly manipulate kernel page tables? You are kidding, aren't you?
Well, I was talking about the "First class flexpage-based address spaces" paper presented by Shapiro in (around) 2000.
Eros does even more memory management work inside the kernel than L4. If L4 catches a page fault, it directly sends a message to the pager. Eros first tries to parse the capability tree to check whether there is a mapping present which isn't in its hardware page table, and only if it finds no mapping does it invoke the user-level page fault handler.
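The difference between the two fault paths described above can be sketched roughly as follows. This is a toy model under loud assumptions, not code from either kernel: the function names are hypothetical, a flat dict stands in for the capability tree, and the pager is modeled as a plain callback that returns a frame number.

```python
def l4_style_fault(vpn, page_table, pager):
    """L4-like path: the kernel forwards the fault straight to the pager."""
    frame = pager(vpn)            # user-level pager decides the mapping
    page_table[vpn] = frame       # kernel installs what the pager returned
    return frame


def eros_style_fault(vpn, page_table, capability_tree, pager):
    """Eros-like path: consult kernel-visible state before going to user level."""
    # Assumption: a dict lookup stands in for walking the capability tree.
    frame = capability_tree.get(vpn)
    if frame is not None:
        # A mapping exists but was missing from the hardware page table:
        # rebuild it in the kernel without invoking the user-level handler.
        page_table[vpn] = frame
        return frame
    # Only when no mapping is found does the user-level handler run.
    frame = pager(vpn)
    page_table[vpn] = frame
    return frame
```

In this model the Eros-like path reaches user level only on a miss in the capability tree, which is exactly the extra in-kernel work the paragraph above refers to.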
If the basic assertion is "a microkernel is only to provide enough protection so that applications can provide abstractions", then resource allocation may be safely moved out of the microkernel. Correct me if I am wrong.
The L4 paper states that, since there is no way to prove that Mach causes only a 5% degradation, the statement is not valid.
There was a rumor and we tried to verify it. No one working with Mach was able to substantiate this rumor. They simply refused to confirm it. So we stated: "We found no substantiation for the 'common knowledge' that early Mach3.0-based Unix single-server implementations achieved a performance penalty of only 10% compared to bare Unix on the same hardware." There is no paper, no tech report, nothing that we are aware of. Feel free to prove us wrong and point us to any document published before 1997.
I am not trying to prove somebody wrong (or right, for that matter). I want to perform an evaluation and then report the results to those who may be interested. I will make sure to share the results with you when they are available.
I am not trying to start a flame war. Nor do I claim that L4 is worthless. I simply want to see if there is a way to validate a specific statement.
Have you tried to redo our measurements? Just pick a machine with the same configuration as the one we used for our measurements (a 133 MHz Pentium PC based on an ASUS P55TP4N motherboard using Intel's 430FX chipset, equipped with a 256KB pipeline-burst second-level cache and 64MB of 60ns Fast Page Mode RAM) and do them again. If you come up with different numbers, then we can start to discuss the differences.
http://www.ibiblio.org/pub/historic-linux/early-ports/mach-linux/intel/
I will try to redo your tests, now that I have the resources. I may have to use different hardware, but in that case I will note it in the report.
Google will point you to the location of hbench...
Thanks.
We will see.
I could not agree with you more. And we shall see relatively soon, too. Just give me a little time. Thank you for providing the link to the needed files. Sincerely, IS.