Fiasco.OC performance issues
adam at os.inf.tu-dresden.de
Tue Jan 29 00:09:38 CET 2013
On Fri Jan 25, 2013 at 00:00:26 +0100, Sebastian Sumpf wrote:
> On 01/17/2013 11:31 PM, Adam Lackorzynski wrote:
> > On Thu Jan 17, 2013 at 17:03:36 +0100, Sebastian Sumpf wrote:
> >> I recently upgraded Fiasco.OC to SVN revision 42 and experience some
> >> pretty severe performance degradation compared to revision 40 on the
> >> PandaBoard (SMP). It seems that 'sigma0' and the root task stall for 5
> >> to 10 seconds during boot up. I tracked the issue down to the initial
> >> mapping operations; in particular, our root task maps all the
> >> available memory during bootstrap. Within the kernel the
> >> 'Context::xcpu_tlb_flush' is called for each mapping. The function sends
> >> an IPI (to CPU1 which is idle) and then waits for an IPI in order to
> >> signal the end of the operation. The whole operation seems to have
> >> gotten slower compared to revision 40, but I could not find many
> >> differences in the IPI-handling code. Do you have any ideas or
> >> suggestions what could cause the delay (maybe scheduling changes) and
> >> how to fix it?
> > I noticed a similar/same thing but haven't had time to investigate yet.
> Okay, I just wanted to make sure that the problem is not on our side or
> in our usage pattern.
> Another thing I wonder is: Since you now have second level cache support
> for the PandaBoard, how do I map DMA memory to a client? The problem
> seems to be that sigma0 maps all memory as cached. So what we have been
> trying to do is this: When someone requests DMA memory we map the page
> as uncached and then call 'l4_cache_dma_coherent' afterwards. This
> doesn't seem to work out well for our drivers. As far as I can gather,
> memory that is mapped cached (sigma0, roottask) and uncached (client) at
> the same time has undefined behavior on ARM (I might be wrong here). So,
> what is the protocol to implement this on Fiasco.OC/L4RE setups?
Indeed, mapping the same memory with different attributes must be avoided.
But it also depends on whether that memory is actually accessed. For
sigma0, for example, this isn't a problem because sigma0 does not touch
the memory itself.
Is your roottask accessing the memory, i.e. pulling it into caches?
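
To make the distinction between "mapped cached" and "accessed through a
cached mapping" concrete, here is a minimal sketch of bookkeeping a root
task could keep. It is only an illustration under assumptions: all names
(NUM_PAGES, note_cached_access, note_cache_flushed, safe_for_uncached_dma)
are hypothetical and not part of Fiasco.OC or L4Re, and it performs no
real cache maintenance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a root task; none of these names exist
 * in Fiasco.OC or L4Re. */
#define NUM_PAGES 64

/* Nonzero if the page may have lines in a CPU cache because some task
 * accessed it through a cached mapping since the last flush. */
static unsigned char maybe_cached[NUM_PAGES];

/* Record that a task read or wrote the page through a cached mapping. */
void note_cached_access(size_t page)
{
    assert(page < NUM_PAGES);
    maybe_cached[page] = 1;
}

/* Record that the page was cleaned and invalidated from the caches and
 * that the cached mapping is no longer used to access it. */
void note_cache_flushed(size_t page)
{
    assert(page < NUM_PAGES);
    maybe_cached[page] = 0;
}

/* Following the reasoning above, a page is treated as safe to hand out
 * uncached only if no cached alias has pulled it into the caches. A
 * cached mapping that exists but was never touched (the sigma0 case)
 * does not set the flag. */
int safe_for_uncached_dma(size_t page)
{
    assert(page < NUM_PAGES);
    return !maybe_cached[page];
}
```

With this model, a page that the roottask has read or written would be
reported as unsafe until it has been flushed, while a page that is only
mapped (but never touched) stays safe.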
Adam adam at os.inf.tu-dresden.de