Hi Gianluca,
did you also install (identity-mapped) page tables? Just enabling the I and C bits in SCTLR will not be enough. It looks like the Application Note has all the code for that. I do not know of anything that needs to be done differently here at the hypervisor level.
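To illustrate what an identity-mapped entry looks like, here is a minimal sketch, not the Application Note's code. It assumes a 4 KiB granule with 1 GiB level-1 block entries, and a MAIR_EL1 layout chosen for this example only (index 0 = Device-nGnRnE, index 1 = Normal write-back). All macro and function names are hypothetical. Execute-never bits and the TCR_EL1/TTBR0_EL1 setup are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MAIR_EL1 layout for this sketch (not taken from the Application Note). */
#define MAIR_ATTR_DEVICE_nGnRnE 0x00ull  /* Device-nGnRnE */
#define MAIR_ATTR_NORMAL_WB     0xFFull  /* Normal, inner/outer write-back, R/W allocate */
#define MAIR_IDX_DEVICE 0u
#define MAIR_IDX_NORMAL 1u

/* Lower attribute fields of a stage 1 block descriptor. */
#define DESC_BLOCK      0x1ull                   /* bits[1:0] = 0b01: block entry */
#define DESC_ATTRIDX(i) ((uint64_t)(i) << 2)     /* AttrIndx[2:0], selects a MAIR byte */
#define DESC_SH_INNER   (0x3ull << 8)            /* SH[1:0] = inner shareable */
#define DESC_AF         (1ull << 10)             /* access flag, avoids an access fault */

/* Value to program into MAIR_EL1 for the layout assumed above. */
static inline uint64_t mair_value(void)
{
    return (MAIR_ATTR_DEVICE_nGnRnE << (8 * MAIR_IDX_DEVICE)) |
           (MAIR_ATTR_NORMAL_WB     << (8 * MAIR_IDX_NORMAL));
}

/* Level-1 block descriptor mapping virtual GiB number idx onto the same
 * physical GiB (identity mapping). AP[2:1] is left at 0b00, i.e. read/write
 * at EL1 and no access from EL0. */
static inline uint64_t l1_block_identity(unsigned idx, unsigned attr_idx)
{
    assert(idx < 512);                   /* 512 entries per 4 KiB table */
    uint64_t pa   = (uint64_t)idx << 30; /* 1 GiB per level-1 entry */
    uint64_t desc = pa | DESC_BLOCK | DESC_ATTRIDX(attr_idx) | DESC_AF;
    if (attr_idx == MAIR_IDX_NORMAL)
        desc |= DESC_SH_INNER;
    return desc;
}
```

The region holding the stack would use MAIR_IDX_NORMAL, and peripherals MAIR_IDX_DEVICE. The M bit in SCTLR_EL1 is set only after the table, MAIR_EL1, TCR_EL1 and TTBR0_EL1 are in place.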
Adam
On Fri May 16, 2025 at 13:50:25 -0000, agaku03@gmail.com wrote:
Yes, the cache is enabled: I manually write SCTLR_EL1 with the enable bits for I and C set. While debugging I can see that some instructions are cached.
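For reference, a minimal sketch of how these SCTLR_EL1 bits can be decoded. The helper names are hypothetical and not from the Application Note. It relies on the architectural rule that with the stage 1 MMU off (M = 0) data accesses are treated as Device memory, so the C bit alone does not make them cacheable, whereas the I bit alone is enough for instruction fetches to be cached. Any hypervisor-side overrides are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_M (1ull << 0)   /* stage 1 MMU enable */
#define SCTLR_C (1ull << 2)   /* data cacheability control */
#define SCTLR_I (1ull << 12)  /* instruction cacheability control */

#if defined(__aarch64__)
/* Only valid at EL1 or higher; traps when executed at EL0. */
static inline uint64_t read_sctlr_el1(void)
{
    uint64_t v;
    __asm__ volatile("mrs %0, sctlr_el1" : "=r"(v));
    return v;
}
#endif

/* True if data accesses can be cached at all. Whether a given address
 * actually is cached then depends on its page table attributes. */
static inline bool data_caching_possible(uint64_t sctlr)
{
    return (sctlr & SCTLR_M) && (sctlr & SCTLR_C);
}

/* True if instruction fetches can be cached; does not require the MMU. */
static inline bool insn_caching_possible(uint64_t sctlr)
{
    return (sctlr & SCTLR_I) != 0;
}
```

With only I and C set, insn_caching_possible() is true while data_caching_possible() is false, which would match instructions being cached while loads from the stack stay slow.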
On further analysis, I noticed that the latency seems to occur when accessing the stack or other memory areas. As a test, I rewrote the time calculation function in assembly, and that version is considerably faster (from 4650 ticks down to 79 ticks).
Going deeper and disassembling the code containing the for loop, I noticed that the difference is that the assembly version does not use loads of the form ldr x0, [sp, #104], i.e. accesses to the stack.
The stack is a memory area placed by the linker and used as described in the manual (ARM DAI 0527A, Non-Confidential, Application Note: Bare-metal Boot Code for ARMv8-A Processors).
Is there something I need to manage on the hypervisor side, or settings I need to change, to optimise these accesses?