On Thursday, 29 September 2022 13:55:49 CEST Philipp Eppelt wrote:
Hi Paul,
I believe I have an idea of what you are observing.
00 00 00 00 ce e3 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
opcode ??? offset hot_spot
On a 32-bit architecture, the opcode is 32 bits wide and the ??? is 32 bits of padding before the offset, which is a 64-bit value (even on a 32-bit architecture) and needs to be aligned to 64 bits.
Right. I was in the process of writing a reply to your last message - thank you for sending that! - when I started to reconsider the way my own IPC framework operates, and I realised that I had made an embarrassing mistake with regard to structure member alignment, since I use structures to access message/call parameters. As you note, there are alignment constraints for structure members...
The client has to adhere to the platform specific alignment constraints of the members in the data structure, as does the server side.
...these being that even on a 32-bit platform, 64-bit members will be aligned to 64-bit boundaries, so padding is inserted after any preceding members that do not already fill the space up to that boundary. This is presumably dictated by the "alignment requirement" for each type mentioned in this document:
https://en.cppreference.com/w/c/language/object
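
As a rough illustration of the padding described above, here is a minimal C sketch. The structure and its member types are hypothetical: they are named after the fields in the dump (opcode, offset, hot_spot) and are not taken from the L4Re headers. The amount of padding reported depends on the target ABI, so the sketch only measures it rather than assuming a value.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter layout, named after the fields in the dump;
   the member types are assumptions, not the real L4Re definitions. */
struct map_params {
    uint32_t opcode;
    uint64_t offset;
    uint32_t hot_spot;
};

/* The C standard guarantees that a member starts at a multiple of its
   type's alignment requirement, whatever that requirement is here. */
static_assert(offsetof(struct map_params, offset) % alignof(uint64_t) == 0,
              "offset must satisfy the alignment of uint64_t");

/* Bytes of padding the compiler inserted between opcode and offset. */
size_t padding_before_offset(void)
{
    return offsetof(struct map_params, offset)
         - (offsetof(struct map_params, opcode) + sizeof(uint32_t));
}

/* Alignment the compiler requires for a 64-bit integer on this target. */
size_t offset_alignment(void)
{
    return alignof(uint64_t);
}
```

Where 64-bit integers are aligned to 8 bytes, padding_before_offset() returns 4, which corresponds to the "???" word in the dump. Code that instead assumes word alignment would look for the offset four bytes too early on such a 32-bit target.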
I had been assuming that the members would be word-aligned. However, since most of my IPC messaging was between peers that interpreted messages in the same way, this was not a general problem.
Now, it is interesting that you mention the alignment requirements in the context of IPC messages prepared by the L4Re RPC framework (found in pkg/l4re-core/l4sys/include/cxx). I would have expected the framework to be serialising the values and filling up message registers, as opposed to treating the message registers as a structure, just from briefly looking at it (and also considering the other IPC mechanisms in L4Re).
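
To make the distinction concrete, here is a minimal sketch of the serialising approach, where each value is written into consecutive machine words without any padding. This is not how the L4Re framework is implemented: word_t is only a stand-in for l4_umword_t, and put_u64/get_u64 are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for l4_umword_t (an assumption): one machine word. */
typedef uintptr_t word_t;

/* Write a 64-bit value into consecutive words starting at mr[pos].
   Returns the position following the value. No padding is inserted. */
size_t put_u64(word_t *mr, size_t pos, uint64_t value)
{
    assert(mr != NULL);
    if (sizeof(word_t) >= sizeof(uint64_t)) {
        mr[pos++] = (word_t)value;                  /* fits in one word */
    } else {
        mr[pos++] = (word_t)(value & 0xffffffffu);  /* low half first */
        mr[pos++] = (word_t)(value >> 32);          /* then high half */
    }
    return pos;
}

/* Read back a value written by put_u64, advancing *pos past it. */
uint64_t get_u64(const word_t *mr, size_t *pos)
{
    assert(mr != NULL && pos != NULL);
    if (sizeof(word_t) >= sizeof(uint64_t))
        return (uint64_t)mr[(*pos)++];
    uint64_t lo = mr[(*pos)++];
    uint64_t hi = mr[(*pos)++];
    return lo | (hi << 32);
}
```

With this scheme both peers only need to agree on the order of the values and on the word size, whereas overlaying a structure on the registers also makes the compiler's padding part of the protocol.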
From what I have understood, you are implementing a dataspace provider and are observing the provider generating the error (-39/-L4_EBADPROTO)?
EBADPROTO normally means that the protocol value in the l4_msgtag_t is not supported by the server. In case of an unsupported opcode, servers report L4_ENOSYS. Thus, I'm a bit confused about the EBADPROTO error code and the opcode/MR issue. What am I missing?
So, EBADPROTO would be generated by my server code upon receiving an opcode it doesn't understand. I suppose I am using the wrong error in this case: there's so much to learn about the conventions involved.
Going back to the original problem (on a 64-bit system), I did indeed get a "corrupt" opcode that caused my server code to return EBADPROTO. On a 32-bit system, what happens instead is that the opcode can be interpreted as a machine word (l4_umword_t), but the structure padding displaces the parameters.
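
One plausible reading of the "corrupt" opcode, sketched in C using the first eight bytes of the dump quoted earlier: if the sender wrote a 32-bit opcode and the receiver reads a whole 64-bit machine word, the four bytes after the opcode end up in the upper half of the value. The sketch assumes little-endian byte order, and read_le is a hypothetical helper, not an L4Re function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First eight bytes of the dump quoted earlier in this message. */
const uint8_t dump[8] = { 0x00, 0x00, 0x00, 0x00, 0xce, 0xe3, 0x08, 0x01 };

/* Interpret `width` bytes (at most 8) as a little-endian unsigned value. */
uint64_t read_le(const uint8_t *bytes, size_t width)
{
    uint64_t value = 0;
    assert(bytes != NULL && width <= 8);
    for (size_t i = 0; i < width; i++)
        value |= (uint64_t)bytes[i] << (8 * i);
    return value;
}
```

Here read_le(dump, 4) gives 0, the opcode as written, while read_le(dump, 8) gives 0x0108e3ce00000000, which a server comparing against small opcode constants would not recognise.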
I suppose what I have been clumsily trying to clarify (and dragging you into this) is what the alignment issues are for message parameters. Maybe I should have been reading some kind of ABI documentation, and I can certainly understand that alignment constraints would apply when treating message parameters like normal function call parameters, although that is also in the realm of a platform's ABI documentation.
Thanks once again for following up, and sorry to be a nuisance!
Paul