After some local discussions, I want to make some concluding remarks on this issue:
I see two possible solutions to the problem of missing parallelism in L4VFS / flips.
1. Solve it the way "term_server" does, i.e., use one distributor thread that receives and replies to all requests in flips. This thread simply forwards all work to a set of worker threads. This feature is also supported by our IDL compiler DICE. The distributor thread would then never block; it would only accept requests and send answers.
2. L4VFS supports a kind of session management (the connection interface): when a task contacts a server for the first time, it may ask the server for a new session and use the answer (a thread id) for all subsequent requests. This can be used for load balancing etc. Currently no server implements this (as far as I know), and the client side is not active (it is commented out in the source code). You could fix this at: l4vfs/lib/libc_backends/file_table/volumes.c:vol_resolve_thread_for_volume_id(). For your problem this would not yet be sufficient, as you need several sessions per task, that is, one per local thread of your client. This means you would have to extend this L4VFS mechanism to set up a new connection for each local thread, store this information locally, and use it for each remote function call. I would start with vol_resolve_thread_for_volume_id() and its equivalent in the socket_io backend.
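To make the per-thread session idea from point 2 a bit more concrete, here is a rough sketch of a thread-local session cache. This is not actual L4VFS code: the types, resolve_session(), the open_session callback, and the fake server function are all made up for illustration, and the real vol_resolve_thread_for_volume_id() may look quite different. It only shows the "ask once per local thread, store locally, reuse for later calls" logic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VOLUMES 16
#define NO_SESSION  (-1)

/* Hypothetical stand-ins; the real L4VFS types are different. */
typedef int volume_id_t;
typedef int server_thread_t;
typedef server_thread_t (*open_session_fn)(volume_id_t);

struct session_entry {
    int             used;
    volume_id_t     vol;
    server_thread_t thread;
};

/* One cache per local client thread, so no locking is needed. */
static _Thread_local struct session_entry session_cache[MAX_VOLUMES];

/* Return the server thread serving `vol` for the calling thread.
 * The server is asked for a session only on the first call. */
server_thread_t resolve_session(volume_id_t vol, open_session_fn open_session)
{
    struct session_entry *free_slot = NULL;

    assert(open_session != NULL);
    for (size_t i = 0; i < MAX_VOLUMES; i++) {
        if (session_cache[i].used && session_cache[i].vol == vol)
            return session_cache[i].thread;      /* cached session */
        if (!session_cache[i].used && free_slot == NULL)
            free_slot = &session_cache[i];
    }

    server_thread_t t = open_session(vol);       /* assumed remote call */
    if (t != NO_SESSION && free_slot != NULL) {
        free_slot->used   = 1;
        free_slot->vol    = vol;
        free_slot->thread = t;
    }
    return t;
}

/* Fake "server" used only to exercise the sketch. */
int fake_open_calls;

server_thread_t fake_open_session(volume_id_t vol)
{
    fake_open_calls++;
    return 100 + vol;
}
```

Calling resolve_session(3, fake_open_session) twice from the same thread contacts the fake server only once. A second local thread gets its own empty cache and therefore its own session.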
I personally would prefer the first approach, as I think it is more future-proof and arguably more elegant: exposing server-internal thread structures to clients is not the way forward.
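For illustration, the distributor/worker structure of the first approach could look roughly like the following. This is only a sketch using POSIX threads instead of L4 IPC, and it is neither the term_server code nor what DICE generates. All names (try_push, pop, run_distributor, the request struct) are invented, and the "work" is a placeholder. The point is that the distributor only enqueues without waiting, while the workers do whatever may block.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP   32
#define NUM_WORKERS 4

struct request { int client; int payload; };

struct queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct request  items[QUEUE_CAP];
    int head, count, closed;
};

static struct queue work = {
    .lock = PTHREAD_MUTEX_INITIALIZER, .nonempty = PTHREAD_COND_INITIALIZER
};
static pthread_mutex_t reply_lock = PTHREAD_MUTEX_INITIALIZER;
static int replies[QUEUE_CAP];          /* one reply slot per client id */

/* Non-blocking enqueue: returns 0 if the queue is full. */
static int try_push(struct request r)
{
    int ok = 0;
    pthread_mutex_lock(&work.lock);
    if (work.count < QUEUE_CAP) {
        work.items[(work.head + work.count) % QUEUE_CAP] = r;
        work.count++;
        ok = 1;
        pthread_cond_signal(&work.nonempty);
    }
    pthread_mutex_unlock(&work.lock);
    return ok;
}

/* Blocking dequeue for workers: returns 0 once closed and drained. */
static int pop(struct request *out)
{
    pthread_mutex_lock(&work.lock);
    while (work.count == 0 && !work.closed)
        pthread_cond_wait(&work.nonempty, &work.lock);
    int ok = work.count > 0;
    if (ok) {
        *out = work.items[work.head];
        work.head = (work.head + 1) % QUEUE_CAP;
        work.count--;
    }
    pthread_mutex_unlock(&work.lock);
    return ok;
}

static void *worker(void *arg)
{
    struct request r;
    (void)arg;
    while (pop(&r)) {
        int result = r.payload * 2;     /* placeholder for blocking work */
        pthread_mutex_lock(&reply_lock);
        replies[r.client] = result;
        pthread_mutex_unlock(&reply_lock);
    }
    return NULL;
}

/* Distributor: hands out n requests, then collects the replies.
 * Returns the sum of all replies (stands in for "sending answers"). */
int run_distributor(int n)
{
    pthread_t tids[NUM_WORKERS];
    int sum = 0;

    assert(n >= 0 && n <= QUEUE_CAP);
    pthread_mutex_lock(&work.lock);
    work.head = work.count = work.closed = 0;
    pthread_mutex_unlock(&work.lock);

    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0); (void)rc;
    }
    for (int c = 0; c < n; c++) {
        struct request r = { c, c + 1 };
        int ok = try_push(r);           /* cannot fail: n <= QUEUE_CAP */
        assert(ok); (void)ok;
    }
    pthread_mutex_lock(&work.lock);
    work.closed = 1;                    /* no more requests in this demo */
    pthread_cond_broadcast(&work.nonempty);
    pthread_mutex_unlock(&work.lock);

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tids[i], NULL);
    for (int c = 0; c < n; c++)
        sum += replies[c];
    return sum;
}
```

In a real server the distributor would of course loop forever on incoming IPC and send each reply as soon as a worker has finished it, instead of joining the workers. The clients would still see only one server thread, which is what I like about this variant.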
Regards, Martin