Hi KRAW,
It’s basically shared memory and lock-free queues. While that helps a lot with latency, we have been working on these topics for almost eight years, and there are a ton of things one can do wrong. For comparison, the first incarnation of iceoryx has a latency of around 1 microsecond in polling mode, while with iceoryx2 we achieve 100 nanoseconds on some systems.
The payload size on the queue is always 8 bytes, since we only push memory offsets into a shared memory segment through the queue; the data itself stays in place.
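To illustrate the idea (a toy sketch, not the real transport): only a fixed 8-byte offset ever crosses the queue, while the payload lives in the shared segment. Here a plain Vec<u8> stands in for the shared memory mapping and a VecDeque for the lock-free queue; both are hypothetical stand-ins.

```rust
use std::collections::VecDeque;
use std::mem::size_of;

// Stand-in for a shared memory segment; in the real system this is an
// OS shared memory mapping visible to both processes.
struct SharedSegment {
    bytes: Vec<u8>,
}

fn main() {
    let mut segment = SharedSegment { bytes: vec![0u8; 4096] };

    // The publisher writes the payload directly into the segment...
    let payload = b"hello subscriber";
    let offset: u64 = 128;
    segment.bytes[offset as usize..offset as usize + payload.len()]
        .copy_from_slice(payload);

    // ...but only the 8-byte offset travels through the queue.
    let mut queue: VecDeque<u64> = VecDeque::new();
    assert_eq!(size_of::<u64>(), 8); // the fixed "payload" of the queue
    queue.push_back(offset);

    // The subscriber pops the offset and reads the payload in place;
    // the payload bytes are never copied through the queue.
    let received = queue.pop_front().unwrap() as usize;
    let view = &segment.bytes[received..received + payload.len()];
    println!("{}", String::from_utf8_lossy(view));
}
```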
The trick is to reduce contention as much as possible and to keep cache locality. With iceoryx classic, we used MPMC queues to support multiple publishers on the same topic, and we used reference counting across process boundaries to free the memory chunks once they were no longer in use. With iceoryx2, we moved to SPSC queues, mainly to improve robustness, and solved the multi-publisher problem differently. Instead of reference counting across process boundaries for lifetime handling of the memory, we use SPSC completion queues to send the freed chunks back to the producer process. This massively reduced memory contention and made the whole transport mechanism simpler. There is a ton of other stuff going on to make all of this safe and to be able to recover memory from crashed applications.
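A minimal sketch of that completion-queue pattern, not iceoryx2's actual code: two unidirectional queues per producer/consumer pair, one carrying chunk offsets out and one bringing freed offsets back, so only the producer ever touches the free list and no cross-process reference count is needed. std::sync::mpsc channels (used single-producer/single-consumer here) stand in for the real lock-free SPSC queues.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Data queue: producer -> consumer, carries chunk offsets.
    let (data_tx, data_rx) = mpsc::channel::<usize>();
    // Completion queue: consumer -> producer, returns freed offsets.
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    let consumer = thread::spawn(move || {
        while let Ok(offset) = data_rx.recv() {
            // The consumer would read the payload at `offset` in shared
            // memory here; once done, it hands the chunk back instead of
            // decrementing a shared reference count.
            println!("consumed chunk at offset {offset}");
            if done_tx.send(offset).is_err() {
                break;
            }
        }
    });

    // The producer exclusively owns a small pool of chunk offsets.
    let mut free_chunks: Vec<usize> = vec![0, 64, 128, 192];
    for _ in 0..8 {
        // Reclaim chunks the consumer has finished with; no other
        // process ever mutates this free list.
        while let Ok(freed) = done_rx.try_recv() {
            free_chunks.push(freed);
        }
        if let Some(offset) = free_chunks.pop() {
            // Write the payload into shared memory at `offset`, then publish.
            data_tx.send(offset).unwrap();
        }
        // If the pool is empty, a real publisher would retry or report
        // that the subscriber is too slow.
    }
    drop(data_tx); // close the data queue so the consumer exits
    consumer.join().unwrap();
}
```

The nice property of this arrangement is that each queue has exactly one writer and one reader, so the heavily contended shared counters of the MPMC design disappear.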