The use case is multiple threads trying to submit operations to an io_uring SQ ring, using a rwlock and a local sq_tail that is synchronized with the tail shared with the kernel (sq_ktail).
Maybe we can submit to the SQ ring from multiple threads with reduced contention by using a rwlock instead of a mutex. The read lock would protect incrementing the local sq_tail (updated with a CAS) and preparing the SQE, while the write lock would protect updates to the sq_ktail shared with the kernel (updated with a