v0.7: Networking in POSIX vs. io_uring 💍
To show how different I/O approaches compare, this release adds a batch-asynchronous echo server and client on top of UDP, measuring the packet drop rate, throughput, and latency for:
- ASIO
- POSIX
- io_uring
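To make the measured quantities concrete, here is a minimal single-threaded sketch of one batched UDP echo round trip over loopback with plain POSIX sockets. It is not the benchmark code from this release: `batch_stats`, `open_loopback_socket`, and `run_batch` are hypothetical names, and the metric definitions (drop rate as lost replies over sent packets, mean packet latency as batch time over received replies) are assumptions.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical per-batch counters; the metric definitions below are assumptions.
struct batch_stats {
    std::size_t sent = 0;
    std::size_t received = 0;
    double batch_latency_us = 0;

    double drop_percent() const { return sent ? 100.0 * (sent - received) / sent : 0.0; }
    double mean_packet_latency_us() const { return received ? batch_latency_us / received : 0.0; }
};

// Opens a UDP socket on 127.0.0.1 with a kernel-chosen port and a receive timeout.
// Returns -1 on failure; on success `address` holds the bound address.
int open_loopback_socket(sockaddr_in &address, int timeout_ms) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    timeval timeout {};
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
    address = sockaddr_in {};
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    socklen_t length = sizeof(address);
    if (bind(fd, reinterpret_cast<sockaddr *>(&address), length) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr *>(&address), &length) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Sends a whole batch, echoes it back from a second socket, then collects the replies.
// Each receive loop stops at the first timeout, so a drop adds up to `timeout_ms` per loop.
batch_stats run_batch(std::size_t packets, std::size_t packet_size, int timeout_ms = 50) {
    assert(packet_size > 0);
    batch_stats stats;
    sockaddr_in server_address, client_address;
    int server_fd = open_loopback_socket(server_address, timeout_ms);
    int client_fd = open_loopback_socket(client_address, timeout_ms);
    if (server_fd >= 0 && client_fd >= 0) {
        std::vector<char> buffer(packet_size, 'x');
        auto start = std::chrono::steady_clock::now();
        // 1. The client submits the whole batch before reading anything back.
        for (std::size_t i = 0; i != packets; ++i)
            if (sendto(client_fd, buffer.data(), buffer.size(), 0,
                       reinterpret_cast<sockaddr const *>(&server_address),
                       sizeof(server_address)) == static_cast<ssize_t>(packet_size))
                ++stats.sent;
        // 2. The server echoes every packet that survived in its receive queue.
        for (std::size_t i = 0; i != stats.sent; ++i) {
            sockaddr_in peer {};
            socklen_t peer_length = sizeof(peer);
            ssize_t bytes = recvfrom(server_fd, buffer.data(), buffer.size(), 0,
                                     reinterpret_cast<sockaddr *>(&peer), &peer_length);
            if (bytes < 0) break; // timed out: the rest of the batch was dropped
            sendto(server_fd, buffer.data(), static_cast<std::size_t>(bytes), 0,
                   reinterpret_cast<sockaddr const *>(&peer), peer_length);
        }
        // 3. The client counts the replies that made it back.
        for (std::size_t i = 0; i != stats.sent; ++i) {
            if (recv(client_fd, buffer.data(), buffer.size(), 0) < 0) break;
            ++stats.received;
        }
        auto finish = std::chrono::steady_clock::now();
        stats.batch_latency_us = std::chrono::duration<double, std::micro>(finish - start).count();
    }
    if (server_fd >= 0) close(server_fd);
    if (client_fd >= 0) close(client_fd);
    return stats;
}
```

A call such as `run_batch(32, 512)` returns the counters for one batch. Because nothing drains the sockets while the batch is being sent, a large enough batch can overflow the socket receive buffer. Those lost packets show up in `drop_percent()`. A real server would run concurrently with the client.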
The numbers currently look like:
Running build_release/less_slow
Run on (6 X 4000.4 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 2048 KiB (x6)
L3 Unified 327680 KiB (x1)
Load Average: 0.93, 0.52, 0.47
----------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------
rpc_libc/loopback/min_time:2.000/manual_time 5514 us 2298 us 509 bytes_per_second=45.3389Mi/s drop,%=0 items_per_second=46.427k/s max_packet_latency,us=55 mean_batch_latency,us=5.51403k mean_packet_latency,us=21.5392
rpc_uring55/loopback/min_time:2.000/manual_time 1630 us 1591 us 1727 bytes_per_second=153.366Mi/s drop,%=0 items_per_second=157.046k/s max_packet_latency,us=1.822k mean_batch_latency,us=1.63009k mean_packet_latency,us=6.36754
rpc_asio/loopback/min_time:2.000/manual_time 89058 us 878 us 28 bytes_per_second=2.80717Mi/s drop,%=12.9325 items_per_second=2.87454k/s max_packet_latency,us=916 mean_batch_latency,us=89.0576k mean_packet_latency,us=399.553
The current example only uses the most basic io_uring features available with Linux kernel 5.5. In the next iterations (#30), we should extend it with the following functionality:
- `IORING_REGISTER_BUFFERS` - since 5.1
- `IORING_RECV_MULTISHOT` or `io_uring_prep_recvmsg_multishot` - since 6.0
- `IORING_OP_SEND_ZC` or `io_uring_prep_sendmsg_zc` - since 6.0
- `IORING_SETUP_SQPOLL` - with `IORING_FEAT_SQPOLL_NONFIXED` after 5.11
- `IORING_SETUP_SUBMIT_ALL` - since 5.18
- `IORING_SETUP_COOP_TASKRUN` - since 5.19
- `IORING_SETUP_SINGLE_ISSUER` - since 6.0
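Since each of these features depends on a minimum kernel version, one way to gate them at runtime is to parse the release string reported by `uname`. The sketch below is an illustration, not the detection logic used in this repository: `kernel_version`, `parse_kernel_release`, `at_least`, and `count_available` are hypothetical names, and the version table simply repeats the numbers listed above.

```cpp
#include <sys/utsname.h>

#include <cassert>
#include <cstddef>
#include <cstdio>

struct kernel_version {
    int major = 0;
    int minor = 0;
};

// Parses strings like "6.8.0-45-generic"; returns 0.0 if the prefix is not "<major>.<minor>".
kernel_version parse_kernel_release(char const *release) {
    assert(release != nullptr);
    kernel_version version;
    if (std::sscanf(release, "%d.%d", &version.major, &version.minor) != 2) version = kernel_version {};
    return version;
}

// Queries the running kernel; returns 0.0 if `uname` fails.
kernel_version running_kernel() {
    utsname info;
    return uname(&info) == 0 ? parse_kernel_release(info.release) : kernel_version {};
}

bool at_least(kernel_version have, int major, int minor) {
    return have.major > major || (have.major == major && have.minor >= minor);
}

struct uring_feature {
    char const *name;
    int major;
    int minor;
};

// Minimum versions copied from the list above; they are not verified here.
constexpr uring_feature planned_features[] = {
    {"IORING_REGISTER_BUFFERS", 5, 1},      {"IORING_RECV_MULTISHOT", 6, 0},
    {"IORING_OP_SEND_ZC", 6, 0},            {"IORING_FEAT_SQPOLL_NONFIXED", 5, 11},
    {"IORING_SETUP_SUBMIT_ALL", 5, 18},     {"IORING_SETUP_COOP_TASKRUN", 5, 19},
    {"IORING_SETUP_SINGLE_ISSUER", 6, 0},
};

// Counts how many of the planned features the given kernel version should offer.
std::size_t count_available(kernel_version have) {
    std::size_t count = 0;
    for (uring_feature const &feature : planned_features)
        if (at_least(have, feature.major, feature.minor)) ++count;
    return count;
}

// Prints one line per feature with a yes/no verdict for the given kernel version.
void print_feature_support(kernel_version have) {
    for (uring_feature const &feature : planned_features)
        std::printf("%-28s needs %d.%d: %s\n", feature.name, feature.major, feature.minor,
                    at_least(have, feature.major, feature.minor) ? "yes" : "no");
}
```

Calling `print_feature_support(running_kernel())` lists what the current machine should support. Version checks are only a heuristic, since distributions may backport or disable features. Probing the kernel directly, for example with liburing's `io_uring_get_probe`, is likely more reliable for opcodes.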
Feel free to join the development 🤗
Minor
- Add: io_uring variant for kernel 6.0 (ce73aa3)
- Add: `io_uring` draft (ec28b57)
- Add: External route networking (a2a8c9e)
- Add: POSIX `echo` implementation (3cce3b9)
- Add: ASIO "echo" server/client ping-pong (08d3326)
Patch
- Fix: Depend io_uring compilation on kernel version (70c53f6)
- Improve: `IOSQE_FIXED_FILE` for kernel 6.0+ (f7f7693)
- Improve: ASIO benchmarks (6be216a)
- Docs: Refactor spell-checks (24706a7)
- Make: Order spell-checks (1358a69)
- Docs: Recommend OpenBLAS (035e388)
- Improve: Avoid `std::format` in io_uring (1857a82)
- Fix: `ARCH_ENABLE_TAGGED_ADDR` needs Linux 6.2+ (3993b0c)
- Fix: Missing `openblas_set_num_threads` (3cab87d)
- Docs: Install libBLAS (7629609)
- Improve: `SO_ZEROCOPY` (fd4c9e2)
- Improve: Retrofit registering buffers in 5.5 (95de751)
- Make: `RelWithDebInfo` flags (2838fd5)
- Improve: Code styling on Windows (c9238a1)
- Fix: Avoid in-place increment (2c25b4d)
- Make: Disable CUDA by default (94879fd)
- Make: Matching `VERSION` in CMake (b4dc186)
- Improve: Detect Linux version (f3e91fa)
- Improve: `physical_cores` for Windows refactor (0eb985c)
- Docs: Future io_uring tasks (53c4ca6)
- Improve: `io_uring` optional timeouts (f933582)
- Make: Revert to default BLAS (b3e13dd)
- Improve: `io_uring` server logic (3dfe612)
- Fix: `liburing` example (a2a9d6c)
- Improve: Reuse benchmarking logic (cae4175)
- Improve: Manual IO timing (fc60bfd)
- Make: Switch to `PkgConfig` for `liburing` (b4e50ad)
- Fix: Compiling `asio` example (b041392)
- Make: Tag dependencies, where possible (8f2e985)
- Improve: Batching client/server requests (955be1d)
- Make: `liburing` & `asio` deps (7256f98)