The ATS Blog

Linux 6.4: Concurrent I/O Performance Improvements
April 25, 2023

The Linux Device Mapper has long been a fundamental part of the Linux kernel, letting users define virtual block devices that map onto underlying physical devices. However, as the need for high-performance concurrent I/O has become more pressing, certain limitations in the existing Device Mapper implementation have become apparent. In response, Red Hat has developed an enhancement that improves Linux 6.4 concurrent I/O performance when multiple threads access a thin device.

What is the Problem with Linux 6.4 Concurrent I/O Performance?

When multiple threads perform I/O to a thin device, the underlying dm_bufio object can become a bottleneck, slowing down access to the btree nodes that store thin metadata. In the previous implementation, each dm_bufio instance had a single mutex that was taken for every dm_bufio operation, so threads were serialized even when they only wanted buffers that were already in the cache.

How has Red Hat Improved Linux 6.4 Concurrent I/O Performance?

The Red Hat team has refactored the code, pulling out ‘lru’ (least recently used) and ‘buffer cache’ abstractions.

The new implementation updates the higher-level dm_bufio code to use these abstractions and handles the delicate locking requirements needed to provide finer-grained locking. As a result, concurrent I/O performance has improved significantly.
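To make "finer-grained locking" concrete, here is a minimal sketch of one common way to get it: splitting a cache into shards, each with its own lock, so that threads touching different shards do not wait on each other. This is an illustration in Python, not the kernel's C code. The class name `ShardedCache`, the shard count, and the modulo sharding rule are all my own assumptions.

```python
import threading


class ShardedCache:
    """Hypothetical buffer cache split into independently locked shards.

    Illustration only: dm_bufio is C code using kernel locking primitives.
    """

    def __init__(self, num_shards=16):
        # Each shard pairs its own lock with its own block -> buffer map.
        self._shards = [(threading.Lock(), {}) for _ in range(num_shards)]

    def _shard(self, block):
        # Assumed sharding rule: block number modulo the shard count.
        return self._shards[block % len(self._shards)]

    def get(self, block):
        lock, buffers = self._shard(block)
        with lock:  # only this shard is locked, not the whole cache
            return buffers.get(block)

    def insert(self, block, data):
        lock, buffers = self._shard(block)
        with lock:
            buffers[block] = data


# Usage: 32 threads insert concurrently; threads that land on
# different shards never contend for the same lock.
cache = ShardedCache(num_shards=4)
threads = [
    threading.Thread(target=cache.insert, args=(b, f"node-{b}"))
    for b in range(32)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single mutex, every `get` would queue behind every other `get`. With shards, a lookup waits only on lookups that hash to the same shard.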

Before this commit, dm_bufio used a global lru list to evict the oldest clean buffers from all clients. With the new locking approach, the team has replaced this with a do_global_cleanup() function that loops around the clients, asking each one to free buffers older than a specific time. The same commit also converts many old BUG_ONs to WARN_ON_ONCE, so a violated assumption produces a one-time warning rather than triggering a kernel BUG. That part is a robustness change rather than a performance one.
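The per-client cleanup idea can be sketched in a few lines. This is a Python illustration of the approach described above, not a port of do_global_cleanup(). The `Client` and `Buffer` classes, the `max_age` parameter, and the choice to skip dirty buffers are all assumptions of mine.

```python
class Buffer:
    def __init__(self, block, last_accessed, dirty=False):
        self.block = block
        self.last_accessed = last_accessed  # timestamp in seconds
        self.dirty = dirty


class Client:
    """Hypothetical stand-in for one dm_bufio client and its buffers."""

    def __init__(self, name):
        self.name = name
        self.buffers = {}

    def add(self, block, last_accessed, dirty=False):
        self.buffers[block] = Buffer(block, last_accessed, dirty)

    def evict_older_than(self, cutoff):
        """Free clean buffers last touched before `cutoff`; return the count."""
        stale = [
            b.block
            for b in self.buffers.values()
            if not b.dirty and b.last_accessed < cutoff
        ]
        for block in stale:
            del self.buffers[block]
        return len(stale)


def global_cleanup(clients, now, max_age):
    """Ask every client in turn to free buffers older than `max_age`."""
    cutoff = now - max_age
    return sum(client.evict_older_than(cutoff) for client in clients)


# Usage: with now=200 and max_age=60, the cutoff is 140.
thin = Client("thin-pool")
thin.add(1, last_accessed=100)             # clean and old: freed
thin.add(2, last_accessed=190)             # clean but recent: kept
thin.add(3, last_accessed=50, dirty=True)  # old but dirty: kept
snap = Client("snapshot")
snap.add(7, last_accessed=10)              # clean and old: freed
freed = global_cleanup([thin, snap], now=200, max_age=60)
print(freed)  # 2
```

The design point is that no global list is needed: each client looks only at its own buffers, so the cleanup does not need one lock shared by all clients.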

The performance results are promising.

While most dm_bufio operations have unchanged performance, the speed-up in cases where multiple threads attempt to get buffers concurrently, and the buffers are already in the cache, is significant.

For example, in one test, 16 ‘hotspot’ threads simulate btree lookups while another thread dirties the whole device. In that test, the hotspot threads acquired their buffers about 25 times faster.

What are the results of Red Hat’s Linux 6.4 Concurrent I/O Performance Improvements?

As a leading IT services company, The ATS Group provides cutting-edge solutions and tools, enabling businesses to save time and money. Once these kernel changes, which incorporate significant improvements to Linux Device Mapper performance for concurrent I/O, are available, I will use our proprietary tool, Galileo, to assess their impact on real-world infrastructure.

A word about Galileo for measuring Linux 6.4 concurrent I/O performance:

Galileo is a comprehensive enterprise-level monitoring solution that delivers health, connectivity, and capacity information at your fingertips for instant and simple visibility into your infrastructure and cloud. Unlike any other in the industry, Galileo goes beyond standard monitoring solutions, allowing IT teams to see what is relevant, increase speed to resolution, anticipate and adapt to system usage needs, and lower operational costs.

[Image: Galileo, the tool we're using to measure Linux 6.4 concurrent I/O performance]

To measure the success of Red Hat’s Linux 6.4 concurrent I/O performance improvements, I will use Galileo to track I/O throughput, IOPS, queueing, and latencies. Comparing those four metrics before and after the kernel change shows how much real-world impact it makes.
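For readers who want to see where numbers like these come from on a Linux host, here is a minimal sketch that derives them from two samples of a /proc/diskstats line. This is not how Galileo works internally. It is a hand-rolled illustration, and the function names, the device name, and the sample values are made up. The field positions follow the kernel's documented /proc/diskstats layout, which counts sectors in 512-byte units.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def parse_diskstats_line(line):
    """Pick the counters we need out of one /proc/diskstats line."""
    f = line.split()
    return {
        "reads": int(f[3]),             # reads completed
        "sectors_read": int(f[5]),
        "read_ms": int(f[6]),           # time spent reading (ms)
        "writes": int(f[7]),            # writes completed
        "sectors_written": int(f[9]),
        "write_ms": int(f[10]),         # time spent writing (ms)
        "weighted_io_ms": int(f[13]),   # weighted time doing I/O (ms)
    }


def io_metrics(before, after, interval_s):
    """Turn two counter samples taken `interval_s` apart into rates."""
    d = {key: after[key] - before[key] for key in before}
    ios = d["reads"] + d["writes"]
    sectors = d["sectors_read"] + d["sectors_written"]
    return {
        "iops": ios / interval_s,
        "throughput_mib_s": sectors * SECTOR_BYTES / interval_s / 2**20,
        "avg_latency_ms": (d["read_ms"] + d["write_ms"]) / ios if ios else 0.0,
        "avg_queue_depth": d["weighted_io_ms"] / (interval_s * 1000),
    }


# Made-up samples for a hypothetical dm-0 device, taken 10 seconds apart.
sample_1 = "253 0 dm-0 1000 0 80000 500 2000 0 160000 4000 0 3000 4500"
sample_2 = "253 0 dm-0 3000 0 161920 1500 5000 0 282880 8000 0 8000 9500"
metrics = io_metrics(
    parse_diskstats_line(sample_1),
    parse_diskstats_line(sample_2),
    interval_s=10,
)
print(metrics)
# {'iops': 500.0, 'throughput_mib_s': 10.0,
#  'avg_latency_ms': 1.0, 'avg_queue_depth': 0.5}
```

Taking the same measurements on the same workload before and after a kernel upgrade gives a like-for-like comparison.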


The Red Hat team’s work on improving Linux Device Mapper performance for concurrent I/O is a significant development. By refactoring the code and introducing new abstractions, they have achieved better concurrency and considerably better performance in specific scenarios. This enhancement will be particularly beneficial for users who rely on high-performance concurrent I/O, and it contributes to the ongoing optimization of the Linux kernel.

Are you prepared to benefit from the latest enhancements to the Linux Device Mapper, which significantly improve concurrent I/O performance?

It’s time to unlock your IT organization’s full potential by optimizing performance and capacity with Galileo, our cutting-edge IT monitoring tool. As Linux and other operating systems continue to evolve, it’s crucial to stay ahead by consistently measuring the performance of your infrastructure.

In today’s OPEX-driven world, having a deep understanding of your system’s performance is more important than ever. With Galileo, you’ll gain invaluable insights into crucial metrics such as I/O throughput, queueing, and latencies, enabling you to make informed decisions and effectively adapt to the ever-changing IT landscape. By leveraging the power of Galileo, you can ensure your business remains agile and efficient, keeping you ahead of the competition in an increasingly demanding technological environment.

Written by Andy Wojnarek
Andy is a seasoned IT professional with 18 years of experience in the industry. He currently serves as a Principal Solutions Architect, where he leverages his expertise to take the art of the possible and make it a reality for customers. Andy is highly skilled in monitoring and data analytics, with 10 years of experience managing thousands of hosts for clients and 5 years of experience managing a 3+ petabyte Splunk implementation. He holds a wide variety of IT certificates and is fluent in several programming languages, including Shell, Ruby, Golang, Lua, and Python. Andy has a proven track record as a technical leader and has successfully led large technical teams to satisfy customer requirements. He has also managed performance and capacity for massive parallel compute systems, including enterprise-wide capacity forecasting. He is passionate about his work and dedicated to delivering the best outcomes for his clients.