Issue | JNWPU, Volume 42, Number 2, April 2024
---|---
Page(s) | 303 - 309
DOI | https://doi.org/10.1051/jnwpu/20244220303
Published online | 30 May 2024
Research and implementation of asynchronous compaction mechanism of distributed database based on LSM-Tree
School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
Received: 6 April 2023
With the continuous development of information technology, distributed databases have become a research hotspot. Because distributed databases built on NoSQL architectures offer limited SQL support and have shortcomings in transaction processing and consistency, NewSQL databases based on LSM-Tree, such as TiDB and OceanBase, are gradually becoming mainstream in applications. The distributed LSM-Tree storage architecture divides data into baseline data and incremental data. Through the compaction operation, the incremental data of different partitions is continuously merged with the baseline data and stored on disk, thereby reducing memory pressure. However, compaction occupies a large amount of system resources and seriously affects system availability. This paper proposes an asynchronous compaction mechanism based on the LSM-Tree architecture. By subdividing the compaction process, data merging is made asynchronous, which effectively shortens the time of a single compaction operation. Experiments show that the proposed asynchronous compaction mechanism can significantly shorten data merging time and improve the robustness and availability of the system in high-frequency write scenarios.
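The baseline/incremental split and the idea of subdividing compaction can be illustrated with a minimal Python sketch. It is not the mechanism proposed in the paper, whose details the abstract does not give. It is a generic single-partition toy under stated assumptions: the class `TinyLSMStore` and its methods `freeze`, `merge`, and `compact_async` are hypothetical names, plain dictionaries stand in for on-disk baseline data, and deletions (tombstones) are not handled. The compaction is split into a short freeze step, a long merge step that runs in a background thread, and a short publish step, so writes continue while the merge is running.

```python
import threading


class TinyLSMStore:
    """Toy single-partition store: baseline data plus in-memory increments.

    Hypothetical illustration only; not the paper's implementation.
    """

    def __init__(self):
        self.baseline = {}   # stands in for merged, on-disk baseline data
        self.active = {}     # incremental data receiving new writes
        self.frozen = None   # incremental data currently being merged
        self.lock = threading.Lock()

    def put(self, key, value):
        with self.lock:
            self.active[key] = value

    def get(self, key):
        # Newest data wins: active, then frozen, then baseline.
        with self.lock:
            for layer in (self.active, self.frozen or {}, self.baseline):
                if key in layer:
                    return layer[key]
        return None

    def freeze(self):
        """Step 1 (short, under lock): swap in a fresh active table."""
        with self.lock:
            if self.frozen is not None:
                return False  # a merge is already in progress
            self.frozen, self.active = self.active, {}
            return True

    def merge(self):
        """Step 2 (long, no lock held): build the new baseline on the side."""
        merged = dict(self.baseline)
        merged.update(self.frozen)  # increments override baseline values
        new_baseline = dict(sorted(merged.items()))
        # Step 3 (short, under lock): publish the result.
        with self.lock:
            self.baseline = new_baseline
            self.frozen = None

    def compact_async(self):
        """Run the merge in a background thread; return it, or None if busy."""
        if not self.freeze():
            return None
        worker = threading.Thread(target=self.merge)
        worker.start()
        return worker


store = TinyLSMStore()
store.put("a", 1)
store.put("b", 2)
worker = store.compact_async()
store.put("a", 10)   # accepted whether or not the merge has finished
worker.join()
print(store.get("a"), store.get("b"))  # 10 2
```

In this sketch only the freeze and publish steps hold the lock, so the time during which writes are blocked does not grow with the size of the baseline data. A real distributed system would additionally need per-partition coordination, durability, and failure handling, none of which is modeled here.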
Key words: distributed database / LSM-Tree / data merging / asynchronous compaction / data partitioning
© 2024 Journal of Northwestern Polytechnical University. All rights reserved.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.