posted on 2020-08-28, 09:21authored byWen Cheng, Tao Guo, Lingfang Zeng, Yang Wang, Lars NagelLars Nagel, Tim Süß, André Brinkmann
LSM-trie-based key-value (KV) store is often used to manage an ultralarge dataset in reality by introducing a number of sublevels at each level, its linear growth pattern can fairly reduce the write amplification in store operations. Although this design is effective for the write operation, the last level holds a large proportion of KV items, leading to the extreme imbalance of data distribution. Therefore, to support efficient read, we need to carefully consider this imbalance. On the other hand, to ensure that acquired data is latest, the LSM-trie needs to search the dataset at different levels one by one, and this search method may take a lot of unnecessary time. When the number of items is ultralarge, the random lookup performance may be poor due to the imbalance data distribution. To address this issue, we improve the read performance of the LSM-trie by changing its serial search to parallel search, using two threads to simultaneously search at the last level and other levels, respectively. Our experiment results show that the read performance of the LSM-trie can be improved up to 98.35% and on average 71.55%.
Funding
National Natural Science Foundation of China. Grant Number: 61472153, 61672513, 61572377.
National Science Foundation of Hubei Province. Grant Number: 2015CFB192.
Science and Technology Planning Project of Guangdong Province. Grant Number: 2019B010137002.
the Key‐Area Research and Development Program of Guangdong Province. Grant Number: 2020B010164002.
This is the peer reviewed version of the following article: CHENG, W. ... et al, 2020. Improving LSM‐trie performance by parallel search. Software: Practice and Experience, 50 (10), pp.1952-1965, which has been published in final form at https://doi.org/10.1002/spe.2875. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.