Journal Article Details
IEEE Access
A New Adaptive Coding Selection Method for Distributed Storage Systems
Wei Wei1  Bing Wei2  Li-Min Xiao2  Yao Song3  Bing-Yu Zhou4 
[1] School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China; State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Keywords: Erasure codes; access characteristics; storage overhead; reconstruction cost
DOI: 10.1109/ACCESS.2018.2801265
Source: DOAJ
【 Abstract 】

Erasure codes, such as Reed-Solomon (RS) codes and local reconstruction codes (LRCs), are increasingly adopted in distributed storage systems because they offer lower redundancy than data replication. While these codes save significant storage space, they can incur large I/O overhead and network traffic when reconstructing unavailable data. Most existing storage systems use replication for hot data and an erasure code for warm and cold data, thereby achieving a good tradeoff between storage overhead and recovery performance. However, these systems do not take the access characteristics of data into account and tend to use only a single erasure code, which limits how far storage overhead and recovery cost can be reduced. In this paper, we propose a new adaptive coding selection method that instead uses multiple LRCs for warm data. The LRCs are selected based on the access characteristics of the data. Each time a file is accessed, we assume that each of the involved data blocks is unavailable in turn and calculate, for each candidate LRC, the I/O cost of recovering that block. These I/O costs are summed for each LRC, and the LRC with the minimal total I/O cost is selected for warm data. For cold data, we use an RS code that is optimized for storage overhead to reduce the storage burden. Our method is implemented on top of the Hadoop distributed file system. Evaluations show that it reduces the storage overhead by up to 5% and the reconstruction traffic by up to 22%.
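
The selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `LRC` class, the function names, and especially the cost model are hypothetical. The assumed model rebuilds a lost data block from the other data blocks in its local group plus one local parity block, and counts only the blocks the access would not have read anyway. It also assumes `k` is divisible by `group_size` and that blocks are numbered consecutively across stripes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LRC:
    """Hypothetical description of one candidate LRC."""
    name: str
    k: int           # data blocks per stripe
    group_size: int  # data blocks per local group (one local parity per group)


def recovery_cost(code, accessed, lost):
    """Extra blocks read to rebuild `lost` during an access reading `accessed`.

    Assumed cost model (not taken from the paper): group members already read
    by the access are free; the local parity block always costs one read.
    """
    stripe, offset = divmod(lost, code.k)
    start = stripe * code.k + (offset // code.group_size) * code.group_size
    helpers = set(range(start, start + code.group_size)) - {lost}
    return len(helpers - accessed) + 1


def total_cost(code, accesses):
    """Sum the recovery cost over every block of every recorded access,
    treating each accessed block as unavailable in turn."""
    return sum(
        recovery_cost(code, set(blocks), lost)
        for blocks in accesses
        for lost in blocks
    )


def select_lrc(candidates, accesses):
    """Return the candidate with the minimal total I/O cost.

    min() keeps the earliest candidate on ties, so the caller can order
    candidates by increasing storage overhead to prefer the cheaper code.
    """
    return min(candidates, key=lambda code: total_cost(code, accesses))


wide = LRC("wide", k=12, group_size=6)      # fewer local parities
narrow = LRC("narrow", k=12, group_size=3)  # more local parities

small_random_reads = [[0], [7]]
full_group_read = [[0, 1, 2, 3, 4, 5]]

print(select_lrc([wide, narrow], small_random_reads).name)  # narrow
print(select_lrc([wide, narrow], full_group_read).name)     # wide
```

Under this assumed model, small scattered reads favor the code with smaller local groups, while reads that already cover a whole group make both candidates equally cheap to repair, so the tie goes to the first (here, lower-overhead) candidate.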

【 License 】

Unknown   
