Technical Report Details
Improving Multi-Node Deduplication Performance for Interleaved Data via Sticky-Auction Routing
Eshghi, Kave ; Lillibridge, Mark ; Bhagwat, Deepavali ; Watkins, Mark
HP Development Company
Keywords: deduplication; routing; load-balancing
RP-ID: HPL-2015-77
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

High-capacity, high-throughput, chunk-based inline deduplication systems for backup have been commercially successful, but scaling them out has proved challenging. In such multi-node systems, data must be routed at a large enough granularity to sustain locality at the back ends. Two routing algorithms, Min Hash and Auction, have been put forth for this purpose. We demonstrate that these algorithms perform poorly on interleaved data, which arises when multiple streams are multiplexed into a single high-speed stream to speed up backups. Of particular commercial importance, database backup procedures, in which multiple threads read database files in parallel, produce such interleaved data. We present a new routing algorithm, Sticky Auction routing, that, unlike existing algorithms, handles interleaved data with little deduplication loss. It also achieves comparable or better deduplication performance for non-interleaved data and good load balancing, especially in the typical case where multiple streams are used.
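
The abstract does not spell out how Min Hash routing works, so the following is only a minimal sketch under a common reading of the idea: a segment of chunks is sent to the node selected by the smallest chunk fingerprint in that segment. The function names, the use of SHA-1, the fixed segment size, and the round-robin interleaving are all assumptions made for illustration and are not taken from the report. The sketch also shows why interleaving can hurt such content-based routing: multiplexing two streams changes which chunks share a segment, so a segment seen in an earlier, non-interleaved backup may never reappear.

```python
import hashlib


def chunk_hash(chunk: bytes) -> int:
    """Fingerprint a chunk as an integer (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(chunk).digest(), "big")


def min_hash_route(segment: list[bytes], num_nodes: int) -> int:
    """Pick a back-end node from the smallest chunk fingerprint in the segment.

    Hypothetical helper: the report's actual Min Hash routing may differ.
    """
    if not segment:
        raise ValueError("segment must contain at least one chunk")
    return min(chunk_hash(c) for c in segment) % num_nodes


def split_into_segments(chunks: list, size: int) -> list[list]:
    """Cut a chunk sequence into fixed-size routing segments (assumed policy)."""
    return [chunks[i:i + size] for i in range(0, len(chunks), size)]


def interleave(a: list, b: list) -> list:
    """Multiplex two equal-length streams chunk by chunk (round-robin)."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out


# Two toy streams of eight chunks each.
stream_a = [f"a{i}".encode() for i in range(8)]
stream_b = [f"b{i}".encode() for i in range(8)]

# Segments when each stream is backed up on its own.
separate = split_into_segments(stream_a, 4) + split_into_segments(stream_b, 4)
# Segments when the two streams are multiplexed into one.
mixed = split_into_segments(interleave(stream_a, stream_b), 4)

separate_sets = {frozenset(s) for s in separate}
mixed_sets = {frozenset(s) for s in mixed}

# Every interleaved segment mixes chunks from both streams, so no segment
# matches one produced by the non-interleaved backups.
shared_segments = separate_sets & mixed_sets
routes = [min_hash_route(s, num_nodes=4) for s in mixed]
```

Because the routing decision depends on which chunks happen to fall in the same segment, the same data can be sent to different nodes once it is interleaved. That is the kind of deduplication loss the abstract attributes to existing algorithms.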
