Technical Report

[Abstract]

High-capacity, high-throughput, chunk-based inline deduplication systems for backup have been commercially successful, but scaling them out has proved challenging. In such multi-node systems, data must be routed at a large enough granularity to sustain locality at the back ends. Two routing algorithms, Min Hash and Auction, have been put forth for this purpose. We demonstrate that these algorithms perform poorly on interleaved data. Interleaved data occurs when multiple streams are multiplexed into a single high-speed stream to speed up backups. Of particular commercial importance, database backup procedures, in which multiple threads read database files in parallel, produce such interleaved data. We present a new routing algorithm, Sticky Auction routing, that, unlike existing algorithms, handles interleaved data with little deduplication loss. It also achieves comparable or better deduplication performance for non-interleaved data and good load balancing, especially when multiple streams are used, which is the typical case.
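
As a rough illustration of the setting the abstract describes, the following Python sketch multiplexes several chunk streams into one and routes fixed-size segments to a node by the minimum chunk hash. It is only a sketch under stated assumptions: the round-robin interleaving, the fixed segment size, the use of SHA-1, the "minimum hash modulo node count" rule, and all function names are hypothetical choices made here for illustration. The report's own definitions of Min Hash, Auction, and Sticky Auction routing are not reproduced.

```python
import hashlib
from itertools import zip_longest


def chunk_hash(chunk: bytes) -> int:
    """Hash a chunk to an integer (SHA-1 is an assumption, not the report's choice)."""
    return int.from_bytes(hashlib.sha1(chunk).digest(), "big")


def interleave(streams):
    """Multiplex several chunk streams into one, round-robin (assumed policy)."""
    out = []
    for group in zip_longest(*streams):
        # zip_longest pads exhausted streams with None; skip the padding.
        out.extend(c for c in group if c is not None)
    return out


def segments(chunks, size):
    """Cut a chunk sequence into consecutive segments of at most `size` chunks."""
    return [chunks[i:i + size] for i in range(0, len(chunks), size)]


def min_hash_route(segment, num_nodes):
    """Pick a node from the smallest chunk hash in the segment (illustrative rule)."""
    return min(chunk_hash(c) for c in segment) % num_nodes
```

In this sketch, a segment cut from the interleaved stream mixes chunks from several sources, so its minimum hash can differ from that of any segment cut from a single source stream. That gives one intuition for why routing decisions based on segment content may lose locality on interleaved data.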

Improving Multi-Node Deduplication Performance for Interleaved Data via Sticky-Auction Routing

Eshghi, Kave ; Lillibridge, Mark ; Bhagwat, Deepavali ; Watkins, Mark
HP Development Company
Keywords: deduplication; routing; load-balancing
RP-ID : HPL-2015-77
Subject classification: Computer Science (General)
United States | English
Source: HP Labs