Thesis Details
Cache-Oblivious Searching and Sorting in Multisets
Farzan, Arash
University of Waterloo
Keywords: Computer Science; Memory Hierarchies; Cache-Oblivious model; Multisets; Determining the mode; Duplicate Elimination; Sorting
Others  :  https://uwspace.uwaterloo.ca/bitstream/10012/1019/1/afarzan2004.pdf
Switzerland | English
Source: UWSPACE Waterloo Institutional Repository
【 Abstract 】

We study three problems related to searching and sorting in multisets in the cache-oblivious model: finding the most frequent element (the mode), duplicate elimination, and finally multi-sorting. We are interested in minimizing the cache complexity (or number of cache misses) of algorithms for these problems in the context under which the cache size and block size are unknown.

We start by showing the lower bounds in the comparison model. Then we present the lower bounds in the cache-aware model, which are also the lower bounds in the cache-oblivious model. We consider the input multiset of size $N$ with multiplicities $N_1, \ldots, N_k$. The lower bound for the cache complexity of determining the mode is $\Omega\left(\frac{N}{B} \log_{\frac{M}{B}} \frac{N}{fB}\right)$, where $f$ is the frequency of the mode and $M$, $B$ are the cache size and block size respectively. Cache complexities of duplicate removal and multi-sorting have lower bounds of $\Omega\left(\frac{N}{B} \log_{\frac{M}{B}} \frac{N}{B} - \sum_{i=1}^{k} \frac{N_i}{B} \log_{\frac{M}{B}} \frac{N_i}{B}\right)$.

We present two deterministic approaches to give algorithms: selection and distribution. The algorithms with these deterministic approaches differ from the lower bounds by at most an additive term of $\frac{N}{B} \log\log M$. However, since $\log\log M$ is very small in real applications, the gap is tiny. Nevertheless, the ideas of our deterministic algorithms can be used to design cache-aware algorithms for these problems. The algorithms turn out to be simpler than the previously-known cache-aware algorithms for these problems.

Another approach to design algorithms for these problems is the probabilistic approach. In contrast to the deterministic algorithms, our randomized cache-oblivious algorithms are all optimal and their cache complexities exactly match the lower bounds.

All of our algorithms are within a constant factor of optimal in terms of the number of comparisons they perform.
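
To make the two lower-bound expressions in the abstract concrete, the following is a minimal Python sketch that evaluates the quantities inside the $\Omega(\cdot)$ for a given input multiset. It is an illustration only and is not taken from the thesis: the function names (`log_mb`, `mode_bound`, `multisort_bound`) and the sample parameters are assumptions, the constants hidden by $\Omega$ are ignored, and the values are only meaningful asymptotically (for small inputs the expressions can be zero or negative).

```python
import math
from collections import Counter


def log_mb(x, M, B):
    """Logarithm of x to base M/B (hypothetical helper; requires M > B)."""
    return math.log(x) / math.log(M / B)


def mode_bound(items, M, B):
    """Evaluate (N/B) * log_{M/B}(N / (f*B)), the expression inside the
    Omega for determining the mode; f is the frequency of the mode."""
    N = len(items)
    f = max(Counter(items).values())
    return (N / B) * log_mb(N / (f * B), M, B)


def multisort_bound(items, M, B):
    """Evaluate (N/B) log_{M/B}(N/B) - sum_i (N_i/B) log_{M/B}(N_i/B),
    the expression inside the Omega for duplicate removal and multi-sorting."""
    N = len(items)
    total = (N / B) * log_mb(N / B, M, B)
    for n_i in Counter(items).values():  # multiplicities N_1, ..., N_k
        total -= (n_i / B) * log_mb(n_i / B, M, B)
    return total


if __name__ == "__main__":
    # Assumed sample input: 64 distinct values, each repeated 64 times.
    items = [i % 64 for i in range(4096)]
    print(mode_bound(items, M=64, B=4))
    print(multisort_bound(items, M=64, B=4))
```

With equal multiplicities, as in the sample input, the second expression simplifies to $\frac{N}{B} \log_{\frac{M}{B}} k$, which shows how the bound shrinks as the number of distinct elements $k$ decreases.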

【 Preview 】
Attachment list
Files Size Format View
Cache-Oblivious Searching and Sorting in Multisets 319KB PDF download