In on-line analytical processing (OLAP), precomputing auxiliary data aggregations (materializing them as views) and indexing them is a common way of reducing query-evaluation time (cost) for important data-analysis queries. We consider the OLAP view- and index-selection problem as an optimization problem, where (i) the input includes the data-warehouse schema, a set of data-analysis queries of interest, and a storage-limit constraint, and (ii) the output is a set of views and indexes that minimizes the total cost of evaluating the input queries, subject to the storage limit. While greedy and other heuristic strategies for choosing views or indexes might have some success in reducing the cost, it is highly nontrivial to arrive at a globally optimal solution, one that reduces the processing cost of typical OLAP queries as much as is theoretically possible.
In this dissertation we present a systematic study of the OLAP view- and index-selection problem. Our specific contributions are: (1) we introduce an integer-programming model for the OLAP view- and index-selection problem; (2) we develop an algorithm that effectively and efficiently prunes the space of potentially beneficial views and indexes, and we provide formal proofs that the pruning keeps at least one globally optimal solution in the search space, so that the resulting integer-programming model is guaranteed to find an optimal solution; this allows us to solve realistic-size instances of the problem within reasonable execution time; (3) we develop a family of algorithms that further reduce the size of the search space so that we can solve larger instances of the problem, although we no longer guarantee global optimality of the resulting solution; and (4) we present an experimental comparison of our proposed approach with other approaches discussed in the open literature.
Our experiments show that our proposed approach to view and index selection results in high-quality solutions; in fact, it finds globally optimal solutions for many realistic-size problem instances. Thus, it compares favorably with the well-known OLAP-centered approach of [13] and provides for a winning combination with the end-to-end framework of [2] for generic view and index selection.
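To make the problem statement concrete, the following minimal sketch solves a toy instance of view and index selection by exhaustive enumeration. It is not the integer-programming model or the pruning algorithm of the dissertation. The structure names, sizes, and costs are invented for illustration, and the cost model (each query is answered by its cheapest selected structure, and an index is usable only if its view is also materialized) is an assumption made for this example.

```python
from itertools import combinations

# Hypothetical toy instance: every name and number below is invented.
BASE_COST = {"q1": 100, "q2": 100, "q3": 100}  # cost of answering from the raw data
SIZE = {"v_ab": 40, "v_a": 10, "i_ab": 15}     # storage used by each view or index
PARENT = {"i_ab": "v_ab"}                      # index -> view it is built on
COST = {                                       # cost of a query when it uses a structure
    "v_ab": {"q1": 40, "q2": 40, "q3": 40},
    "v_a": {"q1": 10},
    "i_ab": {"q2": 5},
}


def total_cost(chosen):
    """Sum, over all queries, the cheapest cost available from the chosen set."""
    total = 0
    for query, base in BASE_COST.items():
        best = base
        for structure in chosen:
            best = min(best, COST[structure].get(query, base))
        total += best
    return total


def feasible(chosen, limit):
    """Check the storage limit and that every chosen index has its view."""
    if sum(SIZE[s] for s in chosen) > limit:
        return False
    return all(PARENT[s] in chosen for s in chosen if s in PARENT)


def solve_exact(limit):
    """Enumerate every subset of structures and keep the cheapest feasible one."""
    names = sorted(SIZE)
    best_set, best_cost = frozenset(), total_cost(())
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            if feasible(combo, limit):
                cost = total_cost(combo)
                if cost < best_cost:
                    best_set, best_cost = frozenset(combo), cost
    return best_set, best_cost


if __name__ == "__main__":
    for limit in (0, 50, 55):
        print(limit, solve_exact(limit))
```

Enumeration is guaranteed to find a globally optimal set for the toy instance, but the number of subsets doubles with each added candidate. That growth is why a realistic-size instance calls for a smaller search space and a dedicated solver, as in contributions (1) to (3).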
Exact and Inexact Methods for Selecting Views and Indexes for OLAP Performance Improvement