Journal Article Details
International Journal of Computational Intelligence Systems
Materialized View Selection Based on Adaptive Genetic Algorithm and Its Implementation with Apache Hive
Keywords: materialized view; multi-dimensional lattice; genetic algorithm; cost model; adaptive; Apache Hive
DOI: 10.1080/18756891.2015.1113744
Source: DOAJ
【Abstract】

Frequently accessed views in data warehouses are usually materialized to speed up queries over big data. However, view materialization itself incurs huge costs. Moreover, some recent non-traditional data warehouse products, such as Apache Hive, still lack support for materialized views. To select appropriate views to materialize at the lowest possible cost, we propose a novel approach to the materialized view selection problem based on an adaptive genetic algorithm. We establish a cost model that integrates query, maintenance, and storage costs to evaluate the performance of the approaches and to measure the fitness of an individual in the genetic algorithm. In addition, we introduce adjustable factors for the crossover and mutation probabilities, allowing the genetic algorithm to run quickly and avoid premature convergence. We also conduct extensive experiments on its implementation with Apache Hive, which queries and manages large datasets residing in distributed storage. Both the simulation results and the experiments on Apache Hive show that an approximately optimal solution for selecting materialized views can be obtained effectively using the presented approach.
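
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simplified cost model (frequency-weighted query cost plus maintenance and storage cost for each materialized view, with no lattice dependencies between views) and a linear fitness-based rule for adapting the crossover and mutation probabilities. The `VIEWS` figures, the function names, and the probability bounds are all hypothetical.

```python
import random

# Hypothetical per-view figures (not from the paper):
# (query frequency, query cost from base data, query cost from the
#  materialized view, maintenance cost, storage cost)
VIEWS = [
    (10, 100.0, 10.0, 50.0, 20.0),
    (1, 80.0, 8.0, 200.0, 60.0),
    (5, 60.0, 6.0, 30.0, 10.0),
    (2, 40.0, 30.0, 90.0, 40.0),
]


def total_cost(bits):
    """Assumed cost model: query cost plus maintenance and storage of chosen views."""
    cost = 0.0
    for chosen, (freq, raw, mat, maint, store) in zip(bits, VIEWS):
        cost += freq * (mat if chosen else raw)
        if chosen:
            cost += maint + store
    return cost


def fitness(bits):
    """Lower total cost means higher fitness."""
    return 1.0 / total_cost(bits)


def adaptive_prob(f, f_avg, f_max, p_high, p_low):
    """Assumed adaptive rule: below-average individuals get p_high;
    above-average ones scale linearly down to p_low at the best fitness."""
    f = min(f, f_max)  # clamp offspring that already beat the old best
    if f < f_avg or f_max == f_avg:
        return p_high
    return p_low + (p_high - p_low) * (f_max - f) / (f_max - f_avg)


def evolve(pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    n = len(VIEWS)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = min(pop, key=total_cost)
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        f_avg, f_max = sum(fits) / len(fits), max(fits)
        children = []
        while len(children) < pop_size:
            # Roulette-wheel selection of two parents (copied before editing).
            a, b = rng.choices(pop, weights=fits, k=2)
            a, b = a[:], b[:]
            pc = adaptive_prob(max(fitness(a), fitness(b)), f_avg, f_max, 0.9, 0.6)
            if rng.random() < pc:
                cut = rng.randint(1, n - 1)  # single-point crossover
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                pm = adaptive_prob(fitness(child), f_avg, f_max, 0.1, 0.01)
                for i in range(n):
                    if rng.random() < pm:
                        child[i] ^= 1  # flip "materialize this view" bit
                children.append(child)
        pop = children[:pop_size]
        best = min(pop + [best], key=total_cost)
    return best, total_cost(best)
```

Calling `evolve()` returns a bit vector (one bit per candidate view) and its total cost. Good individuals receive lower crossover and mutation probabilities and are therefore preserved, while weak ones are disrupted more often, which is one way to trade off convergence speed against premature convergence.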

【License】

Unknown   
