期刊论文详细信息
IEEE Access
A Parallel Genetic Algorithm With Dispersion Correction for HW/SW Partitioning on Multi-Core CPU and Many-Core GPU
Yi Zhou1  Yilin Chen2  Fazhi He2  Neng Hou2  Xiaohu Yan2 
[1] Engineering Research Center of Metallurgical Automation and Measurement Technology, School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, China;State Key Laboratory of Software Engineering, School of Computer Science, Wuhan University, Wuhan, China;
关键词: Hardware/software co-design;    heuristic method;    genetic algorithm;    multi-core CPU;    many-core GPU;   
DOI  :  10.1109/ACCESS.2017.2776295
来源: DOAJ
【 摘 要 】

In hardware/software (HW/SW) co-design, hardware/software partitioning is an essential step in that it determines which components to be implemented in hardware and which ones in software. Most of HW/SW partitioning problems are NP hard. For large-size problems, heuristic methods have to be utilized. This paper presents a parallel genetic algorithm with dispersion correction for HW/SW partitioning on CPU-GPU. First, an enhanced genetic algorithm with dispersion correction is presented. The underconstraint individuals are marched to feasible region step by step. In this way, the intensification can be enhanced as well as the constraint problem can be handled. Second, the individuals performing costs computation and dispersion correction are run in parallel. For a given problem size, the overall run-time can be reduced while the diversity of genetic algorithm can be kept. Third, especially when a number of under-constraint individuals should be corrected in an irregular way, the computation process is complicated and the computation overhead is large. Therefore, we present a novel parallel strategy by leveraging the parallel power of a multi-core CPU and that of a many-core GPU. The proposed strategy computes the costs of each individual in parallel on GPU and corrects the under-constraint individuals in parallel on the multi-core CPU. In this way, a highly efficient parallel computing can be achieved in which dozens of irregular correction computing steps are mapped to the multi-core CPU and thousands of regular cost computing steps are mapped to the many-core GPU. Fourth, at each iteration of the hybrid parallel strategy, the solution vectors of individuals are transferred to the GPU and their costs are transferred back to the CPU. In order to further improve the efficiency of proposed algorithm, we propose an asynchronous transfer pattern (stream concurrency pattern) for CPU-GPU, in which the transfer process and computation process are overlapped and eventually the overall run-time can be reduced further. Finally, the experiments show that the solution quality obtained by our method is competitive with existing heuristic methods in reasonable time. Furthermore, by combining with the multi-core CPU and many-core GPU, the running time of the proposed method is efficiently reduced.

【 授权许可】

Unknown   

  文献评价指标  
  下载次数:0次 浏览次数:0次