Reading and writing big data is increasingly becoming a major bottleneck in the use of high-performance computing (HPC) systems as we head towards the exascale era. An unprecedented amount of data is produced every day by different sources, while the computational power of HPC systems is scaling to hundreds of thousands of cores. For an application to exploit this much data and computational power, however, effective I/O is a must. One field that deals with huge amounts of data is geographic information science (GIScience). In this thesis, we implement a parallel I/O library specialized for spatial data analysis in GIScience, capable of handling different I/O patterns such as row-wise, column-wise, and block-wise I/O. We then establish an auto-tuning framework for finding optimal parallel I/O configurations. This framework is based on a genetic algorithm and works on a range of configurations, from the parallel file system all the way up to spatial data-analysis applications. The results and findings of a set of I/O-intensive experiments executed on large HPC systems are also presented to demonstrate the effectiveness of the framework.
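To make the three I/O patterns named above concrete, the following is a minimal sketch of how a two-dimensional raster might be divided among processes under row-wise, column-wise, and block-wise access. It is not the thesis library's API: the function names `split` and `decompose`, the near-square process grid used for the block case, and the (row_start, row_stop, col_start, col_stop) tuple layout are all assumptions made for illustration.

```python
import math


def split(n, parts):
    """Split range(n) into `parts` contiguous, near-equal (start, stop) pairs."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` parts receive one additional element.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def decompose(rows, cols, nprocs, pattern):
    """Return one (row_start, row_stop, col_start, col_stop) region per process.

    Hypothetical helper: `pattern` is "row", "column", or "block".
    """
    if pattern == "row":
        # Each process reads a band of complete rows.
        return [(r0, r1, 0, cols) for r0, r1 in split(rows, nprocs)]
    if pattern == "column":
        # Each process reads a band of complete columns.
        return [(0, rows, c0, c1) for c0, c1 in split(cols, nprocs)]
    if pattern == "block":
        # Assumption: arrange processes in the most nearly square grid
        # pr x pc with pr * pc == nprocs.
        pr = max(d for d in range(1, math.isqrt(nprocs) + 1) if nprocs % d == 0)
        pc = nprocs // pr
        return [
            (r0, r1, c0, c1)
            for r0, r1 in split(rows, pr)
            for c0, c1 in split(cols, pc)
        ]
    raise ValueError(f"unknown pattern: {pattern!r}")


if __name__ == "__main__":
    for pattern in ("row", "column", "block"):
        print(pattern, decompose(8, 6, 4, pattern))
```

In a row-major file, row-wise regions map to contiguous byte ranges, whereas column-wise and block-wise regions are strided, which is one reason the best I/O configuration can differ from pattern to pattern.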
Auto-tuned optimized parallel I/O for GIScience and spatial applications