Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems | |
Choudhary, Alok1  | |
[1] Northwestern Univ., Evanston, IL (United States) | |
关键词: parallel I/O; GPU; FPGA; High-performance computing; | |
DOI : 10.2172/1172904 RP-ID : DOE-NWU--25848 PID : OSTI ID: 1172904 |
|
学科分类:数学(综合) | |
美国|英语 | |
来源: SciTech Connect | |
【 摘 要 】
Computational scientists must understand results from experimental, observational and computational simulation generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive I/O, storage, acceleration of data manipulation, analysis, and mining tools. Scientists require techniques, tools and infrastructure to facilitate better understanding of their data, in particular the ability to effectively perform complex data analysis, statistical analysis and knowledge discovery. The goal of this work is to enable more effective analysis of scientific datasets through the integration of enhancements in the I/O stack, from active storage support at the file system layer to MPI-IO and high-level I/O library layers. We propose to provide software components to accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing productivity of both scientists and the systems. Our approaches include 1) design the interfaces in high-level I/O libraries, such as parallel netCDF, for applications to activate data mining operations at the lower I/O layers; 2) Enhance MPI-IO runtime systems to incorporate the functionality developed as a part of the runtime system design; 3) Develop parallel data mining programs as part of runtime library for server-side file system in PVFS file system; and 4) Prototype an active storage cluster, which will utilize multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
672KB | download |