Technical Report Details
Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems
Choudhary, Alok [1]
[1] Northwestern Univ., Evanston, IL (United States)
Keywords: parallel I/O; GPU; FPGA; High-performance computing
DOI: 10.2172/1172904
RP-ID: DOE-NWU--25848
PID: OSTI ID: 1172904
Subject classification: Mathematics (general)
United States | English
Source: SciTech Connect
Abstract
Computational scientists must understand results from experimental, observational, and simulation-generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive tools for I/O, storage, and accelerated data manipulation, analysis, and mining. Scientists require techniques, tools, and infrastructure that facilitate a better understanding of their data, in particular the ability to perform complex data analysis, statistical analysis, and knowledge discovery effectively. The goal of this work is to enable more effective analysis of scientific datasets by integrating enhancements across the I/O stack, from active storage support at the file system layer to the MPI-IO and high-level I/O library layers. We propose to provide software components that accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing the productivity of both scientists and systems. Our approaches include 1) designing interfaces in high-level I/O libraries, such as Parallel netCDF, that let applications activate data mining operations at the lower I/O layers; 2) enhancing MPI-IO runtime systems to incorporate the functionality developed as part of the runtime system design; 3) developing parallel data mining programs as part of the runtime library for the server side of the PVFS file system; and 4) prototyping an active storage cluster that will use multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.
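To illustrate the general idea of active storage referred to in the abstract, the following is a minimal sketch in Python, not the report's implementation. It contrasts a conventional read, where every value is shipped to the client, with a server-side kernel that returns only a partial result per storage node. All names here (`StorageNode`, `KERNELS`, `client_side`, `active_storage`) are hypothetical and do not correspond to any PVFS, MPI-IO, or Parallel netCDF interface; the sketch also assumes associative kernels and non-empty stripes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical kernels a storage node could run next to the data.
# Each is associative, so per-node partial results can be combined
# with the same operation on the client.
KERNELS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": lambda xs: float(sum(xs)),
    "min": lambda xs: float(min(xs)),
    "max": lambda xs: float(max(xs)),
}


@dataclass
class StorageNode:
    """Toy stand-in for one storage server holding a stripe of a dataset."""
    stripe: List[float]
    values_sent: int = 0  # number of values shipped to the client so far

    def read(self) -> List[float]:
        """Conventional path: ship the whole stripe to the client."""
        self.values_sent += len(self.stripe)
        return list(self.stripe)

    def execute(self, kernel: str) -> float:
        """Active-storage path: run the kernel locally, ship one value."""
        result = KERNELS[kernel](self.stripe)
        self.values_sent += 1
        return result


def client_side(nodes: List[StorageNode], kernel: str) -> float:
    """Pull all data to the client, then apply the kernel there."""
    data = [x for node in nodes for x in node.read()]
    return KERNELS[kernel](data)


def active_storage(nodes: List[StorageNode], kernel: str) -> float:
    """Ask each node for a partial result, then combine the partials."""
    partials = [node.execute(kernel) for node in nodes]
    return KERNELS[kernel](partials)
```

Both paths return the same answer for these kernels; the difference is the number of values that cross the (simulated) network, which each node tracks in `values_sent`. A real system would additionally have to handle kernels that are not associative, data layout, and scheduling on the storage servers, none of which this sketch attempts.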