Technical Report Details
Accelerated Simulation of Air Pollution Using NVIDIA RAPIDS
Keller, Christoph A ; Clune, Thomas L ; Thompson, Matthew A ; Stroud, Matthew A ; Evans, Mat J ; Ronaghi, Zahra
Keywords: ATMOSPHERIC CHEMISTRY; COMPUTERIZED SIMULATION; DECISION THEORY; EARTH OBSERVING SYSTEM (EOS); GEOS SATELLITES (ESA); GLOBAL AIR POLLUTION; MACHINE LEARNING; MASSIVELY PARALLEL PROCESSORS; REACTION KINETICS; SUPERCOMPUTERS; VEGETATION
RP-ID: GSFC-E-DAA-TN75239
United States | English
Source: NASA Technical Reports Server
【 Abstract 】

Atmospheric chemistry models are a central tool to study and forecast the impact of air pollution on the environment, vegetation, and human health. However, the numerical simulation of chemical kinetics is computationally expensive due to the stiffness of the system of ordinary differential equations that describes atmospheric chemistry. Here we present an alternative approach to the computation of atmospheric chemistry based on machine learning. Our training data set is produced using the NASA Goddard Earth Observing System (GEOS) model with GEOS-Chem chemistry, run on the NASA Center for Climate Simulation (NCCS) Discover supercomputing cluster on 384 Intel Xeon Haswell cores. This model spends more than 50% of total run time on solving atmospheric chemistry. The data set contains as input features the air pollution concentrations before solving the differential equations, together with some key physical parameters such as temperature and sun intensity. As target variables we define the air pollution concentrations after solving the differential equations. Using Dask-cuDF and Dask-XGBoost on the NVIDIA RAPIDS platform on 8 Tesla V100 GPUs, we generate from this training set gradient boosted decision tree models that can reproduce the simulation of chemical kinetics. We do this on the NCCS Advanced Data Analytics Platform (ADAPT) science cloud environment. Our application takes full advantage of recent advances in Dask-XGBoost, such as multi-node and multi-GPU scaling for distributed training with large data sets. The resulting increase in training data size is critical to capture the full range of chemical environments encountered across the globe and in all seasons. The boosted tree models offer good predictability and show many of the features of the full chemistry reference simulation. Further improvements can be achieved through mass balance considerations and by accounting for error correlations.
We incorporate the boosted tree models into the GEOS reference model using XGBoost's C API. This enables a seamless integration of the GPU-trained models into GEOS-Chem, which is written in Fortran and optimized for use in a massively parallel CPU environment. We show the benefits of this approach and discuss the potential speedup of this machine-learning-accelerated atmospheric chemistry model.
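
The emulation idea described in the abstract (learn the mapping from concentrations before the chemistry solve to concentrations after it) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' setup: a single species with first-order loss, a made-up temperature-dependent rate constant, a backward Euler `reference_step` standing in for the full chemistry solver, and hand-written boosted decision stumps standing in for the XGBoost models. Working with log concentrations is also a choice of this sketch.

```python
import math
import random

def reference_step(conc, temp, dt=900.0, substeps=50):
    """Toy stand-in for the chemistry solver: dc/dt = -k(T) * c."""
    # Hypothetical loss rate in 1/s (not a GEOS-Chem value).
    k = 1e-3 * math.exp((temp - 288.0) / 20.0)
    h = dt / substeps
    for _ in range(substeps):
        conc = conc / (1.0 + k * h)  # backward (implicit) Euler update
    return conc

def mean(values):
    return sum(values) / len(values)

def rmse(pred, truth):
    return math.sqrt(mean([(p - t) ** 2 for p, t in zip(pred, truth)]))

def fit_stump(X, residuals, n_thresholds=15):
    """Best depth-1 regression tree under squared error, as a tuple."""
    best = None
    for j in range(len(X[0])):
        column = sorted(row[j] for row in X)
        for q in range(1, n_thresholds + 1):
            # Candidate thresholds are taken at evenly spaced quantiles.
            thr = column[q * (len(column) - 1) // (n_thresholds + 1)]
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lv, rv = mean(left), mean(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
    return best

def fit_boosted_stumps(X, y, rounds=80, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        best = fit_stump(X, residuals)
        if best is None:
            break
        _, j, thr, lv, rv = best
        lv, rv = learning_rate * lv, learning_rate * rv
        stumps.append((j, thr, lv, rv))
        pred = [p + (lv if row[j] <= thr else rv) for row, p in zip(X, pred)]
    return base, stumps

def predict(model, row):
    base, stumps = model
    return base + sum(lv if row[j] <= thr else rv for j, thr, lv, rv in stumps)

def make_samples(n, rng):
    """Features: log concentration before the step and temperature.
    Target: log concentration after the step."""
    X, y = [], []
    for _ in range(n):
        conc = rng.uniform(10.0, 100.0)
        temp = rng.uniform(260.0, 310.0)
        X.append([math.log(conc), temp])
        y.append(math.log(reference_step(conc, temp)))
    return X, y

rng = random.Random(0)
X_train, y_train = make_samples(300, rng)
X_test, y_test = make_samples(100, rng)

model = fit_boosted_stumps(X_train, y_train)
emulator_rmse = rmse([predict(model, row) for row in X_test], y_test)
baseline_rmse = rmse([mean(y_train)] * len(y_test), y_test)
print("emulator RMSE (log space):", emulator_rmse)
print("mean-only baseline RMSE:  ", baseline_rmse)
```

Comparing the emulator against a mean-only baseline on held-out samples is a minimal sanity check. A real application would use many species, many more features, and deeper trees, none of which this toy attempts to reproduce.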

【 Attachments 】
20190033152.pdf (21642 KB, PDF)