Technical report details
Processing public pulsar astronomy data in the Amazon Cloud
Astronomical and Space Sciences not elsewhere classified
Toomey, Lawrence ; Benn, David ; Chapman, Jessica ; Dai, Shi ; Dempsey, James ; Hobbs, George ; Russell, Chris ; Wang, Chen ; Wang, Jingbo ; Zic, John
CSIRO
DOI  :  10.4225/08/594eb6c84850b
RP-ID  :  EP172634
Subject classification: Earth sciences (general)
Australia | English
Source: CSIRO Research Publications Repository
【 Abstract 】
The primary goal of the Amazon Web Services (AWS) and Square Kilometre Array (SKA) Organisation "AstroCompute in the Cloud Grants Program" is to develop the skills and techniques needed to create, store, process, and manage extremely large (Terabyte-scale) data sets. In this report we analyse how pulsar data sets can be stored and processed within the Amazon Cloud framework. We have processed various types of pulsar data sets and considered use-cases ranging from the simple processing required by an individual to the analysis of a huge data volume as part of a project team. We find that AWS infrastructure is ideal for processing high volumes of pulsar astronomy data, but we also highlight some challenges that will need to be faced in the coming SKA era. These challenges include data transfer issues, reproducibility of science results, software licensing issues, costs, and the need for versatile processing packages that individual users can easily upgrade. AWS infrastructure provided all the functionality that we required, such as highly configurable compute and storage, together with ease of deployment.
However, we ran intensive machine-learning algorithms on Graphics Processing Units (GPUs) on CSIRO's High Performance Computing system purely because it was more cost effective. We also found that software using graphical user interfaces for manual control and processing performed worse on Amazon servers than on a computer on the local network. This was not a major problem, except for users in China, who found the network speeds prohibitive (note that China is a member of the SKA). Another issue with the current AWS system is that its policy prohibits the export or publication of virtual machine (VM) images amended with the AWS infrastructure. It is common for astronomers to start with existing software, develop that software, process their data, and then wish to publish their amended VM image. All the team members enjoyed using the AWS systems and would seriously consider using AWS for both small-scale and large-scale pulsar processing projects in the future. We thank the AWS/SKA team for supporting this project, and we hope that this report proves useful for future planning.
Attachments
Files Size Format
EP172634.pdf 2501KB PDF