Journal of Computer Science
FROM DATA MINING AND KNOWLEDGE DISCOVERY TO BIG DATA ANALYTICS AND KNOWLEDGE EXTRACTION FOR APPLICATIONS IN SCIENCE
Subana Shanmuganathan
Keywords: Unstructured Data; High Performance Computing; Data Science
DOI: 10.3844/jcssp.2014.2658.2665
Subject classification: Computer Science (General)
Source: Science Publications
【 Abstract 】
"Data mining" for "knowledge discovery in databases" and the associated computational operations, first introduced in the mid-1990s, can no longer cope with the analytical issues relating to so-called "big data". The recent buzzword big data refers to large volumes of diverse, dynamic, complex, longitudinal and/or distributed data, noisy and structured or unstructured, generated from instruments, sensors, Internet transactions, email, video, click streams and/or all other digital sources available today and in the future, at speeds and on scales never seen before in human history. Big data, also described using three Vs, volume, variety and velocity (with an additional fourth V for "veracity" and, more recently, a fifth V for "value"), requires a set of new technologies, such as high performance computing (i.e., exascale), architectures (distributed or grid), algorithms (for data clustering and generating association rules), programming languages, and automated and scalable software tools, to uncover hidden patterns, unknown correlations and other useful information, lately referred to as "actionable knowledge" or "data products", from massive volumes of complex raw data. In view of the above, the paper gives an introduction to the synergistic challenges in "data-intensive" science and "exascale" computing for resolving "big data analytics" and "data science" issues in four main disciplines, namely computer science, computational science, statistics and mathematics. It outlines the basic problems that need to be addressed adequately in the respective disciplines to realise the vital foundational aspects identified for an effective cyber infrastructure. Finally, the paper looks at five scientific research projects that are urgently in need of high performance computing; this is in contrast to earlier situations, where private business enterprises were the drivers of better and faster modern technologies.
【 License 】
Unknown