Biology Direct
Bioinformatics clouds for big data manipulation
Lin Dai [2], Xin Gao [1], Yan Guo [4], Jingfa Xiao [3], Zhang Zhang [3]
[1] Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
[2] School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
[3] CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No.7 Beitucheng West Road, Building G, Chaoyang District, Beijing, 100029, China
[4] Cloud Development and Cloud Solution Integration, IBM China Systems & Technology Lab, IBM Co. Ltd, Beijing, 100193, China
Keywords: Data analysis; Data storage; Big data; Bioinformatics; Cloud computing
DOI: 10.1186/1745-6150-7-43
Received 2012-09-19; accepted 2012-11-26; published 2012
【 Abstract 】
Because bioinformatics is interdisciplinary, advances in both the life sciences and information technology influence it profoundly. To handle the vast quantities of biological data generated by high-throughput experimental technologies, the field is now moving from in-house computing infrastructure to utility-supplied cloud computing delivered over the Internet. Although relatively new, cloud computing promises to address big data storage and analysis issues in bioinformatics. Here we review existing cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.
【 License 】
2012 Dai et al.; licensee BioMed Central Ltd.