Journal Article Details
Biology Direct
Bioinformatics clouds for big data manipulation
Lin Dai [2], Xin Gao [1], Yan Guo [4], Jingfa Xiao [3], Zhang Zhang [3]
[1] Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
[2] School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
[3] CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No.7 Beitucheng West Road, Building G, Chaoyang District, Beijing, 100029, China
[4] Cloud Development and Cloud Solution Integration, IBM China Systems & Technology Lab, IBM Co. Ltd, Beijing, 100193, China
Keywords: Data analysis; Data storage; Big data; Bioinformatics; Cloud computing
DOI: 10.1186/1745-6150-7-43
Received 2012-09-19; accepted 2012-11-26; published 2012
【 Abstract 】

Advances in life sciences and information technology are profoundly influencing bioinformatics because of its interdisciplinary nature. To handle the vast quantities of biological data generated by high-throughput experimental technologies, bioinformatics is moving from in-house computing infrastructure to utility-supplied cloud computing delivered over the Internet. Although relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.

【 License 】

   
© 2012 Dai et al.; licensee BioMed Central Ltd.
