BMC Bioinformatics
search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information
Dariusz Mrozek [2], Bożena Małysiak-Mrozek [2], Artur Siążnik [1]
[1] Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, 44-100, Poland
[2] IBM Competence Center, Silesian University of Technology, Akademicka 16, Gliwice, 44-100, Poland
Keywords: Choreography; Orchestration; Web services; Data querying; Data searching; Data exploration; Entrez programming utilities; Entrez search engine; Entrez databases; NCBI Entrez
DOI: 10.1186/1471-2105-14-73
Received 2012-08-01, accepted 2013-02-22, published 2013
【 Abstract 】

Background

The number of biomedical entries in the data repositories of the National Center for Biotechnology Information (NCBI) keeps growing, so third-party software developers cannot easily collect, manage and process all of these entries in one place without significant investment in hardware and software infrastructure and in its maintenance and administration. Web services make it possible to develop applications that integrate, in one place, the functionality and processing logic of distributed software components, without integrating the components themselves or the resources to which they have access. This is achieved by appropriate orchestration or choreography of the available Web services and their shared functions. Having been applied successfully in the business sector, this technology can now be used to build composite software tools oriented towards biomedical data processing.

Results

We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. The dedicated search GenBank system uses NCBI Web services and the Entrez Programming Utilities (eUtils) package to provide extended searching capabilities in NCBI data repositories. In search GenBank, users can choose one of three exploration paths: simple data searching based on a user-specified query, advanced data searching based on a user-specified query, and advanced data exploration with the use of macros. search GenBank orchestrates calls to the particular tools available through the NCBI Web service that provide the requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using the available links. By building macros in the advanced data exploration mode, on the other hand, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases.
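
The idea of chaining eUtils calls can be illustrated with a minimal sketch. It is not the search GenBank implementation: it uses the URL interface of eUtils (`esearch.fcgi`, `elink.fcgi`) rather than the SOAP Web service described here, and the helper names `esearch_url`, `elink_url` and `plan_macro`, as well as the notion of a macro as a list of database hops, are assumptions made for illustration only. The code only builds request URLs and a call plan; it sends nothing over the network.

```python
from urllib.parse import urlencode

# Public base URL of the Entrez Programming Utilities (URL interface).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch request: find record IDs in `db` that match `term`."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def elink_url(dbfrom, db, ids):
    """Build an ELink request: find records in `db` linked to `ids` in `dbfrom`."""
    params = {"dbfrom": dbfrom, "db": db, "id": ",".join(str(i) for i in ids)}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


def plan_macro(start_db, term, hops):
    """Hypothetical macro: one search followed by a chain of link steps.

    Returns (utility, description) pairs in the order a runner would issue
    the calls. No request is made here.
    """
    plan = [("esearch", f"{start_db}: {term}")]
    current = start_db
    for target in hops:
        plan.append(("elink", f"{current} -> {target}"))
        current = target
    return plan


if __name__ == "__main__":
    # Example: start from a gene query, then hop to proteins and structures.
    print(esearch_url("gene", "HNF1A", retmax=5))
    for step in plan_macro("gene", "HNF1A", ["protein", "structure"]):
        print(step)
```

In this sketch, an interactive session corresponds to issuing one call at a time and choosing the next link by hand, whereas a saved macro corresponds to the whole plan being replayed automatically, with the IDs returned by each step fed into the next.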

Conclusions

search GenBank extends the standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The ability to create and save macros in search GenBank is a unique feature with great potential, which should grow further as the networks of relationships between data stored in particular databases become denser. search GenBank is available for public use at http://sgb.biotools.pl/.

【 License 】

   
2013 Mrozek et al.; licensee BioMed Central Ltd.
