BMC Bioinformatics
The taxonomic name resolution service: an online tool for automated standardization of plant names
Software
Dmitry Mozzherin1, Robert K Peet2, Brian J Enquist3, Brad Boyle4, Naim Matasci5, Tony Rees6, Chris Freeland7, Juan Antonio Raygoza Garay8, Nicole Hopkins8, Martha L Narro8, Sonya Lowry8, Sheldon J McKay9, Zhenyuan Lu1,10, William H Piel1,11
Affiliations: Center for Library and Informatics, Marine Biological Laboratory, 7 MBL Street, 02543, Woods Hole, MA, USA; Department of Biology, CB 3280, University of North Carolina, 27599-3280, Chapel Hill, NC, USA; Department of Ecology and Evolutionary Biology, University of Arizona Tucson, P.O. Box 210088, 85721, Tucson, AZ, USA; The Santa Fe Institute, 1399 Hyde Park Road, 87501, Santa Fe, NM, USA; The iPlant Collaborative, Thomas W. Keating Bioresearch Building, 1657 East Helen Street, 85721, Tucson, AZ, USA; BIO5 Institute, PO Box 210240, 1657 East Helen Street, 85721-0240, Tucson, AZ, USA; Divisional Data Centre, CSIRO Marine and Atmospheric Research, GPO Box 1538, 7001, Hobart, Tasmania, Australia; Missouri Botanical Garden, 4344 Shaw Blvd., 63110, St. Louis, MO, USA; Cold Spring Harbor Laboratory, 1 Bungtown Road, 11724-2202, Cold Spring Harbor, NY, USA; Yale-NUS College, 6 College Avenue East, 138614, Singapore, Singapore
Keywords: Biodiversity informatics; Database integration; Taxonomy; Plants
DOI: 10.1186/1471-2105-14-16
Received 2012-09-25; accepted 2013-01-02; published 2013
Source: Springer
【 Abstract 】
Background: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.

Results: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.

Conclusions: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface.
Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
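
To make the two core steps in the abstract concrete (fuzzy matching of a misspelled name, then conversion of a synonym to its accepted name), the following is a minimal sketch only. It does not use the TNRS code or its web service: the `REFERENCE` dictionary, the `resolve` function and the `cutoff` value are all hypothetical, and Python's standard `difflib` stands in for whatever fuzzy-matching algorithm the TNRS actually applies.

```python
from difflib import get_close_matches

# Hypothetical miniature reference taxonomy (an assumption for illustration):
# each known name maps to the name treated as accepted.
REFERENCE = {
    "Quercus alba": "Quercus alba",
    "Pinus ponderosa": "Pinus ponderosa",
    "Acer saccharum": "Acer saccharum",
    "Acer saccharophorum": "Acer saccharum",  # synonym -> accepted name
}


def resolve(name, cutoff=0.8):
    """Return (matched_name, accepted_name), or (None, None) if no match.

    `cutoff` is an assumed similarity threshold between 0 and 1.
    """
    # Normalize whitespace and capitalization: genus capitalized, rest lowercase.
    cleaned = " ".join(name.split()).capitalize()
    # Fuzzy-match against every known name and keep only the single best hit.
    matches = get_close_matches(cleaned, list(REFERENCE), n=1, cutoff=cutoff)
    if not matches:
        return None, None
    matched = matches[0]
    # Follow the synonym link to the accepted name.
    return matched, REFERENCE[matched]


if __name__ == "__main__":
    for raw in ["Quercus albba", "acer  saccharophorum", "Zzzz yyyy"]:
        print(raw, "->", resolve(raw))
```

A real resolver would also need to parse authorities, annotations and infraspecific ranks before matching, and could use a supplied family name to choose between homonyms; none of that is attempted here.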
【 License 】
CC BY
© Boyle et al.; licensee BioMed Central Ltd. 2013