Technical Report

【Abstract】

The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources, e.g., BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, so the available data is not being used to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to the Web sources being described. We then show how a service class description can be used to classify an arbitrary Web source, that is, to determine whether that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that correctly classifies approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.
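
The classification idea in the abstract can be illustrated with a minimal sketch. It is not the report's description format or prototype: the `ServiceClassDescription` class, its fields, the `is_instance` function, and the use of regular expressions as "important features" are all hypothetical simplifications, assumed here only to show how one description might be checked against an arbitrary source.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceClassDescription:
    """Hypothetical, simplified stand-in for a service class description."""
    name: str
    # Patterns the query page of a candidate source is expected to contain.
    input_patterns: List[str] = field(default_factory=list)
    # Patterns the result page of a candidate source is expected to contain.
    output_patterns: List[str] = field(default_factory=list)


def matches_all(patterns: List[str], text: str) -> bool:
    """Return True if every pattern is found somewhere in the text."""
    return all(re.search(p, text, re.IGNORECASE) for p in patterns)


def is_instance(desc: ServiceClassDescription, form_html: str, result_text: str) -> bool:
    """Classify a source: it counts as an instance of the described class
    only if both its query page and its result page match the description."""
    return (matches_all(desc.input_patterns, form_html)
            and matches_all(desc.output_patterns, result_text))


# An assumed, deliberately loose description of a BLAST-like service.
blast = ServiceClassDescription(
    name="BLAST",
    input_patterns=[r"<form", r"<textarea"],
    output_patterns=[r"Score", r"E[- ]?value|Expect"],
)
```

The description names no particular site, so the same `blast` object can be tested against any candidate source whose query page and sample result page have been fetched.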

Attachments
File	Size	Format
DE200415006274.pdf	140 KB	PDF


Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources.

Rocco, D. ; Critchlow, T.
Technical Information Center, Oak Ridge, Tennessee
Keywords: Bioinformatics; Data systems; Computer programs; Genome library; World Wide Web
RP-ID: DE200415006274
Subject classification: Engineering and technology (general)
United States | English
Source: National Technical Reports Library


