Journal Article Details
Computer Science and Information Systems
Research on Discovering Deep Web Entries
Ying Wang1  Wanli Zuo2  Fengling He3  Huilai Li4 
[1] College of Computer Science and Technology, Jilin University; College of Mathematics, Jilin University; College of Software, Changchun Institute of Technology; Key Laboratory of Computation and Knowledge Engineering
Keywords: Deep Web; ontology; WPC; FSC; FCC
DOI: 10.2298/CSIS100322028W
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Computer Science and Information Systems
【 Abstract 】

Ontology plays an important role in locating domain-specific Deep Web content. This paper therefore presents WFF, a novel framework for efficiently locating domain-specific Deep Web databases based on focused crawling and ontology. WFF arranges three classifiers hierarchically: a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC). First, WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Then, FSC analyzes these pages and determines, from structural characteristics, whether they contain searchable forms. Finally, FCC identifies the searchable forms that belong to a given domain at the semantic level and stores the URLs of these domain-specific searchable forms in a database. A detailed experimental evaluation shows that the WFF framework not only simplifies the discovery process but also effectively identifies domain-specific databases.
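
The three-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the ontology is reduced to a flat term set, the crawler is omitted (pages are passed in as strings), and the keyword rules, structural rules, thresholds, and all names (`BOOK_TERMS`, `FormExtractor`, `wpc`, `fsc`, `fcc`, `discover`) are assumptions made purely for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in for a domain ontology: a flat set of terms.
BOOK_TERMS = {"book", "author", "title", "isbn", "publisher"}


def tokens(text):
    """Lowercase alphabetic words found in a string."""
    return re.findall(r"[a-z]+", text.lower())


class FormExtractor(HTMLParser):
    """Collects the page text and, per <form>, its field types and words."""

    def __init__(self):
        super().__init__()
        self.text = []        # all text on the page
        self.forms = []       # one dict per form
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"types": [], "words": []}
            self.forms.append(self._current)
        elif self._current is not None and tag in ("input", "select", "textarea"):
            default = "text" if tag == "input" else tag
            self._current["types"].append((attrs.get("type") or default).lower())
            if attrs.get("name"):
                self._current["words"].extend(tokens(attrs["name"]))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None

    def handle_data(self, data):
        self.text.append(data)
        if self._current is not None:
            self._current["words"].extend(tokens(data))


def wpc(page_text, terms, min_hits=2):
    """Stage 1 (page level): keep pages that mention enough domain terms."""
    return len(set(tokens(page_text)) & terms) >= min_hits


def fsc(form):
    """Stage 2 (structure): assume a form is searchable if it has a text-like
    field and no password field (a password field suggests a login form)."""
    types = form["types"]
    has_query_field = any(t in ("text", "search", "select") for t in types)
    return has_query_field and "password" not in types


def fcc(form, terms, min_hits=1):
    """Stage 3 (content): keep forms whose labels or field names use domain terms."""
    return len(set(form["words"]) & terms) >= min_hits


def discover(pages, terms):
    """Run WPC, then FSC, then FCC; return URLs of pages with an in-domain form."""
    stored = []
    for url, html in pages.items():
        parser = FormExtractor()
        parser.feed(html)
        if not wpc(" ".join(parser.text), terms):
            continue  # page rejected before any form is inspected
        if any(fsc(f) and fcc(f, terms) for f in parser.forms):
            stored.append(url)
    return stored
```

Because each stage runs only on what the previous stage accepted, cheap page-level filtering discards most pages before any form is parsed semantically; this ordering is the point of the hierarchy, whatever classifiers are actually used at each level.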

【 License 】

CC BY-NC-ND   
