IEEE Access
QA4IE: A Question Answering Based System for Document-Level General Information Extraction
Quanyu Long [1], Lin Qiu [1], Weinan Zhang [1], Dongyu Ru [1], Yong Yu [1]
[1] APEX Data & Knowledge Management Lab, Shanghai Jiao Tong University, Shanghai, China
Keywords: Knowledge acquisition; machine learning; natural language processing; neural networks
DOI: 10.1109/ACCESS.2020.2970119
Source: DOAJ
【Abstract】
Information Extraction (IE) is the task of distilling structured information from unstructured text by identifying references to named entities and the relationships between those entities. Existing IE solutions, including Relation Extraction and Open IE, struggle to take cross-sentence information such as coreference into account, and are severely restricted by limited relation types and by informal relation specifications (e.g., free-text relation triples). To overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages flexible question answering approaches to produce high-quality relation triples across sentences. Based on this framework, we develop a real-time IE system that can perform general IE over an entire document. To train and evaluate our system, we build a large-scale IE benchmark using distant supervision under human evaluation. We conduct both component analyses and pipeline experiments to evaluate the system. The results show that it generalizes to unseen entities and relations and achieves significant improvements over existing IE systems.
【License】
Unknown