This thesis explores information extraction (IE) in low-resource conditions, in which high-quality human annotations are too scarce to fit statistical machine learning models. Such conditions increasingly arise in domains where annotations are expensive to obtain, such as biomedicine, or in domains that change rapidly, such as social media, where annotations quickly become outdated. It is therefore crucial to leverage as many learning signals, and as much human knowledge, as possible to mitigate the problem of inadequate supervision.

In this thesis, we focus on two typical IE tasks: named entity recognition (NER) and entity relation extraction (RE). We explore two directions to help information extraction with limited supervision: (1) learning representations and knowledge from heterogeneous sources using deep neural networks and transferring the learned knowledge; and (2) incorporating structural knowledge into the design of the models to learn robust representations and make holistic decisions.

Specifically, for NER we explore transfer learning, including multi-task learning, domain adaptation, and multi-task domain adaptation, in the context of neural representation learning, so that knowledge learned from related tasks and domains can be transferred to the problem of interest. For entity relation extraction and joint entity recognition and relation extraction, we explore incorporating linguistic structure and domain knowledge into the design of the models, yielding more robust systems with less supervision.
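To make the shared-representation idea behind multi-task transfer concrete, the following is a minimal, hypothetical sketch, not the thesis's actual architecture: a shared BiLSTM encoder whose parameters receive gradients from every task, with a lightweight tagging head per task or domain. The class name, dimensions, and task names ("news_ner", "biomed_ner") are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    """Illustrative multi-task sequence tagger with a shared encoder."""
    def __init__(self, vocab_size, emb_dim, hidden_dim, label_sizes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared bidirectional LSTM encoder: updated by gradients from
        # every task, which is where the knowledge transfer happens.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # One task-specific classification head per task/domain.
        self.heads = nn.ModuleDict({
            task: nn.Linear(2 * hidden_dim, n_labels)
            for task, n_labels in label_sizes.items()
        })

    def forward(self, token_ids, task):
        states, _ = self.encoder(self.embed(token_ids))
        return self.heads[task](states)  # per-token label scores

# Usage: alternate mini-batches from a high-resource and a low-resource
# task; the encoder is shared while each task keeps its own output layer.
model = MultiTaskTagger(vocab_size=10000, emb_dim=100, hidden_dim=128,
                        label_sizes={"news_ner": 9, "biomed_ner": 5})
scores = model(torch.randint(0, 10000, (2, 12)), task="biomed_ner")
print(scores.shape)  # (batch, seq_len, num_labels for that task)
```

In such a setup, annotations from the better-resourced task shape the shared encoder, which in turn supplies richer representations to the low-resource head.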