Journal Article Details
Applied Sciences
You Don’t Need Labeled Data for Open-Book Question Answering
Sia Gholami [1], Mehdi Noori [1]
[1] Amazon Web Services, San Francisco, CA 94111, USA
Keywords: AWS technical documentation; extractive language models; information retrieval systems; zero-shot open-book question answering
DOI: 10.3390/app12010111
Source: DOAJ
【 Abstract 】

Open-book question answering is a subset of question answering (QA) tasks in which the system aims to find answers in a given set of documents (the open book) together with common knowledge about a topic. This article proposes a solution for answering natural language questions from a corpus of Amazon Web Services (AWS) technical documents with no domain-specific labeled data (zero-shot). Each question has a yes–no–none answer and a text answer, which can be short (a few words) or long (a few sentences). We present a two-step retriever–extractor architecture in which a retriever finds the right documents and an extractor finds the answers in the retrieved documents. To test our solution, we introduce a new dataset for open-book QA based on real customer questions about AWS technical documentation. We conduct experiments on several information retrieval systems and extractive language models, attempting to find the yes–no–none answers and the text answers in the same pass. Our custom-built extractor model is created from a pretrained language model and fine-tuned on the Stanford Question Answering Dataset (SQuAD) and Natural Questions datasets. We achieve an F1 score of 42% and an exact match (EM) score of 39% end-to-end with no domain-specific training.
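
The two-step retriever–extractor flow and the EM and F1 scores named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the token-overlap retriever and sentence-picking extractor are toy stand-ins for the information retrieval systems and fine-tuned language models the paper evaluates, the function names are hypothetical, and the scoring follows the common SQuAD-style token-level definition, which may differ in detail from the authors' evaluation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, top_k=1):
    """Step 1 (toy retriever): rank documents by token overlap with the question."""
    q_counts = Counter(tokenize(question))
    scored = []
    for doc_id, text in documents.items():
        overlap = sum((q_counts & Counter(tokenize(text))).values())
        scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def extract(question, text):
    """Step 2 (toy extractor): return the sentence sharing the most tokens with the question."""
    q_tokens = set(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return max(sentences, key=lambda s: len(q_tokens & set(tokenize(s))), default="")


def exact_match(prediction, reference):
    """EM: 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(tokenize(prediction) == tokenize(reference))


def f1_score(prediction, reference):
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred, ref = tokenize(prediction), tokenize(reference)
    common = sum((Counter(pred) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented toy passages for illustration only; not quoted from AWS documentation.
docs = {
    "s3": "Amazon S3 stores objects in buckets. Bucket names must be globally unique.",
    "ec2": "Amazon EC2 provides virtual servers. Instances can be stopped and restarted.",
}
question = "Must bucket names be unique?"
best_doc = retrieve(question, docs)[0]
answer = extract(question, docs[best_doc])
print(best_doc, "|", answer, "|", f1_score(answer, "must be globally unique"))
```

Chaining the two functions mirrors the end-to-end setting: an extractor can only score well if the retriever first surfaces the right document, which is why the abstract reports F1 and EM for the whole pipeline rather than for the extractor alone.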

【 License 】

Unknown   
