Journal Article Details
ECTI Transactions on Computer and Information Technology
Machine Reading Comprehension Using Multi-Passage BERT with Dice Loss on Thai Corpus
article
Theerit Lapchaicharoenkit [1], Peerapon Vateekeul [1]
[1] Chulalongkorn University
Keywords: Machine Reading Comprehension; natural language processing; Deep Learning; BERT
DOI: 10.37936/ecti-cit.2022162.247799
Subject classification: Medicine (General)
Source: Electrical Engineering/Electronics, Computer, Communications and Information Technology Association
【 Abstract 】

Machine reading comprehension (MRC) has advanced considerably with the invention of large-scale pre-trained language models such as BERT. However, performance is still limited when the context is long and contains many passages. BERT can only embed a portion of the passage equal to its input size; thus, sliding windows must be used, which leads to discontinuous information when the passage is long. In this paper, we propose a BERT-based MRC framework tailored for long-passage contexts in a Thai corpus. Our framework employs multi-passage BERT along with self-adjusting dice loss, which helps the model focus more on the answer region of the context passage. We also show that performance improves when an auxiliary task is used. The experiment was conducted on the Thai Question Answering (QA) dataset used in the Thailand National Software Competition. The results show that our method improves the model's performance over a traditional BERT framework.
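
The sliding-window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_windows` and the parameters `max_len` and `stride` are hypothetical, and a real system would operate on subword token IDs from a BERT tokenizer rather than on a plain Python list. The sketch only shows how a long passage is cut into overlapping chunks that each fit a fixed input size, which is why an answer span can end up split across chunks.

```python
def sliding_windows(tokens, max_len=384, stride=128):
    """Split a token list into overlapping windows of at most max_len tokens.

    Hypothetical helper for illustration only. Each window starts `stride`
    tokens after the previous one, so consecutive windows overlap by
    max_len - stride tokens whenever stride < max_len.
    Returns a list of (start_offset, window_tokens) pairs.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")

    windows = []
    start = 0
    while True:
        end = min(start + max_len, len(tokens))
        windows.append((start, tokens[start:end]))
        if end == len(tokens):
            # The last window reaches the end of the passage.
            break
        start += stride
    return windows


# Usage: a 10-token passage, windows of 4 tokens, stride of 2.
passage = list(range(10))
for offset, window in sliding_windows(passage, max_len=4, stride=2):
    print(offset, window)
# Window start offsets are 0, 2, 4, 6.
```

Each window is encoded independently, so no single window sees the whole passage; this is the discontinuity that the multi-passage approach described in the abstract aims to address.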

【 License 】

CC BY-NC-ND   
