| International Journal of Computers Communications & Control | |
| Evaluation of Language Models on Romanian XQuAD and RoITD datasets | |
| article | |
| Constantin Dragos Nicolae [1]; Rohan Kumar Yadav [2]; Dan Tufiş [1] | |
| [1] Research Institute for Artificial Intelligence;Oslo | |
| Keywords: NLP; Question Answering; RoBert; RoGPT; DistilBert; Transformer | |
| DOI : 10.15837/ijccc.2023.1.5111 | |
| Subject classification: Automation Engineering | |
| Source: Universitatea Agora | |
【 Abstract 】
Natural language processing (NLP) has become a vital component of a wide range of applications, including machine translation, information retrieval, and text classification. The development and evaluation of NLP models for various languages have received significant attention in recent years, but relatively little work has compared the performance of different language models on Romanian data. In particular, Romanian-specific language models have rarely been evaluated side by side with multilingual models. In this paper, we address this gap by evaluating eight NLP models on two Romanian datasets, XQuAD and RoITD. Our experiments show that bert-base-multilingual-cased and bert-base-multilingual-uncased perform best on both XQuAD and RoITD, while the RoBERT-small and DistilBERT models perform worst. We also discuss the implications of our findings and outline directions for future work in this area.
【 License 】
CC BY-NC
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| RO202307150001103ZK.pdf | 2341KB | PDF | |