| IEEE Access | Volume: 10 |
| RESHAPE: Reverse-Edited Synthetic Hypotheses for Automatic Post-Editing | |
| Baikjin Jung [1], Jong-Hyeok Lee [1], Wonkee Lee [1], Jaehun Shin [1] | |
| [1] Department of Computer Science and Engineering, POSTECH, Pohang, Republic of Korea | |
| Keywords: Automatic post-editing; back-translation; decoding strategy; machine translation; synthetic data generation | |
| DOI : 10.1109/ACCESS.2022.3154768 | |
| Source: DOAJ | |
[Abstract]
Synthetic training data has been used extensively to train Automatic Post-Editing (APE) models in many recent studies because human-created data is considered too scarce. However, the most widely used synthetic APE dataset, eSCAPE, does not respect the minimal editing property of genuine data, and this defect may have limited the performance of APE models. This article proposes adapting back-translation to APE to constrain edit distance, while using stochastic sampling in decoding to maintain the diversity of outputs, to create a new synthetic APE dataset,
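
The minimal editing property mentioned in the abstract can be made concrete with a word-level edit distance between a machine-translation hypothesis and its post-edited version. The following is a minimal sketch, not the authors' implementation: the function names `word_edit_distance` and `edit_rate` are hypothetical, tokenization is assumed to be plain whitespace splitting, and the rate is a simplified stand-in for metrics such as TER (it counts insertions, deletions, and substitutions only, with no shift operation).

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute).

    Assumption: tokens are obtained by whitespace splitting.
    """
    h, r = hyp.split(), ref.split()
    # prev[j] holds the distance between the first i-1 words of h
    # and the first j words of r.
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, start=1):
        curr = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete hw
                            curr[j - 1] + 1,      # insert rw
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hyp: str, ref: str) -> float:
    """Edit distance normalized by reference length (simplified, no shifts)."""
    ref_len = len(ref.split())
    return word_edit_distance(hyp, ref) / max(ref_len, 1)
```

Under this sketch, a synthetic pair whose hypothesis differs from its post-edit by only a few words yields a low edit rate, which is the behavior one would expect from genuine post-editing data; a pair built from two independently produced translations would typically score much higher.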
[License]
Unknown