SEPLN'09 Workshop on Uncovering Plagiarism, Authorship and Social Software Misuse
Intrinsic Plagiarism Detection Using Character n-gram Profiles
Efstathios Stamatatos
Others: http://CEUR-WS.org/Vol-502/paper8.pdf PID: 2025
Source: CEUR
【 Abstract 】
The task of intrinsic plagiarism detection deals with cases where no reference corpus is available and it is exclusively based on stylistic changes or inconsistencies within a given document. In this paper a new method is presented that attempts to quantify the style variation within a document using character n-gram profiles and a style change function based on an appropriate dissimilarity measure originally proposed for author identification. In addition, we propose a set of heuristic rules that attempt to detect plagiarism-free documents and plagiarized passages, as well as to reduce the effect of irrelevant style changes within a document. The proposed approach is evaluated on the recently available corpus of the 1st Int. Competition on Plagiarism Detection with promising results.
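The abstract does not spell out the profile construction or the dissimilarity formula, so the following is only a rough sketch of the general idea: slide a window over the document, build a character n-gram profile for each window, and compare it with the profile of the whole document. The function names, the n-gram length, the window and step sizes, and the particular normalized dissimilarity used here are all assumptions made for illustration, not values taken from this page.

```python
from __future__ import annotations

from collections import Counter


def ngram_profile(text: str, n: int = 3) -> dict[str, float]:
    """Relative frequencies of the character n-grams in `text` (n=3 is an assumption)."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def dissimilarity(window: dict[str, float], document: dict[str, float]) -> float:
    """Assumed normalized dissimilarity in [0, 1].

    Only the n-grams of the window profile are summed, so the measure is
    asymmetric: it asks how unusual the window looks relative to the document.
    """
    if not window:
        return 0.0
    total = 0.0
    for gram, f_win in window.items():
        f_doc = document.get(gram, 0.0)
        # f_win > 0 for every gram in the window, so the denominator is never zero.
        total += (2 * (f_win - f_doc) / (f_win + f_doc)) ** 2
    return total / (4 * len(window))


def style_change(text: str, n: int = 3, window: int = 1000, step: int = 200) -> list[float]:
    """One dissimilarity value per sliding window (window/step sizes are assumptions)."""
    doc_profile = ngram_profile(text, n)
    last_start = max(len(text) - window, 0)
    return [
        dissimilarity(ngram_profile(text[start:start + window], n), doc_profile)
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy document: a long uniform stretch followed by a shorter, different one.
    sample = "a" * 3000 + "b" * 1000
    values = style_change(sample)
    print(len(values), values[0] < values[-1])
```

In a sketch like this, windows whose values stand well above the rest of the document would be the candidates for further inspection. The heuristic rules mentioned in the abstract (for plagiarism-free documents and irrelevant style changes) are not modeled here.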
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Intrinsic Plagiarism Detection Using Character n-gram Profiles | 719KB | | download |