Journal Article Details
Social Sciences
Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”
Refat Aljumily [1]
[1] School of English Literature, Language and Linguistics, University of Newcastle, Newcastle upon Tyne, Tyne and Wear NE1 7RU, UK
Keywords: stylometry; text-length normalization; dimensionality-reduction; dendrogram; word bi-grams; character triple-grams; correlation matrix; centroid analysis; clustering tendency test; vector space
DOI: 10.3390/socsci4030758
Source: MDPI
【Abstract】

A few literary scholars have long claimed that Shakespeare did not write some of his best plays (history plays and tragedies) and have proposed, at one time or another, various alternative authorship candidates. Most modern-day Shakespeare scholars reject this claim, arguing that there is strong evidence that Shakespeare wrote the plays and poems, given that his name appears on them as the author. The result has been a long-running scholarly debate. Stylometry is a fast-growing field often used to attribute authorship to anonymous or disputed texts, and stylometric attempts to resolve this literary puzzle have raised interesting questions over the past few years. This paper contributes to “the Shakespeare authorship question” by using a mathematically based methodology to examine the hypothesis that Shakespeare wrote all the disputed plays traditionally attributed to him. More specifically, the methodology is based on Mean Proximity, as a linear hierarchical clustering method, and on Principal Components Analysis, as a non-hierarchical linear clustering method. It is also based, for the first time in the domain, on the Self-Organizing Map U-Matrix and Voronoi Map, as non-linear clustering methods, to cover the possibility that the data contain significant non-linearities. The Vector Space Model (VSM) is used to convert texts into vectors in a high-dimensional space. The aim is to compare the degrees of similarity within and between limited samples of text (the disputed plays) and the various works and plays assumed to have been written by Shakespeare and by possible alternative authors, notably Sir Francis Bacon, Christopher Marlowe, John Fletcher, and Thomas Kyd. Here “similarity” is defined in terms of a correlation/distance coefficient measure based on the frequency-of-usage profiles of function words, word bi-grams, and character triple-grams.
The claim that Shakespeare authored all the disputed plays traditionally attributed to him is falsified in favor of the alternative authors according to the stylistic criteria and analytic methodology used. The result of this validated analysis is empirically based and objective, and it provides replicable evidence that can be used in conjunction with existing arguments to resolve the question of whether or not Shakespeare of Stratford-upon-Avon wrote all the disputed plays traditionally attributed to him.
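
To make the vector-space comparison in the abstract concrete, the following is a minimal sketch, not the author's implementation. It assumes character triple-grams only (function words and word bi-grams would follow the same pattern), relative frequencies as a simple form of text-length normalization, `1 - Pearson correlation` as the distance, and average linkage as one plausible reading of "Mean Proximity". All function names and the toy texts are hypothetical placeholders, not material from the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def trigram_profile(text):
    """Relative frequencies of character trigrams (length-normalized)."""
    chars = " ".join(text.lower().split())  # lowercase, collapse whitespace
    grams = [chars[i:i + 3] for i in range(len(chars) - 2)]
    total = len(grams) or 1  # guard against texts shorter than 3 characters
    return {g: c / total for g, c in Counter(grams).items()}


def to_vectors(profiles):
    """Place all profiles in one shared space, one axis per observed trigram."""
    vocab = sorted(set().union(*profiles))
    return [[p.get(g, 0.0) for g in vocab] for p in profiles]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(names, vectors):
    """Agglomerative clustering; cluster distance = mean pairwise distance.

    Average linkage is an assumption standing in for "Mean Proximity".
    Returns the merge history as (left members, right members, distance).
    """
    dist = {frozenset((i, j)): 1.0 - pearson(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)}

    def mean_dist(pair):
        a, b = pair
        return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=mean_dist)
        merges.append((sorted(names[i] for i in a),
                       sorted(names[i] for i in b),
                       mean_dist((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy placeholder texts, invented for illustration only.
samples = {
    "sample_A1": "the king is dead and the crown is lost",
    "sample_A2": "the king is dead and the crown is gone",
    "sample_B1": "zzz qqq xxx zzz qqq xxx jjj",
}
vectors = to_vectors([trigram_profile(t) for t in samples.values()])
for left, right, d in average_linkage(list(samples), vectors):
    print(left, "+", right, "at distance", round(d, 3))
```

The merge history is the information a dendrogram would display: texts with similar usage profiles join early at small distances, while stylistically distant texts join late. Real use would need full play texts and a principled feature selection, which this sketch does not attempt.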

【License】

CC BY
© 2015 by the author; licensee MDPI, Basel, Switzerland.
