The International Arab Journal of Information Technology
A Novel Approach of Clustering Documents: Minimizing Computational Complexities in
article
Mohammed Alghobiri [1], Khalid Mohiuddin [1], Mohammed Abdul Khaleel [2]
[1] Department of Management Information Systems, King Khalid University; Department of Computer Science, King Khalid University
Keywords: clustering algorithms; document categorization; document clustering; Hamiltonian graph; similarity measure
DOI: 10.34028/iajit/19/4/6
Subject classification: Computer Science (General)
Source: Zarqa University
【 Abstract 】

This study addresses the real-time issue of managing an academic program's documents in a university environment. In practice, classifying documents from a corpus is challenging when the dataset is large, and the complexity increases when specific document management requirements must be met. This study presents a practical approach to grouping documents based on a content similarity measure. The approach analyzes the performance of state-of-the-art clustering algorithms and considers Hamiltonian graph properties and a distance function. The distance function measures (1) the content similarity between documents and (2) the distances between the produced clusters. The proposed algorithm improves cluster quality by applying Hamiltonian graph properties. A significant characteristic of the proposed function is that it determines the document types from the corpus, so the number of clusters does not have to be assumed before the algorithm runs. The approach omits the arbitrary initial choice of k centroids required by the k-means algorithm, reduces computational complexity, and overcomes some limitations of commonly used clustering algorithms. It offers information systems developers an effective way to organize documents when designing document management systems.
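
The idea of grouping documents by content similarity without fixing the number of clusters in advance can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: the Hamiltonian graph step is not reproduced, and the bag-of-words vectors, the cosine distance, the `threshold` parameter, and the names `cosine_distance` and `cluster_documents` are all assumptions made for illustration only.

```python
import math
from collections import Counter
from typing import List


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty document is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def cluster_documents(docs: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Single-pass grouping: the number of clusters is not given up front.

    Each document joins the nearest existing cluster if its distance to that
    cluster's summed term vector is within `threshold` (an illustrative
    assumption); otherwise it starts a new cluster.
    """
    vectors = [Counter(d.lower().split()) for d in docs]
    centroids: List[Counter] = []
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        best, best_dist = None, threshold
        for c, centroid in enumerate(centroids):
            dist = cosine_distance(vec, centroid)
            if dist <= best_dist:
                best, best_dist = c, dist
        if best is None:
            centroids.append(Counter(vec))
            clusters.append([i])
        else:
            centroids[best].update(vec)  # accumulate term counts
            clusters[best].append(i)
    return clusters


docs = [
    "course syllabus course outline",
    "syllabus course outline weekly",
    "exam results grades report",
    "grades report exam marks",
]
print(cluster_documents(docs))  # → [[0, 1], [2, 3]]
```

Here the two syllabus documents and the two grade reports end up in separate groups, and the count of groups emerges from the data rather than from a preset k. The result of a single-pass scheme like this depends on document order and on the chosen threshold.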

【 License 】

Unknown   
