Journal of Engineering Research
Strategic enhancement of the collaborative framework for novelty in retrieval from digital textual data corpus by deploying DPSC and RBWM algorithms for forensic analysis
Shanmugam Mala Ganapathy Sankar [1]; Gowri Shanmugam [2]
[1] Sathyabama University
Keywords: Data management; document clustering; Google’s Crawler; preprocessing; semantic
DOI : | |
Subject classification: Social sciences, humanities and arts (general)
Source: Kuwait University * Academic Publication Council
【 Abstract 】
This paper proposes two algorithms embedded in an integrated system: the Dynamic Path Selection Clustering (DPSC) algorithm for document clustering and the Rearward Binary Window Match (RBWM) algorithm for the user’s search engine. The DPSC algorithm is derived from the concept of Google’s crawler technique, implemented in offline processing, and the RBWM algorithm is derived from the techniques of other search algorithms. The proposed system gives an appropriate data structure to the content of the input dataset. The input is the Enron dataset, which is large in volume and unstructured. The system integrates individual, independent units under one framework: data preprocessing, document clustering, mapping of clusters, and the search engine. This refined, integrated framework is expected to support evidence gathering better, because a system defined simply as a search engine for data retrieval tends to return more irrelevant information. Many existing systems in forensic departments consist only of a simple search engine; without any other processing steps, their retrieval shows a large degree of irrelevancy. Consequently, this integrated system, which is automated by means of the well-defined, configured units above, is proposed. This systematic approach is intended to make adequate use of digital textual evidence, which assists in quicker crime identification. The outcomes of the proposed system are analyzed by obtaining precision and recall values and comparing them with the results of metasearch engines such as Dogpile and Metacrawler, to test retrieval efficacy.
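The precision and recall measures mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard set-based definitions and is not the authors' evaluation code; the function name `precision_recall` and the document identifiers are hypothetical, chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs returned by the search engine (hypothetical input)
    relevant:  document IDs judged relevant for the query (hypothetical input)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    # Documents that were both returned and judged relevant.
    hits = len(retrieved_set & relevant_set)
    # Precision: fraction of returned documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were returned.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Illustrative query: four documents returned, three judged relevant, two in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Under these definitions, a search engine that returns fewer irrelevant documents raises precision, while one that misses fewer relevant documents raises recall, which is why the two values are reported together.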
【 License 】
Unknown