Journal Article Details
Information
A Machine Learning-Based Method for Content Verification in the E-Commerce Domain
Nikolaos Peppes [1], Theodoros Alexakis [1], Konstantinos Demestichas [1], Evgenia Adamopoulou [1]
[1] Institute of Communication and Computer Systems, National Technical University of Athens, 15773 Athens, Greece
Keywords: machine learning (ML); deep learning (DL); person similarity; content verification; feature importance; person fusion
DOI: 10.3390/info13030116
Source: DOAJ
[Abstract]

Analysis of extreme-scale data is an emerging research topic; the explosion in available data raises the need for suitable content verification methods and tools that reduce the analysis and processing time of various applications. Personal data, for example, are a very valuable source of information for purposes such as marketing, billing and forensics. However, extracting such data (referred to as person instances in this study) often yields duplicate or similar entries about persons that end users cannot easily detect. In this light, the authors present a machine learning- and deep learning-based approach to mitigate the problem of duplicate person instances. The main concept of this approach is to gather different types of information referring to persons, compare different person instances and predict whether they are similar or not. By using the Jaro algorithm to calculate person attribute similarity and by cross-examining the information available for person instances, recommendations can be provided to users on whether two person instances are similar. The degree of importance of each attribute was also examined, in order to gain better insight into which of the declared features play a more important role.
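
To make the attribute comparison step concrete, the following is a minimal sketch of the standard Jaro similarity applied attribute by attribute to two person instances. It is not the authors' implementation: the attribute names (`first_name`, `last_name`, `city`), the dictionary representation of a person instance, the lowercasing, and the helper `attribute_similarities` are all assumptions made for illustration. The abstract does not specify how the resulting scores are fed to the ML/DL models, so the sketch stops at producing the per-attribute score vector.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and no farther
    # apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order;
    # the number of transpositions is half that count.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def attribute_similarities(a: dict, b: dict, attributes: list) -> dict:
    """Per-attribute Jaro scores for two person instances.

    Hypothetical helper: a missing or empty attribute on either side is
    scored 0.0 here, which is an assumption, not something the paper states.
    """
    scores = {}
    for attr in attributes:
        va = (a.get(attr) or "").strip().lower()
        vb = (b.get(attr) or "").strip().lower()
        scores[attr] = jaro(va, vb) if va and vb else 0.0
    return scores


# Hypothetical person instances, for illustration only.
person_a = {"first_name": "Martha", "last_name": "Dixon", "city": "Athens"}
person_b = {"first_name": "Marhta", "last_name": "Dicksonx", "city": "Athens"}
scores = attribute_similarities(
    person_a, person_b, ["first_name", "last_name", "city"]
)
```

A score vector of this kind could serve as the input features of a classifier that predicts whether two person instances refer to the same person, which is also where a per-attribute importance analysis would apply.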

[License]

Unknown   
