Journal article details
Crime Science
Supporting crime script analyses of scams with natural language processing
Research
Daniel Birks [1], Zeya Lwin Tun [2]
[1] School of Law, University of Leeds, Leeds, UK; School of Mathematics, University of Leeds, Leeds, UK
Keywords: Scams; Crime; Policing; Crime script analysis; Unstructured data; Natural language processing; Term frequency-inverse document frequency; Doc2Vec
DOI: 10.1186/s40163-022-00177-w
Received 2022-02-01, accepted 2022-11-12, published 2022
Source: Springer
【 Abstract 】

In recent years, internet connectivity and the ubiquitous use of digital devices have afforded a landscape of expanding opportunity for the proliferation of scams involving attempts to deceive individuals into giving away money or personal information. The impacts of these schemes on victims have been shown to encompass social, psychological, emotional and economic harms. Consequently, there is a strong rationale to enhance our understanding of scams in order to devise ways in which they can be disrupted. One way to do so is through crime scripting, an analytical approach which seeks to characterise the processes underpinning crime events. In this paper, we explore how Natural Language Processing (NLP) methods might be applied to support crime script analyses, in particular to extract insights into crime event sequences from large quantities of unstructured textual data in a scalable and efficient manner. To illustrate this, we apply NLP methods to a public dataset of victims’ stories of scams perpetrated in Singapore. We first explore approaches to automatically isolate scams with similar modus operandi using two distinct similarity measures. Subsequently, we use Term Frequency-Inverse Document Frequency (TF-IDF) to extract key terms in scam stories, which are then used to identify a temporal ordering of actions in ways that seek to characterise how a particular scam operates. Finally, by means of a case study, we demonstrate how the proposed methods are capable of leveraging the collective wisdom of multiple similar reports to identify a consensus in terms of likely crime event sequences, illustrating how NLP may in the future enable crime preventers to harness unstructured free-text data to better understand crime problems.
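The pipeline outlined in the abstract (group similar reports, extract key terms with TF-IDF, then order those terms by where they occur) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the three toy reports are invented, the weighting is a plain term-frequency times log inverse-document-frequency variant, cosine similarity stands in for one possible similarity measure, and the helper names `tfidf_vectors`, `cosine`, `top_terms` and `ordered_key_terms` are hypothetical. Doc2Vec, which appears in the keywords, is not shown.

```python
import math
from collections import Counter

# Hypothetical toy reports, invented for illustration (not from the paper's dataset).
REPORTS = [
    "caller claimed to be bank officer and asked for otp",
    "caller claimed to be police officer and asked for bank transfer",
    "seller on marketplace asked for deposit then never delivered phone",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_terms(vector, k=3):
    """The k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def ordered_key_terms(text, key_terms):
    """Order key terms by first occurrence in the report, a crude proxy
    for the temporal ordering of actions within one story."""
    tokens = tokenize(text)
    present = set(key_terms) & set(tokens)
    return sorted(present, key=tokens.index)


if __name__ == "__main__":
    vecs = tfidf_vectors(REPORTS)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
    print(top_terms(vecs[0], k=1))
    print(ordered_key_terms(REPORTS[0], ["otp", "caller", "bank"]))
```

In this toy setting the two impersonation reports score as more similar to each other than either does to the marketplace report, and terms shared by every report receive zero weight. A realistic analysis would need stop-word handling, a larger corpus, and some way of reconciling term orderings across many reports.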

【 License 】

CC BY   
© The Author(s) 2023
