Crime Science | |
Supporting crime script analyses of scams with natural language processing | |
Research | |
Daniel Birks [1], Zeya Lwin Tun [2] | |
[1] School of Law, University of Leeds, Leeds, UK; [2] School of Mathematics, University of Leeds, Leeds, UK | |
Keywords: Scams; Crime; Policing; Crime script analysis; Unstructured data; Natural language processing; Term frequency-inverse document frequency; Doc2Vec | |
DOI : 10.1186/s40163-022-00177-w | |
Received 2022-02-01, accepted 2022-11-12, published 2022 | |
Source: Springer | |
【 Abstract 】
In recent years, internet connectivity and the ubiquitous use of digital devices have afforded a landscape of expanding opportunity for the proliferation of scams involving attempts to deceive individuals into giving away money or personal information. The impacts of these schemes on victims have been shown to encompass social, psychological, emotional and economic harms. Consequently, there is a strong rationale to enhance our understanding of scams in order to devise ways in which they can be disrupted. One way to do so is through crime scripting, an analytical approach which seeks to characterise the processes underpinning crime events. In this paper, we explore how Natural Language Processing (NLP) methods might be applied to support crime script analyses, in particular to extract insights into crime event sequences from large quantities of unstructured textual data in a scalable and efficient manner. To illustrate this, we apply NLP methods to a public dataset of victims’ stories of scams perpetrated in Singapore. We first explore approaches to automatically isolate scams with similar modus operandi using two distinct similarity measures. Subsequently, we use Term Frequency-Inverse Document Frequency (TF-IDF) to extract key terms in scam stories, which are then used to identify a temporal ordering of actions that characterises how a particular scam operates. Finally, by means of a case study, we demonstrate how the proposed methods can leverage the collective wisdom of multiple similar reports to identify a consensus on likely crime event sequences, illustrating how NLP may in the future enable crime preventers to harness unstructured free-text data to better understand crime problems.
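The two ideas named in the abstract, scoring key terms with TF-IDF and comparing reports by similarity, can be illustrated with a minimal plain-Python sketch. Everything in it is an assumption made for illustration: the three toy reports are invented and not taken from the paper's dataset, the helper names (`tokenize`, `tf_idf`, `cosine`, `top_terms`) are hypothetical, and cosine similarity over TF-IDF vectors is used only as one plausible similarity measure, not necessarily either of the two the authors evaluate.

```python
import math
from collections import Counter

# Hypothetical toy reports, invented for illustration only.
reports = [
    "caller claimed to be bank officer and asked for otp",
    "caller claimed to be police officer and asked for bank transfer",
    "seller on marketplace asked for deposit then stopped replying",
]


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop-word removal)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_terms(weights, k=3):
    """The k highest-weighted terms; ties are broken alphabetically."""
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]


vectors = tf_idf(reports)
sim_01 = cosine(vectors[0], vectors[1])  # two impersonation-style reports
sim_02 = cosine(vectors[0], vectors[2])  # impersonation vs. marketplace report
```

In this toy corpus the two impersonation-style reports score as more similar to each other than either does to the marketplace report, and terms that occur in every report (such as "asked") receive zero weight, which is why TF-IDF tends to surface the terms that distinguish one kind of scam from another.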
【 License 】
CC BY
© The Author(s) 2023