Statistical Analysis and Data Mining
APT malware static trace analysis through bigrams and graph edit distance
Bolton, Alexander D. [1]; Anderson-Cook, Christine M. [2]
[1] Imperial College London, Department of Mathematics, London, UK; [2] Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, New Mexico
Keywords: call graph; family detection; malware detection; random forest; simulated annealing
DOI: 10.1002/sam.11346
Subject classification: Social Sciences, Humanities and Arts (General)
Source: John Wiley & Sons, Inc.
【 Abstract 】
Research and business organizations are vulnerable to attack by malware, particularly advanced persistent threat malware tailored for a specific target. Malware identification is made more difficult because samples can be subtly altered to avoid detection by methods that check for an identical match to known code. Different versions of an original piece of malware form a malware family. When new malicious software is identified, reverse engineers seek to identify its origin and purpose. Knowing whether new malware is from a known family or a previously unobserved family aids the efficiency of reverse engineers. This article presents a three-stage method to classify new malware into a family by comparing its similarity to existing static traces, and assigning it to the most similar family. First, a fast filtering method creates a shortlist of samples with some similarity to the new malware, using a simple bigram comparison of the instructions. The second stage takes the call graph view of the shortlisted static traces and uses simulated annealing to estimate the graph edit distance, a measure of dissimilarity between graphs. Finally, a random forest classifier combines the previous two results to predict the family to which a new sample belongs. The paper also considers how to detect when malware is from a new family.
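The first stage described above, a fast bigram filter that shortlists similar samples, can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so the weighted Jaccard score below is an assumption, as are the function names, the toy instruction traces, and the shortlist size `k`; none of these come from the paper.

```python
from collections import Counter


def bigrams(instructions):
    """Count adjacent instruction pairs in one static trace."""
    return Counter(zip(instructions, instructions[1:]))


def bigram_similarity(trace_a, trace_b):
    """Weighted Jaccard similarity of two bigram count profiles.

    Assumption: the paper's actual similarity measure is not given in
    the abstract; this is one common choice, used here for illustration.
    """
    counts_a, counts_b = bigrams(trace_a), bigrams(trace_b)
    keys = set(counts_a) | set(counts_b)
    union = sum(max(counts_a[k], counts_b[k]) for k in keys)
    if union == 0:
        return 0.0  # neither trace has any bigram
    shared = sum(min(counts_a[k], counts_b[k]) for k in keys)
    return shared / union


def shortlist(new_trace, known_traces, k=2):
    """Return the names of the k known samples most similar to new_trace."""
    ranked = sorted(
        known_traces.items(),
        key=lambda item: bigram_similarity(new_trace, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical toy traces; real static traces are far longer.
known = {
    "a1": ["push", "mov", "call", "ret"],
    "b1": ["xor", "xor", "jmp", "nop"],
    "a2": ["push", "mov", "call", "pop", "ret"],
}
new_sample = ["push", "mov", "call", "ret"]
candidates = shortlist(new_sample, known, k=2)
```

Only the shortlisted candidates would then go on to the more expensive graph edit distance comparison, which is the reason for filtering cheaply first.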
【 License 】
Unknown