Journal article

【Abstract】

Tackling binary program analysis problems has traditionally implied manually defining rules and heuristics, a tedious and time-consuming task for human analysts. In order to improve automation and scalability, we propose an alternative direction based on distributed representations of binary programs with applicability to a number of downstream tasks. We introduce Bin2vec, a new approach leveraging Graph Convolutional Networks (GCN) along with computational program graphs in order to learn a high-dimensional representation of binary executable programs. We demonstrate the versatility of this approach by using our representations to solve two semantically different binary analysis tasks: functional algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as published results, and demonstrate improvement over state-of-the-art methods for both tasks. We evaluated Bin2vec on 49191 binaries for the functional algorithm classification task, and on 30 different CWE-IDs including at least 100 CVE entries each for the vulnerability discovery task. We set a new state-of-the-art result by reducing the classification error by 40% compared to the source-code-based inst2vec approach, while working on binary code. For almost every vulnerability class in our dataset, our prediction accuracy is over 80% (and over 90% in multiple classes).

【License】

CC BY


Cybersecurity
Bin2vec: learning representations of binary executable programs for security tasks
article
Arakelyan, Shushan¹; Arasteh, Sima¹; Hauser, Christophe¹; Kline, Erik¹; Galstyan, Aram¹
[1] Information Sciences Institute
Keywords: Binary program analysis; Computer security; Vulnerability discovery; Neural networks
DOI: 10.1186/s42400-021-00088-4
Subject classification: Social sciences, humanities, and arts (general)
Source: Springer