Zhang, Qinghua; S. Purushothaman Iyer, Committee Member; Peng Ning, Committee Member; Wenye Wang, Committee Member; Douglas S. Reeves, Committee Chair
Software attacks are a serious problem. Conventional anti-malware software expects malicious software (malware) to contain fixed and known code. Malware writers have devised methods of concealing or constantly changing their attacks to evade anti-malware software. Two important recent techniques are polymorphism, which makes use of code encryption, and metamorphism, which uses a variety of code obfuscation techniques. This dissertation presents three new techniques for detecting such malware.

The first technique recognizes, in network traffic, polymorphic malware that is encrypted and that self-decrypts before launching its attack. We propose a new approach that combines static analysis and instruction emulation to more accurately identify the starting location and instructions of the decryption routine, which is characteristic of such malware, even if self-modifying code is used. This method has been implemented and tested on current polymorphic exploits, including ones generated by state-of-the-art polymorphic engines. All exploits have been detected (i.e., a 100% detection rate), including those for which the decryption routine is dynamically coded or self-modifying. The method has also been tested on benign network traffic and Windows executables. The false positive rates are approximately 0.0002% and 0.01% for these two categories, respectively. Running time is approximately linear in the size of the network payload being analyzed, with throughput between 1 and 2 MB/s.

The second technique is a means of recognizing metamorphic malware, which has a transformed program image with equivalent or updated functionality. We propose a new approach that uses fully automated static analysis of executables to summarize and compare program semantics, based primarily on the pattern of library or system functions that are called.
This method has been prototyped and evaluated using randomized benchmark programs, instances of known malware program variants, and utility software available in multiple releases. The results demonstrate three important capabilities of the proposed method: (a) it does well at identifying metamorphic variants of common malware; (b) it distinguishes easily between programs that are not related; and (c) it can identify and detect program variations, or code reuse. Such variations can be due to the insertion of malware (such as viruses) into the executable of a host program.

The third technique improves the applicability of the semantic metamorphic malware detector presented as the second technique of this dissertation. We propose an automated approach to generate common malware behavior patterns for detecting metamorphic malware or new malware instances. This method combines static analysis and data-mining techniques. It has been prototyped and evaluated on real-world malicious bot software and benign Windows programs. In an experimental comparison with the metamorphic malware detector, this method reduces the population of semantic patterns needed to detect known and new malware instances by about 80%. It is also more robust to a junk behavior pollution attack than the metamorphic malware detector is. A set of experiments was performed to test the quality of the common behavior patterns generated with different parameter configurations. Two optimized common behavior patterns were obtained. Their detection rates and true false positive rates are 94% and 8.3% for the first pattern, and 78% and 0.32% for the second. As an indirect comparison and simple reference, both detection rates (94% and 78%) are more than double the 33.75% detection rate of signature-based methods on unknown malware programs reported in a recent paper [1].
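
The idea of a common behavior pattern shared by several malware instances can be illustrated with a small sketch. This is not the dissertation's algorithm: the abstract does not specify the data-mining technique, so the sketch assumes a simple frequency threshold over sets of called functions. The function names `common_pattern` and `matches`, the parameters `min_support` and `min_coverage`, and all sample data are hypothetical.

```python
from collections import Counter


def common_pattern(samples, min_support):
    """Return the calls seen in at least a `min_support` fraction of samples.

    Assumption: a "behavior" is just the name of a called library or
    system function; the real method may use richer semantics.
    """
    counts = Counter(call for calls in samples for call in set(calls))
    threshold = min_support * len(samples)
    return {call for call, n in counts.items() if n >= threshold}


def matches(program_calls, pattern, min_coverage):
    """True if the program exhibits at least `min_coverage` of the pattern."""
    if not pattern:
        return False
    covered = len(pattern & set(program_calls)) / len(pattern)
    return covered >= min_coverage


# Hypothetical call sets for three variants of one bot family.
bot_variants = [
    {"socket", "connect", "recv", "CreateProcessA", "RegSetValueExA"},
    {"socket", "connect", "recv", "CreateProcessA", "Sleep"},
    {"socket", "connect", "recv", "RegSetValueExA", "GetTickCount"},
]

# One pattern stands in for all three variants.
pattern = common_pattern(bot_variants, min_support=1.0)

# A new variant padded with unrelated ("junk") calls, and a benign program.
new_variant = {"socket", "connect", "recv", "MessageBoxA", "GetTickCount"}
benign = {"CreateFileA", "ReadFile", "CloseHandle"}

print(matches(new_variant, pattern, min_coverage=1.0))  # True
print(matches(benign, pattern, min_coverage=1.0))       # False
```

In this toy setting, a single pattern replaces three per-variant patterns, and extra junk calls do not affect the match because only coverage of the pattern is checked. Lowering `min_support` or `min_coverage` makes matching more permissive, which would be expected to trade a higher detection rate for more false positives, in the spirit of the parameter configurations mentioned above.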