Twinner: A framework for automated software deobfuscation

Document Type : Research Article

Authors

Department of Computer Engineering, Sharif University of Technology, Tehran, P.O. Box 11155/1639, Iran

DOI: 10.24200/sci.2019.21601

Abstract

Malware analysis is essential to understanding the internal logic and intent of malware programs in order to mitigate their threats. As analysis methods have evolved, malware authors have adopted more techniques, such as virtualization obfuscation, to protect the inner workings of their malware. This manuscript presents a framework for deobfuscating software that abstracts the input program into a mathematical model of its behavior by monitoring every operation performed during execution. In addition, the program is automatically guided through its different execution paths in order to gather as much knowledge as possible in the shortest time span. This makes it possible to find hidden logic and to deobfuscate different obfuscation techniques without depending on their specific details. The resulting model is then recoded as a C program without the artificially added complexities. This code is called a twincode and behaves in the same manner as the obfuscated binary. As a proof of concept, the proposed framework is implemented and its effectiveness is evaluated on obfuscated binaries. Program control flow graphs are inspected as a measure of successful code recovery. The performance of the proposed framework is evaluated using the set of SPEC test programs.
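
To make the idea of a twincode concrete, the following is a toy sketch in C and not output of the framework itself. It assumes a hypothetical stack-based bytecode (the OP_* opcodes, the program array, and all function names are invented for illustration) standing in for virtualization obfuscation, and places next to it a hand-written equivalent of the kind a recovered twincode might resemble. The helper count_mismatches only compares the two versions on concrete inputs; it does not model the path exploration or constraint solving described in the abstract.

```c
#include <assert.h>

/* Hypothetical opcodes of a toy stack VM (illustration only). */
enum { OP_PUSH_ARG, OP_PUSH_CONST, OP_ADD, OP_MUL, OP_JLT, OP_RET };

/* Bytecode hiding the logic: if (x < 10) return x + 3; else return x * 2; */
static const int program[] = {
    OP_PUSH_ARG,              /* index 0                              */
    OP_PUSH_CONST, 10,        /* index 1-2                            */
    OP_JLT, 10,               /* index 3-4: jump to 10 if arg < 10    */
    OP_PUSH_ARG,              /* index 5                              */
    OP_PUSH_CONST, 2,         /* index 6-7                            */
    OP_MUL,                   /* index 8                              */
    OP_RET,                   /* index 9                              */
    OP_PUSH_ARG,              /* index 10                             */
    OP_PUSH_CONST, 3,         /* index 11-12                          */
    OP_ADD,                   /* index 13                             */
    OP_RET                    /* index 14                             */
};

/* "Obfuscated" version: the real logic is buried in an interpreter loop. */
int obfuscated(int arg)
{
    int stack[8];
    int sp = 0;   /* next free stack slot */
    int pc = 0;   /* index of the next bytecode word */

    for (;;) {
        switch (program[pc++]) {
        case OP_PUSH_ARG:
            stack[sp++] = arg;
            break;
        case OP_PUSH_CONST:
            stack[sp++] = program[pc++];
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_JLT: {
            int target = program[pc++];
            int b = stack[--sp];
            int a = stack[--sp];
            if (a < b)
                pc = target;
            break;
        }
        case OP_RET:
            return stack[--sp];
        default:
            return -1;   /* unknown opcode; unreachable for this program */
        }
    }
}

/* Hand-written equivalent without the interpreter layer. */
int twincode(int x)
{
    if (x < 10)
        return x + 3;
    return x * 2;
}

/* Number of inputs in [lo, hi] on which the two versions disagree. */
int count_mismatches(int lo, int hi)
{
    int mismatches = 0;
    for (int x = lo; x <= hi; x++)
        if (obfuscated(x) != twincode(x))
            mismatches++;
    return mismatches;
}

/* Sanity check over a small range that avoids integer overflow. */
void check_twincode(void)
{
    assert(count_mismatches(-1000, 1000) == 0);
}
```

Calling check_twincode() from a main function exercises both branches of the hidden condition. Agreement on sampled inputs is only evidence of equivalence, not proof, which is one reason the abstract also inspects control flow graphs as a measure of recovery.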

Scientia Iranica

Volume 26, Special Issue on machine learning, data analytics, and advanced optimization techniques...
Transactions on Computer Science & Engineering and Electrical Engineering (D)
November and December 2019
Pages 3485-3509

History

Receive Date: 29 July 2017
Revise Date: 06 October 2018
Accept Date: 12 October 2019
