Semantics-Aware Malware Detection

To protect their intellectual property, software companies generally distribute their applications and commercial off-the-shelf (COTS) components in binary form, without the original source code. Analyzing software binaries is important for a number of reasons, including quality assurance, software protection, and reverse engineering. Under this contract, GrammaTech leveraged the success of CodeSonar®/C to produce a binary-analysis tool, CodeSonar/SWYX, that can satisfy these needs. The mnemonic SWYX (See What You eXecute) captures the notion that the analysis is performed on the actual binary code that executes, rather than on the source code that the compiler later translates. (Analyzing the code that executes may be preferable even when source code is available.)

A salient feature of CodeSonar/SWYX is its use of TSL (Transformer Specification Language) to define the semantics of machine-code instructions. TSL provides an architecture-independent way to specify analyses: an analysis, defined only once, applies to the machine code of any computer architecture for which the instruction-set semantics has been defined. Thus, our approach to implementing CodeSonar/SWYX is expected to extend vulnerability detection to additional architectures with less investment than would be required to build such a tool in an ad hoc manner. Our use of TSL is the result of a close collaboration with colleagues at the University of Wisconsin-Madison.
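The "define once, apply to any architecture" idea can be illustrated with a small sketch. This is not TSL syntax and not CodeSonar/SWYX code. The two toy instruction sets, the `const`/`add` primitive operations, and every function and class name below are hypothetical and chosen only for illustration. The sketch assumes each instruction set's semantics is written in terms of a shared set of primitive operations, so that an analysis that reinterprets those primitives (here, a simple sign analysis) works unchanged on both.

```python
# Hypothetical primitive operations shared by all instruction-set descriptions:
#   ("const", dst, value)   dst := value
#   ("add", dst, a, b)      dst := a + b

def toy_cisc_semantics(insn):
    """Toy two-operand ISA (made up): 'add dst, src' means dst := dst + src."""
    op, dst, src = insn
    if op == "mov_imm":
        return [("const", dst, src)]
    if op == "add":
        return [("add", dst, dst, src)]
    raise ValueError(f"unknown instruction: {op}")

def toy_risc_semantics(insn):
    """Toy three-operand ISA (made up): 'add dst, a, b' means dst := a + b."""
    op, *args = insn
    if op == "li":
        dst, value = args
        return [("const", dst, value)]
    if op == "add":
        dst, a, b = args
        return [("add", dst, a, b)]
    raise ValueError(f"unknown instruction: {op}")

class Concrete:
    """Ordinary execution: primitives operate on integers."""
    def const(self, value):
        return value
    def add(self, a, b):
        return a + b

class Sign:
    """Sign analysis: primitives operate on '+', '-', '0', or '?' (unknown)."""
    def const(self, value):
        return "+" if value > 0 else "-" if value < 0 else "0"
    def add(self, a, b):
        if a == "0":
            return b
        if b == "0":
            return a
        return a if a == b else "?"

def run(semantics, interpretation, program):
    """Evaluate a program under a given ISA semantics and interpretation."""
    state = {}
    for insn in program:
        for prim in semantics(insn):
            if prim[0] == "const":
                _, dst, value = prim
                state[dst] = interpretation.const(value)
            elif prim[0] == "add":
                _, dst, a, b = prim
                state[dst] = interpretation.add(state[a], state[b])
    return state

cisc_program = [("mov_imm", "eax", 2), ("mov_imm", "ebx", 3), ("add", "eax", "ebx")]
risc_program = [("li", "r1", 2), ("li", "r2", 3), ("add", "r3", "r1", "r2")]

# The Sign analysis is written once and runs on both toy architectures.
cisc_signs = run(toy_cisc_semantics, Sign(), cisc_program)
risc_signs = run(toy_risc_semantics, Sign(), risc_program)
cisc_values = run(toy_cisc_semantics, Concrete(), cisc_program)
```

In this sketch, supporting a further architecture means writing only one more `*_semantics` function. The `Sign` class and the `run` driver stay untouched, which is the kind of saving the paragraph above describes.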

CodeSonar/SWYX can detect familiar vulnerabilities, such as buffer overruns, null pointer dereferences, division by zero, and unreachable code.

Additionally, tight integration with the CodeSonar/C COTS product gives CodeSonar/SWYX the ability to analyze projects comprising both binary and source-code components. In particular, reusing CodeSonar/C's models of library functions enables CodeSonar/SWYX to detect vulnerabilities caused by misuse of common APIs, such as double frees, uses of memory after it has been freed, and file-access race conditions. More generally, any project that relies on a COTS component provided as a library may benefit from this mixed-language capability.
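As a rough illustration of how a model of library functions can expose API misuse, the sketch below checks a recorded sequence of heap events for double frees and uses after free. It is a toy under stated assumptions, not CodeSonar's actual models or algorithm: the event-trace format and the name `check_heap_usage` are hypothetical, and a real static analyzer reasons over all program paths rather than a single trace.

```python
def check_heap_usage(trace):
    """Scan a list of (event, pointer) pairs for heap API misuse.

    Events (hypothetical format): "malloc", "free", "use".
    Returns a list of (index, problem, pointer) warnings.
    """
    freed = set()      # pointers that have been freed and not reallocated
    warnings = []
    for index, (event, pointer) in enumerate(trace):
        if event == "malloc":
            # Model of malloc: the pointer refers to live memory again.
            freed.discard(pointer)
        elif event == "free":
            # Model of free: freeing an already-freed pointer is a defect.
            if pointer in freed:
                warnings.append((index, "double free", pointer))
            else:
                freed.add(pointer)
        elif event == "use":
            # Any access through a freed pointer is a defect.
            if pointer in freed:
                warnings.append((index, "use after free", pointer))
    return warnings

trace = [
    ("malloc", "p"),
    ("use", "p"),
    ("free", "p"),
    ("use", "p"),    # use after free
    ("free", "p"),   # double free
]
warnings = check_heap_usage(trace)
```

Because the check depends only on the modeled behavior of `malloc` and `free`, the same check applies whether the events come from a source-code component or a binary one.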


© 2007-2012, GrammaTech, Inc. All rights reserved.
The Synthesizer Generator, Ada-ASSURED, Ada-Utilities, and SmashProof are trademarks of GrammaTech, Inc. CodeSurfer and CodeSonar are registered trademarks of GrammaTech, Inc.