As discussed in the previous post, the basic malware analysis method is static analysis. In this post we look at static analysis in depth and walk through its steps on a live sample. Before we start analyzing malware samples, it helps to understand what kind of information can be extracted during the analysis. The basic characteristics that can be identified during this phase are listed below:
- File Type Information
This step identifies the real type of the file. Malware often deceives users by displaying a Word or PDF icon while actually being an executable. Identifying the file signature and associated information, such as the compiler used and whether the file is packed or compressed, helps to see past the visual appearance of the malware.
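As a minimal sketch of signature-based identification, the snippet below compares the first bytes of a file against a few well-known magic values. The function names and the signature table are my own illustrative choices and cover only a handful of formats; dedicated tools know far more.

```python
# Illustrative subset of file signatures (magic bytes); not exhaustive.
MAGIC_SIGNATURES = [
    (b"MZ", "Windows executable (MZ/PE)"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also DOCX/XLSX/JAR)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy DOC/XLS)"),
]


def identify_file_type(header):
    """Return a description for the first matching signature, else 'unknown'."""
    for magic, description in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"


def identify_path(path):
    """Read the first bytes of a file on disk and identify them."""
    with open(path, "rb") as handle:
        return identify_file_type(handle.read(16))
```

A file named invoice.pdf whose content starts with MZ would be reported as a Windows executable, which is exactly the mismatch we are looking for.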
- Fingerprinting the File
Fingerprinting involves generating a hash of the sample. The hash can be referred to later in the analysis to determine whether the sample is identical to malware already reported in the community. Its value for finding related samples is limited, because changing even a single bit alters the hash of a file. It is, however, quite useful for comparative analysis when malware drops executables during execution: the hashes can be compared to see whether the same sample has been dropped or something different.
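Generating the fingerprint can be sketched with Python's standard hashlib module. The helper names below are hypothetical; the choice of MD5, SHA-1 and SHA-256 is simply a common combination, not something the tools in this post require.

```python
import hashlib


def fingerprint(data):
    """Return common hashes of a byte string as hex digests."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def fingerprint_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples need not fit in memory."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Comparing fingerprint(original) with fingerprint(dropped_file) tells us immediately whether a dropped executable is a byte-for-byte copy of the original sample.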
- AV Reputation
Once we have the hashes from the previous stage, we can look them up in community AV services to find out whether the file has already been reported by any AV engine.
- Strings Identification
In this step, we look into the sample content to identify readable strings. The aim is to find hints about the activities performed by the malware sample, such as URLs, IP addresses, imports and exports. The major challenge arises when no meaningful strings are shown, because coders use obfuscation techniques to avoid detection.
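What a strings utility does can be approximated in a few lines. This is a rough sketch, assuming we only care about printable ASCII and UTF-16LE ("wide") strings of at least four characters; the function name and the minimum length are my own choices.

```python
import re


def extract_strings(data, min_length=4):
    """Return printable ASCII and UTF-16LE strings found in raw bytes."""
    # Runs of printable ASCII characters.
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    # Runs of printable characters each followed by a null byte (wide strings).
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found
```

Running this over the bytes of a sample and filtering the result for things like "http" or ".dll" gives a quick first impression before opening a dedicated tool.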
- File Obfuscation
As discussed in the previous step, when coders use obfuscation techniques to avoid detection, string utilities fail to extract the content. Packers and crypters are used to hide the content.
Packers, also known as run-time packers, produce self-unpacking executables. They were originally used to reduce file size through compression. With today's bandwidth and storage that purpose is largely obsolete, so packing now mainly serves to hide the content.
A crypter uses encryption to hide the content, making it difficult for the analyst to reverse the malware. XOR is one of the simplest techniques and is still widely used when creating malware.
A protector is another tool coders use to obfuscate malware. It usually combines a packer with a crypter and aims to make the file tamper-proof, which makes reverse engineering harder.
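To illustrate why XOR is so popular with crypters, here is a small sketch. XOR is its own inverse, so the same routine encodes and decodes. The brute-force helper assumes a single-byte key and that we know a marker (such as "http") that should appear in the decoded content; both function names are hypothetical.

```python
def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def brute_force_single_byte(data, marker):
    """Try every non-zero single-byte key and keep results containing marker."""
    hits = []
    for key in range(1, 256):
        candidate = xor_bytes(data, bytes([key]))
        if marker in candidate:
            hits.append((key, candidate))
    return hits
```

A plain strings search would miss a URL encoded this way, but trying the 255 possible single-byte keys is trivial. Real samples may use longer keys or stronger encryption, where this approach does not apply.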
- PE Headers
Portable Executable (PE) is the file format for executables, DLLs, object code etc. on 32-bit and 64-bit Windows. When the file is loaded into memory, the PE header provides the required information, such as function and resource dependencies and memory addresses. Analyzing the PE headers can reveal a good deal about the behavior of the sample. Items worth examining include:
- File Dependencies
- Imports
- Exports
- Compilation Time Stamp
- Resources
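Some of these fields can be read straight from the file with the standard struct module. The sketch below parses only the COFF file header that follows the PE signature; the function name and the returned dictionary layout are my own, and tools such as PE Studio extract far more (imports, exports, resources).

```python
import struct
from datetime import datetime, timezone

# Small illustrative subset of machine type values.
MACHINE_NAMES = {0x014C: "i386", 0x8664: "x86-64"}


def parse_pe_header(data):
    """Read basic fields from the COFF header of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    (machine, section_count, timestamp, _symbols_ptr, _symbols_count,
     optional_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": section_count,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "optional_header_size": optional_size,
        "is_dll": bool(characteristics & 0x2000),
    }
```

Keep in mind that the compilation timestamp is just a field in the file and can be forged by the author.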
- Classification of Malware
In this step, we try to relate the sample under investigation to an already known malware family or threat actor group.
- Fuzzy hashing is a technique used to identify similar samples. It compares context-triggered piecewise hashes to deduce a similarity relation between samples.
- Import hashing compares a hash computed over the sample's imports, which helps in identifying related samples and threat actor groups.
- Section hashing can also help in identifying similar malware by comparing hashes of the individual sections of the PE file.
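Of the three, section hashing is the easiest to sketch with the standard library alone. The snippet below walks the section table and computes an MD5 over the raw data of each section. The function name is hypothetical and the parsing is deliberately minimal, with no handling of malformed or truncated files.

```python
import hashlib
import struct


def section_hashes(data):
    """Return a list of (section name, MD5 of raw section data)."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[:2] != b"MZ" or data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    (section_count,) = struct.unpack_from("<H", data, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    # The section table follows the 24-byte signature + COFF header
    # and the optional header; each entry is 40 bytes long.
    table = pe_offset + 24 + optional_size
    result = []
    for index in range(section_count):
        name, _vsize, _vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * index)
        label = name.rstrip(b"\x00").decode("ascii", "replace")
        digest = hashlib.md5(data[raw_ptr:raw_ptr + raw_size]).hexdigest()
        result.append((label, digest))
    return result
```

Two samples whose overall hashes differ but whose .text hashes match are likely to share the same code with only data or resources changed.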
Let’s start with a very simple sample. The malware sample is mentioned in the book and the samples can be found at this link. The provided samples were written for Windows XP but are still valid on newer versions. I have taken lab01-01.exe as the sample under investigation. The snapshot of the analysis machine is below.
The snapshot below shows the file type information. The MZ (4D 5A) bytes at the start of the file confirm that it is a PE file.
Copy the MD5 hash and search for it on VirusTotal to check whether the same file has been reported as malicious by any of the engines. The snapshot below confirms that 38 engines have flagged it as malicious.
By now it is clear that the sample is malicious. To understand it better, we can investigate what the malware does. Let’s check the human-readable text in the malware using a strings utility. I have used PE Studio for the analysis below. The strings show that the malware will try to create, copy and find files, among other things. Since the string output is readable, the file does not seem to be obfuscated. We can verify this using ExeInfo.
Let’s go back to PE Studio and extract information from the PE header. We can see that it requires only two libraries.
- kernel32.dll handles memory management, input/output operations and interrupts.
- msvcrt.dll is a module containing standard C functions like printf/memcpy etc. and is not reported to be malicious.
The screenshot below lists the imported functions. PE Studio automatically highlights the blacklisted imports.
The sections view confirms that the sample is not packed, as there is not much difference between raw size and virtual size. In a packed sample, the raw size would be much smaller than the virtual size, because the content is unpacked into memory at run time. The compiler timestamp is shown below. It is generally used to understand the attack campaign and to correlate with samples in the wild.
Since the sample did not try to deceive users, nothing was found under PE resources. Deceptive icons or decoy files are generally found in the resources tab.
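The raw-size versus virtual-size check can be expressed as a simple heuristic. This is only a sketch: the function name and the ratio threshold of 4 are my own assumptions, and the section sizes in the usage note are made-up illustrative values, not the ones from lab01-01.exe.

```python
def looks_packed(sections, ratio_threshold=4.0):
    """Return names of sections whose virtual size dwarfs their raw size.

    sections is an iterable of (name, raw_size, virtual_size) tuples.
    """
    suspicious = []
    for name, raw_size, virtual_size in sections:
        if virtual_size == 0:
            continue
        # No data on disk but space reserved in memory, or a large
        # expansion at load time, both hint at run-time unpacking.
        if raw_size == 0 or virtual_size / raw_size >= ratio_threshold:
            suspicious.append(name)
    return suspicious
```

For example, looks_packed([("UPX0", 0, 0x5000), ("UPX1", 0x2000, 0x2000)]) returns ["UPX0"], while sections with similar raw and virtual sizes return an empty list. Uninitialized data sections can also trigger this check in benign files, so treat a hit as a reason to look closer rather than as proof.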
I checked whether the sample can be flagged based on its PE import hash (imphash). I computed the hash and looked it up, which returned the information below. The filename in the result can differ because the lookup matches on identical import hash in the database; Lab07-03.exe has the same imphash.
After consolidating all the points, we can say that the static properties of the sample indicate that it is malicious. We will use the same sample in the next phase, basic dynamic analysis.
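The idea behind the import hash can be sketched as follows. To my understanding, common tooling lowercases each imported library and function name, strips the library extension, joins the pairs as "library.function" in import-table order separated by commas, and takes the MD5 of that string. The function below is my own simplified version, assuming named imports only; imports by ordinal need extra lookup tables that are left out here.

```python
import hashlib


def import_hash(imports):
    """Compute an imphash-style MD5 from (dll, function) pairs in table order."""
    parts = []
    for dll, function in imports:
        library = dll.lower()
        # Drop the usual library extensions before hashing.
        for extension in (".dll", ".ocx", ".sys"):
            if library.endswith(extension):
                library = library[: -len(extension)]
                break
        parts.append(f"{library}.{function.lower()}")
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()
```

Because only the import list and its order feed into the hash, two different files built from the same code base can share an imphash, which is why a lookup can return a file with a different name.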