Introduction to Malware Analysis
0x00
Malware Definition
Malware, short for malicious software, is a term encompassing various types of software designed to infiltrate, exploit, or damage computer systems.
Although all malware is utilized for malicious intents, the specific objectives of malware can vary among different threat actors. These objectives commonly fall into several categories:
- disrupting host system operations
- stealing critical information, including personal and financial data
- gaining unauthorized access to systems
- conducting espionage activities
- sending spam messages
- utilizing the victim’s system for DDoS attacks
- deploying ransomware to encrypt the victim’s files and demand a ransom
Malware Types
Viruses
These notorious forms of malware are designed to infiltrate and multiply within host files, transitioning from one system to another. They latch onto credible programs, springing into action when the infected files are triggered. Their destructive powers can range from corrupting or altering data to disrupting system functions, and even spreading through networks, inflicting widespread havoc.
Worms
Worms are autonomous malware capable of multiplying across networks without needing human intervention. They exploit network weaknesses to infiltrate systems without permission. Once inside, they can either deliver damaging payloads or keep multiplying to other vulnerable devices. Worms can initiate swift and escalating infections, resulting in enormous disruption and even potential denial of service.
Trojans
Also known as Trojan Horses, these are disguised as genuine software to trick users into running them. Upon entering a system, they craft backdoors, allowing attackers to gain unauthorized control remotely. Trojans can be weaponized to pilfer sensitive data, such as passwords or financial information, and orchestrate other harmful activities on the compromised system.
Ransomware
This type of malware encrypts files on the target’s system, making them inaccessible. Attackers then demand a ransom in return for the decryption key, effectively holding the victim’s data hostage. Ransomware attacks can debilitate organizations and individuals alike, leading to severe financial and reputational harm.
Spyware
This type of malware stealthily gathers sensitive data and records user activity without the user’s consent. It can track browsing habits, record keystrokes, and capture login credentials, posing a severe risk to privacy and security. The pilfered data is often sent to remote servers for harmful purposes.
Adware
Though not as destructive, adware can still be an annoyance and a security threat. It shows uninvited and invasive advertisements on infected systems, often resulting in a poor user experience. Adware may also track user behavior and collect data for targeted advertising.
Botnets
These are networks of compromised devices, often referred to as bots or zombies, controlled by a central C2 server. Botnets can be exploited for a variety of harmful activities, including launching DDoS attacks, spreading spam, or disseminating other malware.
Rootkits
These are stealthy forms of malware designed to gain unauthorized access and control over the fundamental components of an OS. They alter system functions to conceal their presence, making them extremely challenging to spot and eliminate. Attackers can utilize rootkits to maintain prolonged access and dodge security protocols.
Backdoors/RATs
Backdoors and RATs are crafted to offer unauthorized access and control over compromised systems from remote locations. Attackers can leverage them to retain prolonged control, extract data, or instigate additional attacks.
Droppers
These are a kind of malware used to transport and install extra malicious payloads onto infected systems. They serve as a conduit for other malware, ensuring the covert installation and execution of more sophisticated threats.
Information Stealers
These are tailored to target and extract sensitive data, like login credentials, personal information, or intellectual property, for harmful purposes. This includes identity theft or selling the data on the dark web.
Malware Samples
Resources:
- VirusShare
- Hybrid Analysis
- TheZoo
- Malware-Traffic-Analysis.net
- VirusTotal
- ANY.RUN
- Contagio Malware Dump
- VX Underground
Malware/Evidence Acquisition
When it comes to gathering evidence during a DFIR investigation or incident response, having the right tools to perform disk imaging and memory acquisition is crucial.
Disk Imaging
- FTK Imager
- OSFClone
- DD and DCFLDD (command-line tools)
Memory Acquisition
Other Evidence Acquisition
Malware Analysis Definition, Purpose, & Common Activities
The process of comprehending the behavior and inner workings of malware is known as Malware Analysis, a crucial aspect of cybersecurity that aids in understanding the threat posed by malicious software and devising effective countermeasures.
Malware analysis serves several pivotal purposes, such as:
- Detection and Classification: Through analyzing malware, you can identify and categorize different types of threats based on their unique characteristics, signatures, or patterns. This enables you to develop detection rules and empowers security professionals to gain a comprehensive understanding of the nature of the malware they encounter.
- Reverse Engineering: Malware analysis often involves the intricate process of reverse engineering the malware’s code to discern its underlying operations and employed techniques. This can unveil concealed functionalities, encryption methods, details about the C2 infrastructure, and techniques used for obfuscation and evasion.
- Behavioral Analysis: By meticulously studying the behavior of malware during execution, you gain insights into its actions, such as modifications to the file system, network communications, changes to the system registry, and attempts to exploit vulnerabilities. This analysis provides invaluable information about the impact of the malware on infected systems and assists in devising potential countermeasures.
- Threat Intelligence: Through malware analysis, threat researchers can amass critical intelligence about attackers, their tactics, techniques, and procedures (TTPs), and the malware’s origins. This valuable intelligence can be shared with the wider security community to enhance detection, prevention, and response capabilities.
The techniques employed in malware analysis encompass a wide array of methods and tools, including:
- Static Analysis: This approach involves scrutinizing the malware’s code without executing it, examining the file structure, identifying strings, searching for known signatures, and studying metadata to gain preliminary insights into the malware’s characteristics.
- Dynamic Analysis: Dynamic analysis entails executing the malware within a controlled environment, such as a sandbox or virtual machine, to observe its behavior and capture its runtime activities. This includes monitoring network traffic, system calls, file system modifications, and other interactions.
- Code Analysis: Code analysis involves disassembling or decompiling the malware’s code to understand its logic, functions, algorithms, and employed techniques. This helps in identifying concealed functionalities, exploitation methods, encryption methods, details about the C2 infrastructure, and techniques used for obfuscation and evasion. Inferentially, code analysis can also help in uncovering potential Indicators of Compromise.
- Memory Analysis: Analyzing the malware’s interactions with system memory helps in identifying injected code, hooks, or other runtime manipulations. This can be instrumental in detecting rootkits, analyzing anti-analysis techniques, or identifying malicious payloads.
- Malware Unpacking: This technique refers to the process of extracting and isolating the hidden malicious code within a piece of malware that uses packing techniques to evade detection. Packers are used by malware authors to compress, encrypt, or obfuscate their malicious code, making it harder for AV software and other security tools to identify the threat. Unpacking involves reverse-engineering these packing techniques to reveal the original, unobfuscated code for further analysis. This can allow researchers to understand the malware’s functionality, behavior, and potential impact.
Windows Internals
To conduct effective malware analysis, a profound understanding of Windows internals is essential. Windows operating systems function in two main modes:
- User Mode: This mode is where most applications and user processes operate. Applications in user mode have limited access to system resources and must interact with the OS through APIs. These processes are isolated from each other and cannot directly access hardware or critical system functions. However, in this mode, malware can still manipulate files, registry settings, network connections, and other user-accessible resources, and it may attempt to escalate privileges to gain more control over the system.
- Kernel Mode: In contrast, kernel mode is a highly privileged mode where the Windows kernel runs. The kernel has unrestricted access to system resources, hardware, and critical functions. It provides core OS services, manages system resources, and enforces security and stability. Device drivers, which facilitate communication with hardware devices, also run in kernel mode. If malware operates in kernel mode, it gains elevated control and can manipulate system behavior, conceal its presence, intercept system calls, and tamper with security mechanisms.
Windows Architecture at a High Level
The below image showcases a simplified version of Windows’ architecture.

The simplified Windows architecture comprises both user-mode and kernel-mode components, each with distinct responsibilities in the system’s functioning.
User-mode Components
… are those parts of the OS that don’t have direct access to hardware or kernel data structures. They interact with system resources through APIs and system calls.
- System Support Processes: These are essential components that provide crucial functionalities and services such as logon processes (winlogon.exe), Session Manager (smss.exe), and Service Control Manager (services.exe). These aren’t Windows services, but they are necessary for the proper functioning of the system.
- Service Processes: These processes host Windows services like the Windows Update Service, Task Scheduler, and Print Spooler services. They usually run in the background, executing tasks according to their configuration and parameters.
- User Applications: These are the processes created by user programs, including both 32-bit and 64-bit applications. They interact with the OS through APIs provided by Windows. These API calls get redirected to NTDLL.DLL, triggering a transition from user mode to kernel mode, where the system call gets executed. The result is then returned to the user-mode application, and a transition back to user mode occurs.
- Environment Subsystems: These components are responsible for providing execution environments for specific types of applications or processes. They include the Win32 Subsystem, POSIX, and OS/2.
- Subsystem DLLs: These dynamic-link libraries translate documented functions into appropriate internal native system calls, primarily implemented in NTDLL.DLL. Examples include kernelbase.dll, user32.dll, wininet.dll, and advapi32.dll.
Kernel-mode Components
… are those parts of the OS that have direct access to hardware and kernel data structures.
- Executive: This upper layer in kernel mode is accessed through functions from NTDLL.DLL. It consists of components like the I/O Manager, Object Manager, Security Reference Monitor, Process Manager, and others, managing the core aspects of the OS such as I/O operations, object management, security, and processes. It runs some checks first, and then passes the call to the kernel, or calls the appropriate device driver to perform the requested operation.
- Kernel: This component manages system resources, providing low-level services like thread scheduling, interrupt and exception dispatching, and multiprocessor synchronization.
- Device Driver: These software components enable the OS to interact with hardware devices. They serve as intermediaries, allowing the system to manage and control hardware and software resources.
- Hardware Abstraction Layer (HAL): This component provides an abstraction layer between the hardware devices and the OS. It allows software developers to interact with hardware in a consistent and platform-independent manner.
- Windowing and Graphics System (Win32k.sys): This subsystem is responsible for managing the graphical user interface and rendering visual elements on the screen.
Windows API Call Flow
Malware often utilizes Windows API calls to interact with the system and carry out malicious operations. By understanding the internal details of API functions, their parameters, and expected behavior, analysts can identify suspicious or unauthorized API usage.
Consider an example of a Windows API call flow, where a user-mode application tries to access privileged operations and system resources using the ReadProcessMemory function. This function allows a process to read the memory of a different process.

When this function is called, some required parameters are also passed to it, such as the handle to the target process, the source address to read from, a buffer in its own memory space to store the read data, and the number of bytes to read. Below is the syntax of ReadProcessMemory WINAPI function as per Microsoft documentation.
BOOL ReadProcessMemory(
  [in]  HANDLE  hProcess,
  [in]  LPCVOID lpBaseAddress,
  [out] LPVOID  lpBuffer,
  [in]  SIZE_T  nSize,
  [out] SIZE_T  *lpNumberOfBytesRead
);
ReadProcessMemory is a Windows API function that belongs to the kernel32.dll library. So, this call is invoked via the kernel32.dll module, which serves as the user-mode interface to the Windows API. Internally, the kernel32.dll module interacts with the NTDLL.DLL module, which provides a lower-level interface to the Windows kernel. Then, this function request is translated to the corresponding Native API call, which is NtReadVirtualMemory. The below screenshot from x64dbg demonstrates what this looks like in a debugger.

The NTDLL.DLL module utilizes system calls (syscalls).

The syscall instruction triggers the system call using the parameters set in the previous instructions. It transfers control from user mode to kernel mode, where the kernel performs the requested operation after validating the parameters and checking the access rights of the calling process.
If the request is authorized, the thread is transitioned from user mode into the kernel mode. The kernel maintains a table known as the System Service Descriptor Table (SSDT) or the syscall table (System Call Table), which is a data structure that contains pointers to the various system service routines. These routines are responsible for handling system calls made by user-mode applications. Each entry in the syscall table corresponds to a specific system call number, and the associated pointer points to the corresponding kernel function that implements the requested operation.
The syscall responsible for ReadProcessMemory is executed in the kernel, where the Windows memory management and process isolation mechanisms are leveraged. The kernel performs necessary validations, access checks, and memory operations to read the memory from the target process. The kernel retrieves the physical memory pages corresponding to the requested virtual addresses and copies the data into the provided buffer.
Once the kernel has finished reading the memory, it transitions the thread back to user mode and control is handed back to the original user mode application. The application can then access the data that was read from the target process’s memory and continue its execution.
Portable Executables
Windows operating systems employ the Portable Executable (PE) format to encapsulate executable programs, DLLs, and other integral system components.
PE files accommodate a wide variety of file types including executables (.exe), dynamic link libraries (.dll), kernel-mode drivers (.sys), control panel applications (.cpl), and many more. The PE file format is fundamentally a data structure containing the vital information required for the Windows OS loader to manage the executable code, effectively loading it into memory.
PE Sections
The PE structure also houses a Section Table, an element describing several sections dedicated to distinct purposes. The sections are essentially the repositories where the actual content of the file, including the data, resources utilized by the program, and the executable code, is stored. The .text section is often under scrutiny for potential artifacts related to injection attacks.
Common PE sections include:
- Text Section (.text): The hub where the executable code of the program resides.
- Data Section (.data): A storage for initialized global and static data variables.
- Read-only initialized data (.rdata): Houses read-only data such as constant values, string literals, and initialized global and static variables.
- Exception information (.pdata): A collection of function table entries utilized for exception handling.
- BSS Section (.bss): Holds uninitialized global and static data variables.
- Resource Section (.rsrc): Safeguards resources such as images, icons, strings, and version information.
- Import Section (.idata): Details about functions imported from other DLLs.
- Export Section (.edata): Information about functions exported by the executable.
- Relocation Section (.reloc): Details for relocating the executable’s code and data when loaded at a different memory address.
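The section table can also be read programmatically. The following is a minimal sketch, assuming a well-formed PE image held in memory as bytes; the function name `list_pe_sections` is illustrative and not part of any library. It follows the pointer stored at offset 0x3C of the DOS header to the PE signature, reads the section count from the COFF header, and then walks the 40-byte section headers.

```python
import struct

def list_pe_sections(data):
    """Return (name, virtual_size, raw_size) for each section header of a PE image.

    Illustrative helper: assumes `data` holds a complete, well-formed PE file.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: 4-byte little-endian offset of the PE signature, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header starts right after the signature:
    # NumberOfSections at +6, SizeOfOptionalHeader at +20 (relative to pe_offset)
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    # The section table follows the optional header
    table = pe_offset + 24 + opt_size
    sections = []
    for i in range(num_sections):
        # Each section header is 40 bytes; only the first four fields are read here
        name, vsize, _vaddr, raw_size = struct.unpack_from("<8sIII", data, table + i * 40)
        sections.append((name.rstrip(b"\x00").decode("ascii", "replace"), vsize, raw_size))
    return sections
```

Calling `list_pe_sections(open(path, "rb").read())` on a sample returns one tuple per section, which is the same information that GUI tools present in their section views.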
You can visualize the sections of a portable executable using a tool like pestudio:

Processes
In the simplest terms, a process is an instance of an executing program. It represents a slice of a program’s execution in memory and consists of various resources, including memory, file handles, threads, and security contexts.

Each process is characterized by:
- A unique PID (Process Identifier): A unique PID is assigned to each process within the OS. This numeric identifier facilitates the tracking and management of the process by the OS.
- Virtual Address Space (VAS): In the Windows OS, every process is allocated its own virtual address space, offering a virtualized view of the memory for the process. The VAS is sectioned into segments, including code, data, and stack segments, allowing the process isolated memory access.
- Executable Code (Image File on Disk): The executable code, or the image file, signifies the binary executable file on the disk. It houses the instructions and resources necessary for the process to operate.
- Table of Handles to System Objects: Processes maintain a table of handles, a reference catalogue for various system objects. System objects can span files, devices, registry keys, synchronization objects, and other resources.
- Security Context (Access Token): Each process has a security context associated with it, embodied by an Access Token. This Access Token encapsulates information about the process’s security privileges, including the user account under which the process operates and the access rights granted to the process.
- One or More Threads Running in its Context: Processes consist of one or more threads, where a thread embodies a unit of execution within the process. Threads enable concurrent execution within the process and facilitate multitasking.
Dynamic-link library (DLL)
A Dynamic-link library (DLL) is a type of PE which represents “Microsoft’s implementation of the shared library concept in the Microsoft Windows OS”. DLLs expose an array of functions which can be exploited by malware.
Import Functions
- Import functions are functionalities that a binary dynamically links to from external libraries or modules during runtime. These functions enable the binary to leverage the functionalities offered by these libraries.
- During malware analysis, examining import functions may shed light on the external libraries or modules that the malware is dependent on. This information aids in identifying the APIs that the malware might interact with, and also the resources such as the file system, processes, registry etc.
- By identifying specific functions imported, it becomes possible to ascertain the actions the malware can perform, such as file operations, network communication, registry manipulation, and more.
- Import function names or hashes can serve as IOCs (Indicators of Compromise) that assist in identifying malware variants or related samples.
Below is an example of identifying process injection using DLL imports and function names:

In this diagram, the malware process (shell.exe) performs process injection to inject code into a target process (notepad.exe) using the following functions imported from the DLL kernel32.dll:
- OpenProcess: Opens a handle to the target process, providing the necessary access rights to manipulate its memory.
- VirtualAllocEx: Allocates a block of memory within the address space of the target process to store the injected code.
- WriteProcessMemory: Writes the desired code into the allocated memory block of the target process.
- CreateRemoteThread: Creates a new thread within the target process, specifying the entry point of the injected code as the starting point.
As a result, the injected code is executed within the context of the target process by the newly created remote thread. This technique allows the malware to run arbitrary code within the target process.
Note: The functions above are WINAPI (Windows API) functions.
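As a rough illustration of how this import pattern could be checked during triage, the sketch below takes a list of imported function names (however they were extracted) and reports whether all four functions of the sequence described here are present. The helper names are hypothetical. The presence of these imports is only an indicator: legitimate software such as debuggers can import the same functions, and malware may resolve APIs at runtime instead of listing them in its import table.

```python
# The four-call sequence described in the text, in the order it is used.
INJECTION_APIS = ("OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread")

def missing_injection_apis(imported_names):
    """Return the functions from INJECTION_APIS that are NOT among the imports."""
    imported = {name.lower() for name in imported_names}
    return [api for api in INJECTION_APIS if api.lower() not in imported]

def looks_like_remote_injection(imported_names):
    """True when every function of the injection sequence is imported."""
    return not missing_injection_apis(imported_names)
```

For example, a binary importing only `OpenProcess` and `CreateFileW` would not be flagged, and `missing_injection_apis` would list the three functions that are absent.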
You can examine the DLL imports of shell.exe using CFF Explorer as follows:

Export Functions
- Export functions are the functions that a binary exposes for use by other modules or applications.
- These functions provide an interface for other software to interact with the binary.
In the below screenshot, you can see an example of DLL imports and exports:
- Imports: This shows the DLLs and their functions imported by an executable Utilman.exe.
- Exports: This shows the functions exported by a DLL Kernel32.dll.

In the context of malware analysis, understanding import and export functions assists in discerning the behavior, capabilities, and interactions of the binary with external entities. It yields valuable information for threat detection, classification, and gauging the impact of the malware on the system.
Static Analysis - Linux
In the realm of malware analysis, you exercise a method called static analysis to scrutinize malware without necessitating its execution. This involves the meticulous investigation of malware’s code, data, and structural components, serving as a vital precursor for further, more detailed analysis.
Through static analysis, you endeavor to extract pivotal information which includes:
- File type
- File hash
- Strings
- Embedded elements
- Packer information
- Imports
- Exports
- Assembly code

Identifying the File Type
Your first port of call in this stage is to ascertain the rudimentary information about the malware specimen to lay the groundwork for your investigation. Given that file extensions can be manipulated and changed, your task is to devise a method to identify the actual file type you are encountering. Establishing the file type plays an integral role in static analysis, ensuring that the procedures you apply are appropriate and the results obtained are accurate.
The command for checking the file type for a file called “Ransomware.wannacry.exe” would be:
d41y@htb[/htb]$ file /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
/home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe: PE32 executable (GUI) Intel 80386, for MS Windows
You can also do the same by manually checking the header with the help of the hexdump command as follows:
d41y@htb[/htb]$ hexdump -C /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe | more
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............|
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 f8 00 00 00 |................|
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 |........!..L.!Th|
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f |is program canno|
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 |t be run in DOS |
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 |mode....$.......|
00000080 55 3c 53 90 11 5d 3d c3 11 5d 3d c3 11 5d 3d c3 |U<S..]=..]=..]=.|
00000090 6a 41 31 c3 10 5d 3d c3 92 41 33 c3 15 5d 3d c3 |jA1..]=..A3..]=.|
000000a0 7e 42 37 c3 1a 5d 3d c3 7e 42 36 c3 10 5d 3d c3 |~B7..]=.~B6..]=.|
000000b0 7e 42 39 c3 15 5d 3d c3 d2 52 60 c3 1a 5d 3d c3 |~B9..]=..R`..]=.|
000000c0 11 5d 3c c3 4a 5d 3d c3 27 7b 36 c3 10 5d 3d c3 |.]<.J]=.'{6..]=.|
000000d0 d6 5b 3b c3 10 5d 3d c3 52 69 63 68 11 5d 3d c3 |.[;..]=.Rich.]=.|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 00 00 00 00 00 00 00 00 50 45 00 00 4c 01 04 00 |........PE..L...|
00000100 cc 8e e7 4c 00 00 00 00 00 00 00 00 e0 00 0f 01 |...L............|
00000110 0b 01 06 00 00 90 00 00 00 30 38 00 00 00 00 00 |.........08.....|
00000120 16 9a 00 00 00 10 00 00 00 a0 00 00 00 00 40 00 |..............@.|
00000130 00 10 00 00 00 10 00 00 04 00 00 00 00 00 00 00 |................|
00000140 04 00 00 00 00 00 00 00 00 b0 66 00 00 10 00 00 |..........f.....|
00000150 00 00 00 00 02 00 00 00 00 00 10 00 00 10 00 00 |................|
00000160 00 00 10 00 00 10 00 00 00 00 00 00 10 00 00 00 |................|
00000170 00 00 00 00 00 00 00 00 e0 a1 00 00 a0 00 00 00 |................|
00000180 00 00 31 00 54 a4 35 00 00 00 00 00 00 00 00 00 |..1.T.5.........|
00000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
On a Windows system, the presence of the ASCII string MZ (bytes 4d 5a) at the start of a file denotes an executable file. MZ stands for Mark Zbikowski, a key architect of MS-DOS.
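The manual header check can be scripted. The following sketch, with the hypothetical function name `identify_pe`, looks for the MZ signature and then follows the 4-byte offset stored at 0x3c (f8 00 00 00 in the dump above) to confirm the PE signature (50 45 00 00 at offset 0xf8 in the dump). It assumes the caller passes in enough leading bytes of the file to cover that offset.

```python
import struct

def identify_pe(header):
    """Classify the leading bytes of a file by its MZ and PE signatures."""
    if len(header) < 0x40 or header[:2] != b"MZ":
        return "not an MZ executable"
    # Offset 0x3C of the DOS header stores where the PE signature starts
    (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
    if header[pe_offset:pe_offset + 4] == b"PE\x00\x00":
        return "PE executable"
    return "MZ executable without a PE header"
```

A typical call would read the first few kilobytes of the sample, for example `identify_pe(open(path, "rb").read(4096))`, so the result does not depend on the file extension.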
Malware Fingerprinting
At this stage, your mission is to create a unique identifier for the malware sample. This typically takes the form of a cryptographic hash - MD5, SHA1, or SHA256.
Fingerprinting is employed for numerous purposes, encompassing:
- Identification and tracking of malware samples
- Scanning an entire system for the presence of identical malware
- Confirmation of previous encounters and analyses of the same malware
- Sharing with stakeholders as IoC or as part of threat intelligence reports
As an illustration, to check the MD5 file hash of the abovementioned malware the command would be the following.
d41y@htb[/htb]$ md5sum /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
db349b97c37d22f5ea1d1841e3c89eb4 /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
To check the SHA256 file hash of the abovementioned malware the command would be the following.
d41y@htb[/htb]$ sha256sum /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
24d004a104d4d54034dbcffc2a4b19a11f39008a575aa614ea04703480b1022c /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
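When many samples need fingerprinting, the same hashes can be computed in a script. This is a minimal sketch using Python's standard hashlib module; the function name `fingerprint` and its parameters are illustrative. It reads the file in chunks so that large samples are not loaded into memory at once.

```python
import hashlib

def fingerprint(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Return {algorithm: hex digest} for the file at `path`."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The returned digests should match the output of md5sum and sha256sum for the same file, since all of them hash the raw file contents.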
File Hash Lookup
The ensuing step involves checking the file hash produced in the prior step against online malware scanners and sandboxes such as Cuckoo sandbox. For instance, VirusTotal, an online malware scanning engine, which collaborates with various AV vendors, allows you to search for the file hash. This step aids you in comparing your result with existing knowledge about the malware sample.
The following image displays the results from VirusTotal after the SHA256 file hash of the abovementioned malware was submitted.

Even though a file hash like MD5, SHA1, or SHA256 is valuable for identifying identical samples with disparate names, it falls short when identifying similar malware samples. This is primarily because a malware author can alter the file hash value by making minor modifications to the code and recompiling it.
Nonetheless, there exist techniques that can aid in identifying similar samples:
Import Hashing (IMPHASH)
IMPHASH, an abbreviation for “Import Hash”, is a hash calculated from the import functions of a Windows Portable Executable file. Its algorithm functions by first converting all imported DLL and function names to lowercase. Following this, each DLL name is fused together with its function name, and the resulting entries are joined into a single string in the order in which they appear in the import table. Finally, an MD5 hash is generated from the resulting string. Therefore, two PE files with identical import functions, in the same sequence, will share an IMPHASH value.
You can find the IMPHASH in the “Details” tab of the VirusTotal results.

Note that you can also use the pefile Python module to compute the IMPHASH of a file as follows.
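# imphash_calc.py
# Print the import hash (IMPHASH) of the PE file given as the first argument.
import sys
import pefile

pe_file = sys.argv[1]
pe = pefile.PE(pe_file)       # parse the PE headers and import table
imphash = pe.get_imphash()    # MD5 over the normalized list of imports
print(imphash)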
To check the IMPHASH of the abovementioned WannaCry malware the command would be the following.
d41y@htb[/htb]$ python3 imphash_calc.py /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
9ecee117164e0b870a53dd187cdd7174
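To make the algorithm more concrete, here is a simplified sketch of the computation using only the standard library. It is an approximation for illustration, not a replacement for pefile: the function name `imphash_from_imports` is hypothetical, the input is assumed to be an already-extracted list of (DLL name, function name) pairs, and imports by ordinal are not handled. The pairs are kept in import-table order, which is why two files need the same imports in the same sequence to share a value.

```python
import hashlib

def imphash_from_imports(imports):
    """Compute an IMPHASH-style MD5 from ordered (dll_name, function_name) pairs.

    Simplified sketch: ignores ordinal-only imports.
    """
    entries = []
    for dll, func in imports:
        lib = dll.lower()
        # Drop a common library extension so only the module name remains
        for ext in (".dll", ".ocx", ".sys"):
            if lib.endswith(ext):
                lib = lib[: -len(ext)]
                break
        entries.append(f"{lib}.{func.lower()}")
    # Entries are joined with commas in their original order, then hashed
    return hashlib.md5(",".join(entries).encode()).hexdigest()
```

Because names are lowercased before hashing, differences in letter case do not change the result, whereas reordering the imports does.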
Fuzzy Hashing (SSDEEP)
Fuzzy Hashing (SSDEEP), also referred to as context-triggered piecewise hashing (CTPH), is a hashing technique designed to compute a hash value indicative of content similarity between two files. This technique dissects a file into smaller pieces, with the piece boundaries determined by the content itself rather than by fixed offsets, and calculates a hash for each piece. The resulting hash values are then consolidated to generate the final fuzzy hash.
The SSDEEP algorithm allocates more weight to longer sequences of common blocks, making it highly effective in identifying files that have undergone minor modifications, or are similar but not identical, such as different variations of a malicious sample.
You can find the SSDEEP hash of a malware in the “Details” tab of the VirusTotal results.
You can also use the ssdeep command to calculate the SSDEEP hash of a file. To check the SSDEEP hash of the abovementioned WannaCry malware the command would be the following.
d41y@htb[/htb]$ ssdeep /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
ssdeep,1.1--blocksize:hash:hash,filename
98304:wDqPoBhz1aRxcSUDk36SAEdhvxWa9P593R8yAVp2g3R:wDqPe1Cxcxk3ZAEUadzR8yc4gB,"/home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe"

The command line arguments -pb can be used to initiate matching mode in SSDEEP.
d41y@htb[/htb]$ ssdeep -pb *
potato.exe matches svchost.exe (99)
svchost.exe matches potato.exe (99)
-p denotes Pretty matching mode, and -b is used to display only the file names, excluding full paths.
In the example above, SSDEEP reported a match score of 99 (on a scale of 0 to 100) between two malware samples, indicating that they are nearly identical.
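The idea behind piecewise hashing can be illustrated with a toy implementation. The sketch below is not the SSDEEP algorithm and its output is not compatible with the ssdeep tool; the window size, block size, trigger rule, and piece hash are all simplifying assumptions. It ends a piece whenever a value computed over the last few bytes hits a trigger condition, and emits one character per piece, so a local change affects only the characters of the pieces it touches.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7                  # number of trailing bytes used for the trigger (assumption)
FNV_OFFSET = 0x811C9DC5     # FNV-1a 32-bit offset basis
FNV_PRIME = 0x01000193      # FNV-1a 32-bit prime

def toy_piecewise_hash(data, block_size=64):
    """Return a toy fuzzy signature: one base64 character per content-defined piece."""
    window = []             # the last WINDOW bytes seen
    piece_hash = FNV_OFFSET # hash of the current piece
    pending = False         # True while the current piece holds unhashed-out bytes
    signature = []
    for byte in data:
        piece_hash = ((piece_hash ^ byte) * FNV_PRIME) & 0xFFFFFFFF
        pending = True
        window.append(byte)
        if len(window) > WINDOW:
            window.pop(0)
        # Trigger: the sum over the trailing window decides where a piece ends
        if sum(window) % block_size == block_size - 1:
            signature.append(B64[piece_hash % 64])
            piece_hash = FNV_OFFSET
            pending = False
    if pending:
        # Emit a final character for the trailing, untriggered piece
        signature.append(B64[piece_hash % 64])
    return "".join(signature)
```

Appending bytes to a file leaves every completed piece untouched in this scheme, so the signature of the extended file begins with the same characters. That shared prefix is the property a comparison step can then turn into a similarity score.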
Section Hashing (Hashing PE Sections)
Section Hashing (hashing PE sections) is a powerful technique that allows analysts to identify sections of a PE file that have been modified. This can be particularly useful for identifying minor variations in malware samples, a common tactic employed by attackers to evade detection.
The Section Hashing technique works by calculating the cryptographic hash of each of these sections. When comparing two PE files, if the hash of corresponding sections in the two files matches, it suggests that the particular section has not been modified between the two versions of the file.
By applying section hashing, security analysts can identify parts of a PE file that have been tampered with or altered. This can help identify similar malware samples, even if they have been slightly modified to evade traditional signature-based detection methods.
Tools such as pefile in Python can be used to perform section hashing. In Python, for example, you can use the pefile module to access and hash the data in individual sections of a PE file as follows.
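# section_hashing.py
import sys
import pefile

# Path to the PE file is taken from the first command-line argument
pe_file = sys.argv[1]
pe = pefile.PE(pe_file)

# Print the MD5 and SHA256 hash of every section in the PE file
for section in pe.sections:
    print(section.Name, "MD5 hash:", section.get_hash_md5())
    print(section.Name, "SHA256 hash:", section.get_hash_sha256())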
Remember that while section hashing is a powerful technique, it is not foolproof. Malware authors might employ tactics like section name obfuscation or dynamically generating section names to try and bypass this kind of analysis.
d41y@htb[/htb]$ python3 section_hashing.py /home/htb-student/Samples/MalwareAnalysis/Ransomware.wannacry.exe
b'.text\x00\x00\x00' MD5 hash: c7613102e2ecec5dcefc144f83189153
b'.text\x00\x00\x00' SHA256 hash: 7609ecc798a357dd1a2f0134f9a6ea06511a8885ec322c7acd0d84c569398678
b'.rdata\x00\x00' MD5 hash: d8037d744b539326c06e897625751cc9
b'.rdata\x00\x00' SHA256 hash: 532e9419f23eaf5eb0e8828b211a7164cbf80ad54461bc748c1ec2349552e6a2
b'.data\x00\x00\x00' MD5 hash: 22a8598dc29cad7078c291e94612ce26
b'.data\x00\x00\x00' SHA256 hash: 6f93fb1b241a990ecc281f9c782f0da471628f6068925aaf580c1b1de86bce8a
b'.rsrc\x00\x00\x00' MD5 hash: 12e1bd7375d82cca3a51ca48fe22d1a9
b'.rsrc\x00\x00\x00' SHA256 hash: 1efe677209c1284357ef0c7996a1318b7de3836dfb11f97d85335d6d3b8a8e42
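To compare two samples, the per-section hashes can be collected into dictionaries and diffed. The following is a minimal sketch and not part of the original script: the helper names `clean_name` and `compare_section_hashes` are hypothetical, and the digests for `sample_b` are made up for illustration. It assumes each input maps a cleaned section name to a hex digest like those printed above.

```python
def clean_name(raw_name: bytes) -> str:
    """Strip the NUL padding that pads section names to 8 bytes."""
    return raw_name.rstrip(b"\x00").decode("ascii", errors="replace")


def compare_section_hashes(first: dict, second: dict) -> dict:
    """Classify each section as unchanged, modified, or present in one file only."""
    result = {"unchanged": [], "modified": [], "only_first": [], "only_second": []}
    for name in sorted(set(first) | set(second)):
        if name not in second:
            result["only_first"].append(name)
        elif name not in first:
            result["only_second"].append(name)
        elif first[name] == second[name]:
            result["unchanged"].append(name)
        else:
            result["modified"].append(name)
    return result


# sample_a uses MD5 digests from the output above; sample_b is invented
# (its .data digest is a placeholder, not a real measurement).
sample_a = {
    clean_name(b".text\x00\x00\x00"): "c7613102e2ecec5dcefc144f83189153",
    clean_name(b".data\x00\x00\x00"): "22a8598dc29cad7078c291e94612ce26",
}
sample_b = {
    ".text": "c7613102e2ecec5dcefc144f83189153",
    ".data": "00000000000000000000000000000000",
}
print(compare_section_hashes(sample_a, sample_b))
```

Keying on the section name is a simplification. Since names can be obfuscated, comparing by section index or by hash value alone may be more robust in practice.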
String Analysis
In this phase, your objective is to extract strings (ASCII & Unicode) from a binary. Strings can furnish clues and valuable insight into the functionality of the malware. Occasionally, you can unearth unique embedded strings in a malware sample, such as:
- Embedded filenames
- IP addresses or domain names
- Registry paths or keys
- Windows API functions
- Command-line arguments
- Unique information that might hint at a particular threat actor
The Linux strings command can be deployed to display the strings contained within malware. For instance, the command below will reveal strings for a ransomware sample named dharma_sample.exe.
d41y@htb[/htb]$ strings -n 15 /home/htb-student/Samples/MalwareAnalysis/dharma_sample.exe
!This program cannot be run in DOS mode.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>@@@?456789:;<=@@@@@@@
!"#$%&'()*+,-./0123@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
WaitForSingleObject
InitializeCriticalSectionAndSpinCount
LeaveCriticalSection
EnterCriticalSection
C:\crysis\Release\PDB\payload.pdb
0123456789ABCDEF
-n sets the minimum length of a printable sequence; with -n 15, only strings of at least 15 characters are printed.
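What a strings-style tool does can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not a reimplementation of the strings command: `extract_strings` is a hypothetical helper, the "Unicode" handling only covers basic Latin characters encoded as UTF-16LE, and the sample bytes are constructed for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters, each followed by a NUL byte (UTF-16LE basic Latin).
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found


# Constructed sample: one ASCII string and one UTF-16LE string between junk bytes.
sample = (
    b"\x00\x01MZ\x90"
    + b"C:\\crysis\\Release\\PDB\\payload.pdb"
    + b"\x00\xff"
    + "lsass.exe".encode("utf-16-le")
    + b"\x00\x00"
)
print(extract_strings(sample))
print(extract_strings(sample, min_len=15))  # Comparable to strings -n 15
```

Raising `min_len` trades recall for less noise, which is the same trade-off the -n option exposes.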
Occasionally, string analysis can facilitate the linkage of a malware sample to a specific threat group if significant similarities are identified. For example, a string containing a PDB path, such as C:\crysis\Release\PDB\payload.pdb in the output above, was used to link the malware sample to the Dharma/Crysis family of ransomware.
It should be noted that another string analysis solution exists called FLOSS. FLOSS, short for “FireEye Labs Obfuscated String Solver”, is a tool to automatically deobfuscate strings in malware. It’s designed to supplement the use of traditional string tools, like the strings command in Unix-based systems, which can miss obfuscated strings that are commonly used by malware to evade detection.
For instance, the command below will reveal strings for a ransomware sample named dharma_sample.exe.
d41y@htb[/htb]$ floss /home/htb-student/Samples/MalwareAnalysis/dharma_sample.exe
INFO: floss: extracting static strings...
finding decoding function features: 100%|███████████████████████████████████████| 238/238 [00:00<00:00, 838.37 functions/s, skipped 5 library functions (2%)]
INFO: floss.stackstrings: extracting stackstrings from 223 functions
INFO: floss.results: %sh(
extracting stackstrings: 100%|████████████████████████████████████████████████████████████████████████████████████| 223/223 [00:01<00:00, 133.51 functions/s]
INFO: floss.tightstrings: extracting tightstrings from 10 functions...
extracting tightstrings from function 0x4065e0: 100%|████████████████████████████████████████████████████████████████| 10/10 [00:01<00:00, 5.91 functions/s]
INFO: floss.string_decoder: decoding strings
INFO: floss.results: EEED
INFO: floss.results: EEEDnnn
INFO: floss.results: uOKm
INFO: floss.results: %sh(
INFO: floss.results: uBIA
INFO: floss.results: uBIA
INFO: floss.results: \t\t\t\t\t\t\t\t
emulating function 0x405840 (call 4/9): 100%|████████████████████████████████████████████████████████████████████████| 25/25 [00:11<00:00, 2.19 functions/s]
INFO: floss: finished execution after 23.56 seconds
FLARE FLOSS RESULTS (version v2.0.0-0-gdd9bea8)
+------------------------+------------------------------------------------------------------------------------+
| file path | /home/htb-student/Samples/MalwareAnalysis/dharma_sample.exe |
| extracted strings | |
| static strings | 720 |
| stack strings | 1 |
| tight strings | 0 |
| decoded strings | 7 |
+------------------------+------------------------------------------------------------------------------------+
------------------------------
| FLOSS STATIC STRINGS (720) |
------------------------------
-----------------------------
| FLOSS ASCII STRINGS (716) |
-----------------------------
!This program cannot be run in DOS mode.
Rich
.text
`.rdata
@.data
9A s
9A$v
A +B$
---SNIP---
+o*7
0123456789ABCDEF
------------------------------
| FLOSS UTF-16LE STRINGS (4) |
------------------------------
jjjj
%sh(
ssbss
0123456789ABCDEF
---------------------------
| FLOSS STACK STRINGS (1) |
---------------------------
%sh(
---------------------------
| FLOSS TIGHT STRINGS (0) |
---------------------------
-----------------------------
| FLOSS DECODED STRINGS (7) |
-----------------------------
EEED
EEEDnnn
uOKm
%sh(
uBIA
uBIA
\t\t\t\t\t\t\t\t
Unpacking UPX-packed Malware
In your static analysis, you might stumble upon a malware sample that’s been compressed or obfuscated using a technique referred to as packing. Packing serves several purposes:
- It obfuscates the code, making it more challenging to discern its structure or functionality.
- It reduces the size of the executable, making it quicker to transfer or less conspicuous.
- It confounds security researchers by hindering traditional reverse engineering attempts.
This can impair string analysis because the references to strings are typically obscured or eliminated. Packing also replaces or camouflages conventional PE sections with a compact loader stub, which retrieves the original code from a compressed data section. As a result, the malware file becomes both smaller and more difficult to analyze, as the original code isn’t directly observable.
A popular packer used in many malware variants is the Ultimate Packer for Executables (UPX).
First see what happens when you run the strings command on a UPX-packed malware sample named credential_stealer.exe.
d41y@htb[/htb]$ strings /home/htb-student/Samples/MalwareAnalysis/packed/credential_stealer.exe
!This program cannot be run in DOS mode.
UPX0
UPX1
UPX2
3.96
UPX!
8MZu
HcP<H
VDgxt
$ /uX
OAUATUWVSH
%0rv
o?H9
c`fG
[^_]A\A]
> -P
fo{Wnl
c9"^$!=
v/7>
07ZC
_L$AAl
mug.%(
#8%,X
e]'^
---SNIP---
Observe the strings that include “UPX”, and take note that the remainder of the output doesn’t yield any valuable information regarding the functionality of the malware.
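The same observation can be automated as a quick triage check. The sketch below simply looks for the marker strings seen in the output above; `find_upx_markers` and `looks_upx_packed` are hypothetical helper names. Treat the result as a hint only, because these markers can be renamed or stripped by a malware author and may also occur by coincidence.

```python
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX2", b"UPX!")


def find_upx_markers(data: bytes) -> list:
    """Return the UPX marker strings that occur anywhere in the raw bytes."""
    return [marker.decode("ascii") for marker in UPX_MARKERS if marker in data]


def looks_upx_packed(path: str) -> bool:
    """Heuristic only: True if the file contains any of the marker strings."""
    with open(path, "rb") as handle:
        return bool(find_upx_markers(handle.read()))


# Constructed bytes that imitate the start of a packed file.
print(find_upx_markers(b"MZ\x90\x00UPX0\x00\x00UPX1\x00\x003.96\x00UPX!"))
```

A file flagged this way is a candidate for the unpacking step that follows.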
You can unpack the malware using the UPX tool with the following command.
d41y@htb[/htb]$ upx -d -o unpacked_credential_stealer.exe credential_stealer.exe
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2020
UPX 3.96 Markus Oberhumer, Laszlo Molnar & John Reiser Jan 23rd 2020
File size Ratio Format Name
-------------------- ------ ----------- -----------
16896 <- 8704 51.52% win64/pe unpacked_credential_stealer.exe
Unpacked 1 file.
Now run the strings command on the unpacked sample.
d41y@htb[/htb]$ strings unpacked_credential_stealer.exe
!This program cannot be run in DOS mode.
.text
P`.data
.rdata
`@.pdata
0@.xdata
0@.bss
.idata
.CRT
.tls
---SNIP---
AVAUATH
@A\A]A^
SeDebugPrivilege
SE Debug Privilege is adjusted
lsass.exe
Searching lsass PID
Lsass PID is: %lu
Error is - %lu
lsassmem.dmp
LSASS Memory is dumped successfully
Err 2: %lu
Unknown error
Argument domain error (DOMAIN)
Overflow range error (OVERFLOW)
Partial loss of significance (PLOSS)
Total loss of significance (TLOSS)
The result is too small to be represented (UNDERFLOW)
Argument singularity (SIGN)
_matherr(): %s in %s(%g, %g) (retval=%g)
Mingw-w64 runtime failure:
Address %p has no image-section
VirtualQuery failed for %d bytes at address %p
VirtualProtect failed with code 0x%x
Unknown pseudo relocation protocol version %d.
Unknown pseudo relocation bit size %d.
.pdata
AdjustTokenPrivileges
LookupPrivilegeValueA
OpenProcessToken
MiniDumpWriteDump
CloseHandle
CreateFileA
CreateToolhelp32Snapshot
DeleteCriticalSection
EnterCriticalSection
GetCurrentProcess
GetCurrentProcessId
GetCurrentThreadId
GetLastError
GetStartupInfoA
GetSystemTimeAsFileTime
GetTickCount
InitializeCriticalSection
LeaveCriticalSection
OpenProcess
Process32First
Process32Next
QueryPerformanceCounter
RtlAddFunctionTable
RtlCaptureContext
RtlLookupFunctionEntry
RtlVirtualUnwind
SetUnhandledExceptionFilter
Sleep
TerminateProcess
TlsGetValue
UnhandledExceptionFilter
VirtualProtect
VirtualQuery
__C_specific_handler
__getmainargs
__initenv
__iob_func
__lconv_init
__set_app_type
__setusermatherr
_acmdln
_amsg_exit
_cexit
_fmode
_initterm
_onexit
abort
calloc
exit
fprintf
free
fwrite
malloc
memcpy
printf
puts
signal
strcmp
strlen
strncmp
vfprintf
ADVAPI32.dll
dbghelp.dll
KERNEL32.DLL
msvcrt.dll
Now, you observe a more comprehensive output that includes the actual strings present in the sample.
Static Analysis - Windows
Identifying the File Type
Your first port of call in this stage is to ascertain the rudimentary information about the malware specimen to lay the groundwork for your investigation. Given that file extensions can be manipulated and changed, your task is to devise a method to identify the actual file type you are encountering. Establishing the file type plays an integral role in static analysis, ensuring that the procedures you apply are appropriate and the results obtained are accurate.
You can use a solution like CFF Explorer to check the file type of malware as follows.

On a Windows system, the presence of the ASCII string MZ at the start of a file denotes an executable file. MZ stands for Mark Zbikowski, a key architect of MS-DOS.
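The MZ check is easy to script when no GUI tool is at hand. The following is a minimal sketch; `is_pe_file` is a hypothetical helper. Beyond what the text above describes, it also follows the 4-byte offset stored at 0x3C of the DOS header to confirm the PE signature, which is part of the PE format rather than something this module covers in detail.

```python
import struct


def is_pe_file(data: bytes) -> bool:
    """Check for the MZ magic and the PE signature that the DOS header points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE header, stored at offset 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Constructed minimal header: MZ magic, padding, e_lfanew = 0x40, then "PE\0\0".
fake_pe = b"MZ" + b"\x00" * 58 + struct.pack("<I", 0x40) + b"PE\x00\x00"
print(is_pe_file(fake_pe))
print(is_pe_file(b"\x7fELF" + b"\x00" * 64))
```

To test a real sample, read its bytes with `open(path, "rb").read()` and pass them in, regardless of the file's extension.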
Malware Fingerprinting
In this stage, your mission is to create a unique identifier for the malware sample. This typically takes the form of a cryptographic hash - MD5, SHA1, or SHA256.
Fingerprinting is employed for numerous purposes, encompassing:
- Identification and tracking of malware samples
- Scanning an entire system for the presence of identical malware
- Confirmation of previous encounters and analyses of the same malware
- Sharing with stakeholders as IoC or as part of threat intelligence reports
As an illustration, to check the MD5 file hash of the abovementioned malware you can use the Get-FileHash PowerShell cmdlet as follows.
PS C:\Users\htb-student> Get-FileHash -Algorithm MD5 C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe
Algorithm Hash Path
--------- ---- ----
MD5 DB349B97C37D22F5EA1D1841E3C89EB4 C:\Samples\MalwareAnalysis\Ra...
To check the SHA256 file hash of the abovementioned malware the command would be the following.
PS C:\Users\htb-student> Get-FileHash -Algorithm SHA256 C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe
Algorithm Hash Path
--------- ---- ----
SHA256 24D004A104D4D54034DBCFFC2A4B19A11F39008A575AA614EA04703480B1022C C:\Samples\MalwareAnalysis\Ra...
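Where PowerShell is not available, the same fingerprints can be produced with Python's standard hashlib module. This is a small sketch; `file_hashes` is a hypothetical helper name, and the upper-case output is chosen only to resemble the Get-FileHash format.

```python
import hashlib


def file_hashes(path: str, chunk_size: int = 65536) -> dict:
    """Compute MD5, SHA1 and SHA256 of a file in one pass, reading in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large samples need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest().upper() for name, digest in digests.items()}
```

Calling `file_hashes` with the path to a sample returns a dictionary with the keys `md5`, `sha1`, and `sha256`, ready to paste into a report or a hash lookup.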
File Hash Lookup
The ensuing step involves checking the file hash produced in the prior step against online malware scanners and sandboxes such as Cuckoo sandbox. For instance, VirusTotal, an online malware scanning engine that collaborates with various AV vendors, allows you to search for the file hash. This step aids you in comparing your results with existing knowledge about the malware sample.
The following image displays the results from VirusTotal after the SHA256 file hash of the abovementioned malware was submitted.

Even though a file hash like MD5, SHA1, or SHA256 is valuable for identifying identical samples with disparate names, it falls short when identifying similar malware samples. This is primarily because a malware author can alter the file hash value by making minor modifications to the code and recompiling it.
Nonetheless, there exist techniques that can aid in identifying similar samples:
IMPHASH
… is a cryptographic hash calculated from the import functions of a Windows PE file. Its algorithm first converts all imported DLL and function names to lowercase. Following this, each DLL name is fused with its function name, and the resulting entries are joined in the order in which they appear in the import table. Finally, an MD5 hash is generated from the resulting string. Therefore, two PE files with identical import functions, in the same sequence, will share an IMPHASH value.
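The steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the reference implementation: `imphash_from_imports` is a hypothetical helper, it only handles imports by name (not by ordinal), and it follows the behavior of pefile's get_imphash as commonly described, which keeps entries in import-table order. That ordering is why the sequence of imports matters for the resulting value.

```python
import hashlib


def imphash_from_imports(imports: list) -> str:
    """Compute an IMPHASH-style MD5 from (dll_name, function_name) pairs."""
    parts = []
    for dll_name, func_name in imports:
        lib = dll_name.lower()
        # Drop a trailing .dll, .sys or .ocx extension from the library name.
        stem, dot, ext = lib.rpartition(".")
        if dot and ext in ("dll", "sys", "ocx"):
            lib = stem
        parts.append(f"{lib}.{func_name.lower()}")
    # Entries are joined in the order given, so import order affects the hash.
    return hashlib.md5(",".join(parts).encode()).hexdigest()


# Example import list assembled from API names that appear in this module.
imports = [("KERNEL32.dll", "CreateFileA"), ("ADVAPI32.dll", "OpenProcessToken")]
print(imphash_from_imports(imports))
print(imphash_from_imports(list(reversed(imports))))  # Different order, different hash
```

For real samples, prefer `pe.get_imphash()` from pefile, which also resolves ordinal imports.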
You can find the IMPHASH in the “Details” tab of the VirusTotal results.

Note that you can also use the pefile Python module to compute the IMPHASH of a file as follows.
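# imphash_calc.py
import sys
import pefile

# Path to the PE file is taken from the first command-line argument
pe_file = sys.argv[1]
pe = pefile.PE(pe_file)
# get_imphash() returns the import hash as a hex string
imphash = pe.get_imphash()
print(imphash)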
To check the IMPHASH of the abovementioned WannaCry malware the command would be the following. imphash_calc.py contains the Python code above.
C:\Scripts> python imphash_calc.py C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe
9ecee117164e0b870a53dd187cdd7174
SSDEEP
… is a fuzzy hashing technique designed to compute a hash value indicative of content similarity between two files. It uses context-triggered piecewise hashing: a rolling hash over the file's content decides where one piece ends and the next begins, and a hash is calculated for each piece. The resulting hash values are then consolidated to generate the final fuzzy hash.
The SSDEEP algorithm allocates more weight to longer sequences of common blocks, making it highly effective in identifying files that have undergone minor modifications, or are similar but not identical, such as different variations of a malicious script.
You can find the SSDEEP hash of a malware in the “Details” tab of the VirusTotal results.
You can also use the ssdeep tool to calculate the SSDEEP hash of a file. To check the SSDEEP hash of the abovementioned WannaCry malware the command would be the following.
C:\Tools\ssdeep-2.14.1> ssdeep.exe C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe
ssdeep,1.1--blocksize:hash:hash,filename
98304:wDqPoBhz1aRxcSUDk36SAEdhvxWa9P593R8yAVp2g3R:wDqPe1Cxcxk3ZAEUadzR8yc4gB,"C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe"
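The header line `blocksize:hash:hash,filename` describes how each result line is laid out. The sketch below splits such a line into its fields; `parse_ssdeep_line` and `comparable` are hypothetical helpers, and the actual similarity scoring is left to the ssdeep tool. The comparability rule reflects how ssdeep works: two signatures can only be matched when their block sizes are equal or differ by a factor of two.

```python
def parse_ssdeep_line(line: str) -> dict:
    """Split one ssdeep output line into block size, two hash parts and filename."""
    signature, _, filename = line.partition(",")
    block_size, hash_one, hash_two = signature.split(":")
    return {
        "block_size": int(block_size),
        "hash_one": hash_one,
        "hash_two": hash_two,
        "filename": filename.strip().strip('"'),
    }


def comparable(first: dict, second: dict) -> bool:
    """Signatures are only comparable if block sizes match or differ by 2x."""
    a, b = first["block_size"], second["block_size"]
    return a == b or a == 2 * b or b == 2 * a


# The result line from the output above.
line = r'98304:wDqPoBhz1aRxcSUDk36SAEdhvxWa9P593R8yAVp2g3R:wDqPe1Cxcxk3ZAEUadzR8yc4gB,"C:\Samples\MalwareAnalysis\Ransomware.wannacry.exe"'
parsed = parse_ssdeep_line(line)
print(parsed["block_size"], parsed["filename"])
```

Checking `comparable` first explains why ssdeep reports no match for some pairs of files of very different size.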

Hashing PE Sections
… is a powerful technique that allows analysts to identify sections of a PE file that have been modified. This can be particularly useful for identifying minor variations in malware samples, a common tactic employed by attackers to evade detection.
The Section Hashing technique works by calculating the cryptographic hash of each of these sections. When comparing two PE files, if the hash of corresponding sections in the two files matches, it suggests that the particular section has not been modified between the two versions of the file.
By applying section hashing, security analysts can identify parts of a PE file that have been tampered with or altered. This can help identify similar malware samples, even if they have been slightly modified to evade traditional signature-based detection methods.
Tools such as pefile in Python can be used to perform section hashing. In Python, for example, you can use the pefile module to access and hash the data in individual sections of a PE file as follows.
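import sys
import pefile

# Path to the PE file is taken from the first command-line argument
pe_file = sys.argv[1]
pe = pefile.PE(pe_file)

# Print the MD5 and SHA256 hash of every section in the PE file
for section in pe.sections:
    print(section.Name, "MD5 hash:", section.get_hash_md5())
    print(section.Name, "SHA256 hash:", section.get_hash_sha256())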
Remember that while section hashing is a powerful technique, it is not foolproof. Malware authors might employ tactics like section name obfuscation or dynamically generating section names to try and bypass this kind of analysis.
As an illustration, to check the MD5 file hash of the abovementioned malware you can use pestudio as follows.

String Analysis
In this phase, your objective is to extract strings from a binary. Strings can furnish clues and valuable insight into the functionality of the malware. Occasionally, you can unearth unique embedded strings in a malware sample, such as:
- Embedded filenames
- IP addresses or domain names
- Registry paths or keys
- Windows API functions
- Command-line arguments
- Unique information that might hint at a particular threat actor
The Windows strings binary from Sysinternals can be deployed to display the strings contained within a malware. For instance, the command below will reveal strings for a ransomware sample named dharma_sample.exe.
C:\Users\htb-student> strings C:\Samples\MalwareAnalysis\dharma_sample.exe
Strings v2.54 - Search for ANSI and Unicode strings in binary images.
Copyright (C) 1999-2021 Mark Russinovich
Sysinternals - www.sysinternals.com
!This program cannot be run in DOS mode.
gaT
Rich
.text
`.rdata
@.data
HQh
9A s
9A$v
---SNIP---
GetProcAddress
LoadLibraryA
WaitForSingleObject
InitializeCriticalSectionAndSpinCount
LeaveCriticalSection
GetLastError
EnterCriticalSection
ReleaseMutex
CloseHandle
KERNEL32.dll
RSDS%~m
#ka
C:\crysis\Release\PDB\payload.pdb
---SNIP---
Occasionally, string analysis can facilitate the linkage of a malware sample to a specific threat group if significant similarities are identified. For example, a string containing a PDB path, such as C:\crysis\Release\PDB\payload.pdb in the output above, was used to link the malware sample to the Dharma/Crysis family of ransomware.
It should be noted that the FLOSS tool is also available for Windows OS.
The command below will reveal strings for a malware sample named shell.exe.
C:\Samples\MalwareAnalysis> floss shell.exe
INFO: floss: extracting static strings...
finding decoding function features: 100%|████████████████████████████████████████████| 85/85 [00:00<00:00, 1361.51 functions/s, skipped 0 library functions]
INFO: floss.stackstrings: extracting stackstrings from 56 functions
INFO: floss.results: AQAPRQVH1
INFO: floss.results: JJM1
INFO: floss.results: RAQH
INFO: floss.results: AXAX^YZAXAYAZH
INFO: floss.results: XAYZH
INFO: floss.results: ws232
extracting stackstrings: 100%|██████████████████████████████████████████████████████████████████████████████████████| 56/56 [00:00<00:00, 81.46 functions/s]
INFO: floss.tightstrings: extracting tightstrings from 4 functions...
extracting tightstrings from function 0x402a90: 100%|█████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 25.59 functions/s]
INFO: floss.string_decoder: decoding strings
emulating function 0x402a90 (call 1/1): 100%|███████████████████████████████████████████████████████████████████████| 22/22 [00:14<00:00, 1.51 functions/s]
INFO: floss: finished execution after 25.20 seconds
FLARE FLOSS RESULTS (version v2.3.0-0-g037fc4b)
+------------------------+------------------------------------------------------------------------------------+
| file path | shell.exe |
| extracted strings | |
| static strings | 254 |
| stack strings | 6 |
| tight strings | 0 |
| decoded strings | 0 |
+------------------------+------------------------------------------------------------------------------------+
──────────────────────
FLOSS STATIC STRINGS
──────────────────────
+-----------------------------------+
| FLOSS STATIC STRINGS: ASCII (254) |
+-----------------------------------+
!This program cannot be run in DOS mode.
.text
P`.data
.rdata
`@.pdata
0@.xdata
0@.bss
.idata
.CRT
.tls
8MZu
HcP<H
D$ H
AUATUWVSH
D$ L
---SNIP---
C:\Windows\System32\notepad.exe
Message
Connection sent to C2
[-] Error code is : %lu
AQAPRQVH1
JJM1
RAQH
AXAX^YZAXAYAZH
XAYZH
ws2_32
PPM1
APAPH
WWWM1
VPAPAPAPI
Windows-Update/7.6.7600.256 %s
1Lbcfr7sAHTD9CgdQo3HTMTkV8LK4ZnX71
open
SOFTWARE\Microsoft\Windows\CurrentVersion\Run
WindowsUpdater
---SNIP---
TEMP
svchost.exe
%s\%s
http://ms-windows-update.com/svchost.exe
45.33.32.156
Sandbox detected
iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com
SOFTWARE\VMware, Inc.\VMware Tools
InstallPath
C:\Program Files\VMware\VMware Tools\
Failed to open the registry key.
Unknown error
Argument domain error (DOMAIN)
Overflow range error (OVERFLOW)
Partial loss of significance (PLOSS)
Total loss of significance (TLOSS)
The result is too small to be represented (UNDERFLOW)
Argument singularity (SIGN)
_matherr(): %s in %s(%g, %g) (retval=%g)
Mingw-w64 runtime failure:
Address %p has no image-section
VirtualQuery failed for %d bytes at address %p
VirtualProtect failed with code 0x%x
Unknown pseudo relocation protocol version %d.
Unknown pseudo relocation bit size %d.
.pdata
RegCloseKey
RegOpenKeyExA
RegQueryValueExA
RegSetValueExA
CloseHandle
CreateFileA
CreateProcessA
CreateRemoteThread
DeleteCriticalSection
EnterCriticalSection
GetComputerNameA
GetCurrentProcess
GetCurrentProcessId
GetCurrentThreadId
GetLastError
GetStartupInfoA
GetSystemTimeAsFileTime
GetTickCount
InitializeCriticalSection
LeaveCriticalSection
OpenProcess
QueryPerformanceCounter
RtlAddFunctionTable
RtlCaptureContext
RtlLookupFunctionEntry
RtlVirtualUnwind
SetUnhandledExceptionFilter
Sleep
TerminateProcess
TlsGetValue
UnhandledExceptionFilter
VirtualAllocEx
VirtualProtect
VirtualQuery
WriteFile
WriteProcessMemory
__C_specific_handler
__getmainargs
__initenv
__iob_func
__lconv_init
__set_app_type
__setusermatherr
_acmdln
_amsg_exit
_cexit
_fmode
_initterm
_onexit
_vsnprintf
abort
calloc
exit
fprintf
free
fwrite
getenv
malloc
memcpy
printf
puts
signal
sprintf
strcmp
strlen
strncmp
vfprintf
ShellExecuteA
MessageBoxA
InternetCloseHandle
InternetOpenA
InternetOpenUrlA
InternetReadFile
WSACleanup
WSAStartup
closesocket
connect
freeaddrinfo
getaddrinfo
htons
inet_addr
socket
ADVAPI32.dll
KERNEL32.dll
msvcrt.dll
SHELL32.dll
USER32.dll
WININET.dll
WS2_32.dll
+------------------------------------+
| FLOSS STATIC STRINGS: UTF-16LE (0) |
+------------------------------------+
─────────────────────
FLOSS STACK STRINGS
─────────────────────
AQAPRQVH1
JJM1
RAQH
AXAX^YZAXAYAZH
XAYZH
ws232
─────────────────────
FLOSS TIGHT STRINGS
─────────────────────
───────────────────────
FLOSS DECODED STRINGS
───────────────────────
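With several hundred strings in the output, a first pass for network and registry indicators can be scripted. The following is a rough triage sketch: `IOC_PATTERNS` and `triage_strings` are hypothetical names, the regular expressions are deliberately simple (the IPv4 pattern does not validate octet ranges and can match version numbers), and the input list is a small excerpt of the static strings shown above.

```python
import re

IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"\bhttps?://[^\s\"']+", re.IGNORECASE),
    "registry": re.compile(r"\bSOFTWARE\\[^\r\n]+", re.IGNORECASE),
}


def triage_strings(strings: list) -> dict:
    """Group regex hits from a list of extracted strings by indicator type."""
    hits = {name: [] for name in IOC_PATTERNS}
    for text in strings:
        for name, pattern in IOC_PATTERNS.items():
            hits[name].extend(pattern.findall(text))
    return hits


# Excerpt of the FLOSS static strings for shell.exe shown above.
excerpt = [
    "Windows-Update/7.6.7600.256 %s",
    r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run",
    "http://ms-windows-update.com/svchost.exe",
    "45.33.32.156",
    "Sandbox detected",
    r"SOFTWARE\VMware, Inc.\VMware Tools",
]
for kind, values in triage_strings(excerpt).items():
    print(kind, values)
```

Hits from a pass like this are leads to verify during dynamic analysis, not confirmed indicators.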
Unpacking UPX-packed Malware
In your static analysis, you might stumble upon a malware sample that’s been compressed or obfuscated using a technique referred to as packing. Packing serves several purposes:
- It obfuscates the code, making it more challenging to discern its structure or functionality.
- It reduces the size of the executable, making it quicker to transfer or less conspicuous.
- It confounds security researchers by hindering traditional reverse engineering attempts.
This can impair string analysis because the references to strings are typically obscured or eliminated. It also replaces or camouflages conventional PE sections with a compact loader stub, which retrieves the original code from a compressed data section. As a result, the malware file becomes both smaller and more difficult to analyze, as the original code isn’t observable.
A popular packer used in many malware variants is the Ultimate Packer for Executables (UPX).
First see what happens when you run the strings command on a UPX-packed malware sample named credential_stealer.exe.
C:\Users\htb-student> strings C:\Samples\MalwareAnalysis\packed\credential_stealer.exe
Strings v2.54 - Search for ANSI and Unicode strings in binary images.
Copyright (C) 1999-2021 Mark Russinovich
Sysinternals - www.sysinternals.com
!This program cannot be run in DOS mode.
UPX0
UPX1
UPX2
3.96
UPX!
ff.
8MZu
HcP<H
tY)
L~o
tK1
7c0
VDgxt
amE
8#v
$ /uX
OAUATUWVSH
Z6L
<=h
%0rv
o?H9
7sk
3H{
HZu
'.}
c|/
c`fG
Iq%
[^_]A\A]
> -P
fo{Wnl
c9"^$!=
;\V
%&m
')A
v/7>
07ZC
_L$AAl
mug.%(
t%n
#8%,X
e]'^
(hk
Dks
zC:
Vj<
w~5
m<6
|$PD
c(t
\3_
---SNIP---
Observe the strings that include UPX, and take note that the remainder of the output doesn’t yield any valuable information regarding the functionality of the malware.
You can unpack the malware using the UPX tool with the following command.
C:\Tools\upx\upx-4.0.2-win64> upx -d -o unpacked_credential_stealer.exe C:\Samples\MalwareAnalysis\packed\credential_stealer.exe
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2023
UPX 4.0.2 Markus Oberhumer, Laszlo Molnar & John Reiser Jan 30th 2023
File size Ratio Format Name
-------------------- ------ ----------- -----------
16896 <- 8704 51.52% win64/pe unpacked_credential_stealer.exe
Unpacked 1 file.
Now run the strings command on the unpacked sample.
C:\Tools\upx\upx-4.0.2-win64> strings unpacked_credential_stealer.exe
Strings v2.54 - Search for ANSI and Unicode strings in binary images.
Copyright (C) 1999-2021 Mark Russinovich
Sysinternals - www.sysinternals.com
!This program cannot be run in DOS mode.
.text
P`.data
.rdata
`@.pdata
0@.xdata
0@.bss
.idata
.CRT
.tls
ff.
8MZu
HcP<H
---SNIP---
D$(
D$
D$0
D$(
D$
t'H
%5T
@A\A]A^
SeDebugPrivilege
SE Debug Privilege is adjusted
lsass.exe
Searching lsass PID
Lsass PID is: %lu
Error is - %lu
lsassmem.dmp
LSASS Memory is dumped successfully
Err 2: %lu
@u@
`p@
Unknown error
Argument domain error (DOMAIN)
Overflow range error (OVERFLOW)
Partial loss of significance (PLOSS)
Total loss of significance (TLOSS)
The result is too small to be represented (UNDERFLOW)
Argument singularity (SIGN)
_matherr(): %s in %s(%g, %g) (retval=%g)
Mingw-w64 runtime failure:
Address %p has no image-section
VirtualQuery failed for %d bytes at address %p
VirtualProtect failed with code 0x%x
Unknown pseudo relocation protocol version %d.
Unknown pseudo relocation bit size %d.
.pdata
0@
00@
`E@
`E@
@v@
hy@
`y@
@p@
0v@
Pp@
AdjustTokenPrivileges
LookupPrivilegeValueA
OpenProcessToken
MiniDumpWriteDump
CloseHandle
CreateFileA
CreateToolhelp32Snapshot
DeleteCriticalSection
EnterCriticalSection
GetCurrentProcess
GetCurrentProcessId
GetCurrentThreadId
GetLastError
GetStartupInfoA
GetSystemTimeAsFileTime
GetTickCount
InitializeCriticalSection
LeaveCriticalSection
OpenProcess
Process32First
Process32Next
QueryPerformanceCounter
RtlAddFunctionTable
RtlCaptureContext
RtlLookupFunctionEntry
RtlVirtualUnwind
SetUnhandledExceptionFilter
Sleep
TerminateProcess
TlsGetValue
UnhandledExceptionFilter
VirtualProtect
VirtualQuery
__C_specific_handler
__getmainargs
__initenv
__iob_func
__lconv_init
__set_app_type
__setusermatherr
_acmdln
_amsg_exit
_cexit
_fmode
_initterm
_onexit
abort
calloc
exit
fprintf
free
fwrite
malloc
memcpy
printf
puts
signal
strcmp
strlen
strncmp
vfprintf
ADVAPI32.dll
dbghelp.dll
KERNEL32.DLL
msvcrt.dll
Now, you observe a more comprehensible output that includes the actual strings present in the sample.
Dynamic Analysis
In dynamic analysis, you observe and interpret the behavior of the malware while it is running. This is a critical contrast to static analysis, where you dissect the malware’s properties and contents without executing it. The primary goal of dynamic analysis is to document and understand the real-world impact of the malware on its host environment, making it an integral part of comprehensive malware analysis.
In executing dynamic analysis, you encapsulate the malware within a tightly controlled, monitored, and usually isolated environment to prevent any unintentional spread or damage. This environment is typically a VM to which the malware is oblivious. It believes it is interacting with a genuine system, while you have full control over its interactions and can document its behavior thoroughly.
Your dynamic analysis procedure can be broken into the following steps:
- Environment Setup: You first establish a secure and controlled environment, typically a VM, isolated from the rest of the network to prevent inadvertent contamination or propagation of the malware. The VM setup should mimic a real-world system, complete with software, applications, and network configs, that an actual user might have.
- Baseline Capture: After the environment is set up, you capture a snapshot of the system’s clean state. This includes system files, registry states, running processes, network configuration, and more. This baseline serves as a reference point to identify changes by the malware post-execution.
- Tool Deployment (Pre-Execution): To capture the activities of the malware effectively, you deploy various monitoring and logging tools. Tools such as Process Monitor (Procmon) from Sysinternals Suite are used to log system calls, file system activity, registry operations, etc. You can also employ utilities like Wireshark, tcpdump, and Fiddler for capturing network traffic, and Regshot to take before-and-after snapshots of the system registry. Finally, tools such as INetSim, FakeDNS, and FakeNet-NG are used to simulate internet services.
- Malware Execution: With your tools running and ready, you proceed to execute the malware sample in the isolated environment. During execution, the monitoring tools capture and log all activities, including process creation, file and registry modifications, network traffic, etc.
- Observation and Logging: The malware sample is allowed to execute for a sufficient duration. All the while, your monitoring tools are diligently recording its every move, which will provide you with comprehensive insight into its behavior and modus operandi.
- Analysis of Collected Data: After the malware has run its course, you halt its execution and stop the monitoring tools. You now examine the logs and data collected, comparing the system’s state to your initial baseline to identify the changes introduced by the malware.
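The baseline-and-compare idea from the steps above can be illustrated for the file system alone. This is a minimal sketch, assuming a directory tree small enough to hash in full; `snapshot` and `diff_snapshots` are hypothetical helpers, and registry, process, and network changes still require tools such as Regshot and Procmon.

```python
import hashlib
import os


def snapshot(root: str) -> dict:
    """Map each file under root (as a relative path) to the SHA256 of its contents."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            try:
                with open(full_path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                continue  # Skip files that are locked or unreadable.
            state[os.path.relpath(full_path, root)] = digest
    return state


def diff_snapshots(before: dict, after: dict) -> dict:
    """List files created, deleted, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

In use, `snapshot` would be called once before execution and once after, with both results passed to `diff_snapshots`. Taking the snapshot of a whole system drive this way is slow, so restricting `root` to directories of interest is a reasonable compromise.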
In some cases, when the malware is particularly evasive or complex, you might employ sandboxed environments for dynamic analysis. Sandboxes, such as Cuckoo Sandbox, Joe Sandbox, or FireEye’s Dynamic Threat Intelligence cloud, provide an automated, safe, and highly controlled environment for malware execution. They come equipped with numerous features for in-depth behavioral analysis and generate detailed reports regarding the malware’s network behavior, file system interaction, memory footprint, and more.
However, it’s important to remember that while sandbox environments are valuable tools, they are not foolproof. Some advanced malware can detect sandbox environments and alter their behavior accordingly, making it harder for researchers to ascertain their true nature.
Dynamic Analysis with Noriben
Noriben is a powerful tool in your dynamic analysis toolkit, essentially acting as a Python wrapper for Sysinternals ProcMon, a comprehensive system monitoring utility. It orchestrates the operation of ProcMon, refines the output, and adds a layer of malware-specific intelligence to the process. Leveraging Noriben, you can capture malware behaviors more conveniently and understand them more precisely.
To understand how Noriben empowers your dynamic analysis efforts, first quickly review ProcMon. This tool, from the Sysinternals Suite, monitors file system, Registry, and process/thread activity in real time. It combines the features of the legacy utilities Filemon and Regmon and adds filtering, advanced highlighting, and extensive event properties, making it a powerful system monitoring tool for malware analysis.
However, the volume and breadth of information that Procmon collects can be overwhelming. Without filtering and contextual analysis, sifting through this raw data becomes a considerable challenge. This is where Noriben steps in. It uses Procmon to capture system events but then filters and analyzes this data to extract meaningful information and pinpoint malicious activities.
In your dynamic malware analysis process, here’s how you employ Noriben:
- Setting Up Noriben: You initiate Noriben by launching it from the command line. The tool supports numerous command-line arguments to customize its operation. For instance, you can define the duration of data collection, specify a custom malware sample for execution, or select a personalized ProcMon configuration file.
- Launching ProcMon: Upon initiation, Noriben starts ProcMon with a predefined configuration. This configuration contains a set of filters designed to exclude normal system activity and focus on potential indicators of malicious actions.
- Executing the Malware Sample: With ProcMon running, Noriben executes the selected malware sample. During this phase, ProcMon captures all system activities, including process operations, file system changes, and registry modifications.
- Monitoring and Logging: Noriben controls the duration of monitoring, and once it concludes, it commands ProcMon to save the collected data to a CSV file and then terminates ProcMon.
- Data Analysis and Reporting: This is where Noriben shines. It processes the CSV file generated by ProcMon, applying additional filters and performing contextual analysis. Noriben identifies potentially suspicious activities and organizes them into different categories, such as file system activity, process operations, and network connections. This process results in a clear, readable report in HTML or TXT format, highlighting the behavioral traits of the analyzed malware.
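The filtering and categorization step in the last item can be illustrated with a minimal Python sketch. This is not Noriben's actual code: the column names are assumed from ProcMon's default CSV export, the sample rows are invented, and the `categorize` and `summarize` helpers are hypothetical names introduced here.

```python
import csv
import io
from collections import defaultdict

# Invented rows laid out like a ProcMon CSV export (column names are an
# assumption based on ProcMon's default column set).
SAMPLE_CSV = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"11:40:01","shell.exe","4242","Process Start","","SUCCESS",""
"11:40:01","shell.exe","4242","RegOpenKey","HKLM\SOFTWARE\Example","NAME NOT FOUND",""
"11:40:02","shell.exe","4242","WriteFile","C:\Users\Public\out.tmp","SUCCESS",""
"11:40:03","shell.exe","4242","TCP Connect","host:49712 -> 10.0.0.5:443","SUCCESS",""
"11:40:03","svchost.exe","900","ReadFile","C:\Windows\System32\x.dll","SUCCESS",""
'''


def categorize(operation):
    """Map a ProcMon operation name to a coarse report category."""
    if operation.startswith("Reg"):
        return "registry"
    if operation.startswith(("TCP", "UDP")):
        return "network"
    if operation.startswith(("Process", "Thread", "Load Image")):
        return "process"
    return "file"


def summarize(csv_text, process_name):
    """Keep only events from one process and group them by category."""
    report = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Process Name"].lower() != process_name.lower():
            continue  # drop background noise from other processes
        report[categorize(row["Operation"])].append((row["Operation"], row["Path"]))
    return dict(report)


summary = summarize(SAMPLE_CSV, "shell.exe")
for category, events in sorted(summary.items()):
    print(category, events)
```

A real report would apply far more exclusion rules than a single process-name check, but the shape is the same: read the CSV, discard noise, and group what remains.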
Noriben’s integration with YARA rules is another notable feature. You can leverage YARA rules to enhance your data filtering capabilities, allowing you to identify patterns of interest more efficiently.
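As a rough sketch, the options mentioned above (collection duration, sample to execute, custom ProcMon filter, YARA rules) could be assembled into one invocation. The flag names `-t`, `-f`, `--yara`, and `--cmd` are assumed from Noriben's usage text and should be confirmed with `python Noriben.py -h`; the paths and the `build_noriben_command` helper are hypothetical. Nothing is executed here.

```python
def build_noriben_command(sample=None, timeout=None, yara_dir=None, filter_file=None):
    """Return an argv list for a Noriben run (flag names are assumptions)."""
    argv = ["python", "Noriben.py"]
    if timeout is not None:
        argv += ["-t", str(timeout)]      # seconds of data collection
    if filter_file:
        argv += ["-f", filter_file]       # custom ProcMon configuration file
    if yara_dir:
        argv += ["--yara", yara_dir]      # folder containing YARA rules
    if sample:
        argv += ["--cmd", sample]         # sample to launch once logging starts
    return argv


# Hypothetical paths, for illustration only.
command = build_noriben_command(
    sample=r"C:\Samples\shell.exe",
    timeout=120,
    yara_dir=r"C:\Rules",
)
print(" ".join(command))
```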
Demonstration
For demonstration purposes, you conduct dynamic analysis on a malware named shell.exe.
- Launch a new Command Line interface
- Initiate Noriben as indicated
C:\Tools\Noriben-master> python .\Noriben.py
[*] Using filter file: ProcmonConfiguration.PMC
[*] Using procmon EXE: C:\ProgramData\chocolatey\bin\procmon.exe
[*] Procmon session saved to: Noriben_27_Jul_23__23_40_319983.pml
[*] Launching Procmon ...
[*] Procmon is running. Run your executable now.
[*] When runtime is complete, press CTRL+C to stop logging.
- Upon seeing the User Account Control prompt, select “Yes”
- Proceed to the malware directory and activate shell.exe
- shell.exe will identify that it is running within a sandbox; close the window it creates
- Terminate ProcMon
- In the Command Prompt running Noriben, press Ctrl+C to stop it
C:\Tools\Noriben-master> python .\Noriben.py
[*] Using filter file: ProcmonConfiguration.PMC
[*] Using procmon EXE: C:\ProgramData\chocolatey\bin\procmon.exe
[*] Procmon session saved to: Noriben_27_Jul_23__23_40_319983.pml
[*] Launching Procmon ...
[*] Procmon is running. Run your executable now.
[*] When runtime is complete, press CTRL+C to stop logging.
[*] Termination of Procmon commencing... please wait
[*] Procmon terminated
[*] Saving report to: Noriben_27_Jul_23__23_42_335666.txt
[*] Saving timeline to: Noriben_27_Jul_23__23_42_335666_timeline.csv
[*] Exiting with error code: 0: Normal exit
You’ll observe that Noriben generates a .txt report inside its directory, compiling all the behavioral information it managed to gather.

Noriben might filter out some potentially valuable information. For instance, you don’t receive any insightful data from Noriben’s report about how shell.exe recognized that it was functioning within a sandbox or VM.
Take a different approach and manually launch ProcMon using its default, more inclusive, configuration. Following this, re-run shell.exe. This might give you insights into how shell.exe detects the presence of a sandbox or VM.
Then, configure the filter as follows and press “Apply”.

Finally, navigate to the end of the results. There you can observe that shell.exe conducts sandbox or VM detection by querying the registry for the presence of VMware tools.
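The manual search through the ProcMon results can be approximated in a few lines of Python. This is a sketch only: the text does not state which key shell.exe queries, so the VMware Tools path below is a commonly cited example used as an assumption, and the marker list and `find_vm_checks` helper are hypothetical.

```python
# Substrings that often appear in registry paths probed by VM-aware malware
# (an illustrative, incomplete list).
VM_MARKERS = ("vmware", "vbox", "virtualbox")


def find_vm_checks(events):
    """Return registry events whose path mentions a virtualization product."""
    hits = []
    for process, operation, path in events:
        is_registry = operation.startswith("Reg")
        if is_registry and any(marker in path.lower() for marker in VM_MARKERS):
            hits.append((process, operation, path))
    return hits


# Invented events in (process, operation, path) form.
events = [
    ("shell.exe", "RegOpenKey", r"HKLM\SOFTWARE\VMware, Inc.\VMware Tools"),
    ("shell.exe", "CreateFile", r"C:\Windows\System32\kernel32.dll"),
    ("explorer.exe", "RegQueryValue", r"HKCU\Software\Microsoft\Windows\Shell"),
]

for hit in find_vm_checks(events):
    print(hit)
```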

Code Analysis
Reverse Engineering & Code Analysis
Reverse engineering is a process that takes you beneath the surface of executable files or compiled machine code, enabling you to decode their functionality, behavioral traits, and structure. In the absence of source code, you turn to the analysis of disassembled code instructions, also known as assembly code analysis. This deeper level of understanding helps you to uncover obscured or elusive functionalities that remain hidden even after static and dynamic analysis.
To untangle the complex web of machine code, you turn to a duo of powerful tools: disassemblers and debuggers.
- A Disassembler is your tool of choice when you wish to conduct a static analysis of the code, meaning that you need not execute the code. This type of analysis is invaluable as it helps you to understand the structure and logic of the code without activating potentially harmful functionalities. Some prime examples of disassemblers include IDA, Cutter, and Ghidra.
- A Debugger, on the other hand, serves a dual purpose. Like a disassembler, it decodes machine code into assembly instructions. Additionally, it allows you to execute code in a controlled manner, proceeding instruction by instruction, skipping to specific locations, or halting the execution flow at designated points using breakpoints. Examples of debuggers include x32dbg, x64dbg, IDA, and OllyDbg.
Take a step back and understand the challenge before you. The journey of code from human-readable high-level languages, such as C or C++, to machine code is a one-way ticket, guided by the compiler. Machine code, a binary language that computers process directly, is a cryptic narrative for human analysts. Here’s where the assembly language comes into play, acting as a bridge between you and the machine code, enabling you to decode the latter’s story.
A disassembler transforms machine code back into assembly language, presenting you with a readable sequence of instructions. Understanding assembly and its mnemonics is pivotal in dissecting the functionality of malware.
Code analysis is the process of scrutinizing and deciphering the behavior and functionality of a compiled program or binary. This involves analyzing the instructions, control flow, and data structures within the code, ultimately shedding light on the purpose, functionality, and potential indicators of compromise.
Understanding a program or a piece of malware often requires you to reverse the compilation process. This is where Disassembly comes into the picture. By converting machine code back into assembly language instructions, you end up with a set of instructions that are symbolic and mnemonic, enabling you to decode the logic and workings of the program.

Disassemblers are your allies in this process. These specialized tools take the binary code, generate the corresponding assembly instructions, and often supplement them with additional context such as memory addresses, function names, and control flow analysis. One such powerful tool is IDA, a widely used disassembler and debugger revered for its advanced analysis features. It supports multiple executable file formats and architectures, presenting a comprehensive disassembly view and potent analysis capabilities.
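To make the idea of turning bytes back into mnemonics concrete, here is a toy linear-sweep decoder in Python. It recognizes only three x86-64 encodings (sub rsp, imm8; call rel32; nop) and is nowhere near a real disassembler. The byte string is hand-assembled for illustration rather than extracted from any sample, and the `disassemble` function is a hypothetical helper.

```python
import struct


def disassemble(code, base):
    """Decode a byte string at virtual address `base`, one instruction at a time."""
    offset = 0
    lines = []
    while offset < len(code):
        addr = base + offset
        if code[offset:offset + 3] == b"\x48\x83\xec":   # REX.W 83 /5 ib
            imm = code[offset + 3]                       # small positive imm8 assumed
            text, size = f"sub rsp, {imm:X}h", 4
        elif code[offset] == 0xE8:                       # call rel32
            (rel,) = struct.unpack_from("<i", code, offset + 1)
            target = addr + 5 + rel                      # relative to the next instruction
            text, size = f"call sub_{target:X}", 5
        elif code[offset] == 0x90:
            text, size = "nop", 1
        else:
            text, size = f"db {code[offset]:02X}h", 1    # unknown byte, emit as data
        lines.append(f".text:{addr:016X} {text}")
        offset += size
    return lines


# Hand-assembled bytes: sub rsp, 28h / call +0x157 / nop
code = bytes.fromhex("4883EC28 E857010000 90")
for line in disassemble(code, 0x4014F0):
    print(line)
```

The call target is computed, not stored: the four bytes after E8 hold a signed offset from the address of the following instruction, which is one example of the context a disassembler adds on top of raw decoding.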
Code Analysis Example: shell.exe
Continue with the analysis of the shell.exe malware. Up until this point, you’ve discovered that it conducts sandbox detection, and that it includes a possible sleep mechanism - a 5-second ping delay - before executing its intended operations.
Importing a Malware Sample into the Disassembler - IDA
For the next stage in your investigation, you must scrutinize the code in IDA to ascertain its further actions and discover how to circumvent the sandbox check employed by the malware sample.
You can initiate IDA either by double-clicking the IDA shortcut or by right-clicking it and selecting Run as administrator to ensure proper access rights. At first, it will display the license information and subsequently prompt you to open a new executable for analysis.
Next, opt for New and select the shell.exe sample.

The Load a new file dialog box that pops up next is where you can select the processor architecture. Choose the correct one and click OK. By default, IDA determines the appropriate processor type.

After you hit OK, IDA will load the executable file into memory and disassemble the machine code to render the disassembled output for you. The screenshot below illustrates the different views in IDA.

Once the executable is loaded and the analysis completes, the disassembled code of the sample shell.exe will be exhibited in the main IDA-View window. You can traverse through the code using the cursor keys or scroll bar and zoom in or out using the mouse wheel or the zoom controls.
Text and Graph Views
The disassembled code is presented in two modes, namely the Graph View and the Text View. The default view is the Graph View, which provides a graphic illustration of the function’s basic blocks and their interconnections. Basic blocks are instruction sequences with a single entry and exit point. These basic blocks are symbolized as nodes in the graph view with the connections between them as edges.
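The notion of a basic block can be sketched with the classic "leaders" approach: a new block starts at the first instruction, at every jump target, and at every instruction that follows a jump. The Python below is an illustration only. It is not how IDA is implemented, and the addresses and instructions are invented.

```python
# Mnemonics that end a basic block in this sketch. Calls are deliberately
# not included, since execution normally resumes right after them.
BLOCK_ENDERS = {"jmp", "jz", "jnz", "ret"}


def basic_blocks(instructions):
    """Split (address, mnemonic, jump_target) tuples into basic blocks."""
    leaders = {instructions[0][0]}
    for index, (_, mnemonic, target) in enumerate(instructions):
        if mnemonic in BLOCK_ENDERS:
            if target is not None:
                leaders.add(target)                       # jump destination
            if index + 1 < len(instructions):
                leaders.add(instructions[index + 1][0])   # fall-through successor
    blocks, current = [], []
    for instruction in instructions:
        if instruction[0] in leaders and current:
            blocks.append(current)
            current = []
        current.append(instruction)
    if current:
        blocks.append(current)
    return blocks


# Hypothetical function body: an if/else that rejoins before returning.
program = [
    (0x10, "cmp", None),
    (0x12, "jz", 0x18),
    (0x14, "mov", None),
    (0x16, "jmp", 0x1A),
    (0x18, "xor", None),
    (0x1A, "ret", None),
]

blocks = basic_blocks(program)
for block in blocks:
    print([hex(address) for address, _, _ in block])
```

Each printed list corresponds to one node in a graph of this function; the jz and jmp targets are what would become the edges.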
To toggle between the graph and text views, simply press the spacebar button.
- The Graph View offers a pictorial representation of the program’s control flow, facilitating a better understanding of execution flow, identification of loops, conditionals, and jumps, and a visualization of how the program branches or cycles through different code paths. Each basic block of the displayed function appears as a distinct node, and the edges between nodes show where execution can go next.

- The Text view displays the assembly instructions along with their corresponding memory address. Each line in the Text view represents an instruction or a data element in the code, beginning with the section name:virtual address format (for example, .text:00000000004014F0, where the section name is .text and the virtual address is 00000000004014F0).
.text:00000000004014F0 ; =============== S U B R O U T I N E =======================================
.text:00000000004014F0
.text:00000000004014F0
.text:00000000004014F0 public start
.text:00000000004014F0 start proc near ; DATA XREF: .pdata:000000000040603C↓o
.text:00000000004014F0
.text:00000000004014F0 ; FUNCTION CHUNK AT .text:00000000004022A0 SIZE 000001B0 BYTES
.text:00000000004014F0
.text:00000000004014F0 ; __unwind { // __C_specific_handler
.text:00000000004014F0 sub rsp, 28h
.text:00000000004014F4
.text:00000000004014F4 loc_4014F4: ; DATA XREF: .xdata:0000000000407058↓o
.text:00000000004014F4 ; __try { // __except at loc_40150C
.text:00000000004014F4 mov rax, cs:off_405850
.text:00000000004014FB mov dword ptr [rax], 0
.text:0000000000401501 call sub_401650
.text:0000000000401506 call sub_401180
.text:000000000040150B nop
.text:000000000040150B ; } // starts at 4014F4
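If you copy such a listing out of IDA for note-taking or scripting, the section name:virtual address prefix is easy to split off. The following Python sketch assumes 16-digit (64-bit) addresses as in the listing above and tolerates a missing leading dot; the `parse_line` helper is a hypothetical name, not an IDA API.

```python
import re

# Section name, a colon, a 16-hex-digit address, then the rest of the line.
LINE_RE = re.compile(r"^\.?(?P<section>\w+):(?P<addr>[0-9A-Fa-f]{16})\s*(?P<rest>.*)$")


def parse_line(line):
    """Split one Text view line into section, address, code text, and comment."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    code, _, comment = match.group("rest").partition(";")
    return {
        "section": "." + match.group("section"),
        "address": int(match.group("addr"), 16),
        "text": code.strip(),
        "comment": comment.strip(),
    }


parsed = parse_line(".text:00000000004014FB mov dword ptr [rax], 0")
print(parsed["section"], hex(parsed["address"]), parsed["text"])
```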

IDA’s Text view employs arrows to signify different types of control flow instructions and jumps. Here are some commonly seen arrows and their interpretations:
- Solid Arrow (->): A solid arrow denotes a direct jump or branch instruction, indicating an unconditional shift in the program’s flow where execution moves from one location to another. This occurs when a jump or branch instruction like jmp or call is encountered.
- Dashed Arrow (-->): A dashed arrow represents a conditional jump or branch instruction, suggesting that the program’s flow might change based on a specific condition. The destination of the jump depends on the condition’s outcome. For instance, a jz instruction will trigger a jump only if a previous comparison yielded a zero value.
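The difference between the two arrow styles comes down to whether a flag decides the jump. A small Python sketch, using a deliberately short and hypothetical list of conditional mnemonics, models this along with the zero flag behavior behind jz:

```python
# A few common conditional jumps; real x86 has many more.
CONDITIONAL_JUMPS = {"jz", "jnz", "je", "jne", "jg", "jl"}


def arrow_style(mnemonic):
    """Return 'solid' for an unconditional jmp, 'dashed' for a conditional jump."""
    if mnemonic == "jmp":
        return "solid"
    if mnemonic in CONDITIONAL_JUMPS:
        return "dashed"
    return None  # not a jump covered by this sketch


def jz_taken(left, right):
    """Model `cmp left, right` followed by `jz`: taken only if the zero flag is set."""
    zero_flag = (left - right) == 0   # cmp subtracts and discards the result
    return zero_flag


print(arrow_style("jmp"), arrow_style("jz"))
print(jz_taken(5, 5), jz_taken(5, 3))
```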

By default, IDA initially exhibits the main function or the function at the program’s designated entry point. However, you have the liberty to explore and examine other functions in the graph view.