Exploring Qakbot #02: Decoding Qakbot DLL Stager: A Reverse Engineering Analysis
In the labyrinthine world of malware analysis, the Qakbot banking Trojan stands as a perennial challenge for cybersecurity experts. A recent addition to its arsenal is the "DLL Stager," a clandestine component that plays a pivotal role in the malware's operation.
This exploration delves into the enigmatic domain of the Qakbot DLL Stager from a reverse engineering perspective. We embark on a journey to unravel its inner workings, understand its purpose, and dissect the techniques it employs to execute its malevolent mission.
The Qakbot DLL Stager is a critical element in the malware's ability to infiltrate and compromise systems. It serves as the gateway for deploying additional payloads and enabling the broader scope of Qakbot's malicious objectives.
Join us as we navigate the intricacies of the Qakbot DLL Stager, decrypting its actions and uncovering its role in the broader landscape of cyber threats. This analysis not only sheds light on the specific workings of this component but also contributes to the broader understanding of malware tactics, empowering defenders to better protect their digital domains.
As we delve into the heart of this reverse engineering endeavor, be prepared to unearth the concealed strategies employed by Qakbot and, in doing so, enhance your expertise in the ever-evolving field of cybersecurity.
The first stage starts by checking whether the DLL entry point was called with DLL_PROCESS_ATTACH.
If this is the case, it performs its first initialization procedures, which include:
Create a heap object which will be used later for allocations of dynamic structs or strings.
Initialize a character set global string, which will be used for string manipulation.
Copy the import directory table from the DLL module to a new global buffer (an array of IMAGE_IMPORT_DESCRIPTOR entries, referenced through the DataDirectory in the Optional Header).
Resolve, through API hashing, a dynamic kernel32.dll function table containing the different APIs used during the execution flow.
Check for "C:\INTERNAL\__empty" with GetFileAttributes. This has been described as a way to detect Windows Defender emulation.
After all these steps have been done successfully, Qakbot will use the newly created kernel32 IAT to call CreateThread with the main malicious thread.
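The API hashing step above can be illustrated with a minimal Python sketch. The hashing scheme (CRC32 XORed with a constant), the constant `XOR_KEY`, the helper names and the sample export table are all assumptions made for illustration; the text only states that the function table is resolved through API hashing.

```python
import zlib

XOR_KEY = 0x12345678  # hypothetical constant, not taken from a real sample


def api_hash(name: str) -> int:
    """Hash an export name: CRC32 of the ASCII name XORed with XOR_KEY (assumed scheme)."""
    return (zlib.crc32(name.encode("ascii")) ^ XOR_KEY) & 0xFFFFFFFF


def resolve_table(exports: dict, wanted_hashes: list) -> list:
    """Return addresses in the order of wanted_hashes; None marks an unresolved entry."""
    by_hash = {api_hash(name): addr for name, addr in exports.items()}
    return [by_hash.get(h) for h in wanted_hashes]


# Stand-in for a parsed kernel32.dll export table (name -> address).
exports = {"CreateThread": 0x1000, "GetFileAttributesW": 0x2000, "HeapCreate": 0x3000}

# The stager would only carry the hashes, never the names.
wanted = [api_hash("HeapCreate"), api_hash("CreateThread")]
table = resolve_table(exports, wanted)
```

When reversing, the same idea works in the other direction: hash every export of the target module once, then look up the constants found in the binary to recover the API names.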
It is important to clarify that Qakbot also uses string decryption for both WCHAR and CHAR, which use two wrappers each.
As of May/June 2022, one of the newly added functions checks for the environment variable SELF_TEST1. It serves as a flag for running specific checks on the path of the module itself, and seems to exist mainly for debugging purposes, since it pops up a MessageBox with the strings "Self Check", "Self Check ok!" and "ERROR: GetModuleFileNameW() failed with error: 0".
In this sense, the debug string describes the core functionality, which is a custom GetModuleFileNameW() implemented using CommandLineToArgvW.
After successful checks, the main malicious thread calls the same API hashing function to build a new IAT for more modules, including ntdll.dll, user32.dll, netapi32.dll, advapi32.dll, shlwapi.dll, userenv.dll, and ws2_32.dll.
(This last module, ws2_32.dll, was added in February/March 2022 to replace inet_ntoa, which used to be in the imports of the DLL, so you could just follow its xrefs to easily reverse the structure that stores both the C2 IPs and ports. More on that at the end of the paper, in the additional details.)
Then it will proceed to build a structure based on the information from the computer, where all types of information are initialized in the structure members for different purposes, like random names for mutexes or events.
Some of the information includes: Computer Name, Volume information, Primary Domain Controller Name, Privileges, Module name, Type of Process (WOW64 or x64 binary), TickCount, Os Version Info and much more.
Additionally, it sets up environment variables as needed for USERPROFILE, TEMP and SystemDrive.
In the final part of this structure constructor, there is a check for certain processes as a detection mechanism. It is done through an array of structures where each element stores a value used to decrypt a string of process names separated by the ';' character, using the WCHAR version of the string decryption function.
For example, given "explorer.exe;notepad.exe;cmd.exe", each process name separated by ';' is stored as a char* inside ptrToProcessNames, a char** (char*[ ]).
The dwBitMaskDetection member of the ContainerProcessDetection struct serves as a bitmask: each time one of the processes is detected, the dwBitmask member of the current ProcessDetection element is ORed into it.
All the processes detected this way include:
ccSvcHst.exe, avgcsrvx.exe;avgsvcx.exe;avgcsrva.exe, MsMpEng.exe, mcshield.exe, avp.exe;kavtray.exe, egui.exe;ekrn.exe, bdagent.exe;vsserv.exe;vsservppl.exe
AvastSvc.exe, coreServiceShell.exe;PccNTMon.exe;NTRTScan.exe, SAVAdminService.exe;SavService.exe, fshoster32.exe, WRSA.exe, vkise.exe;isesrv.exe;cmdagent.exe
ByteFence.exe, MBAMService.exe;mbamgui.exe, fmon.exe, dwengine.exe;dwarkdaemon.exe;dwwatcher.exe
This will be stored in the initialization struct, where it is labeled as m_ored_total_process_detection. Each bitfield inside this bitmask will be checked in different functions for specific operations as it will be seen below.
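The bitmask logic described above can be sketched in a few lines of Python. The bit values assigned to each group and the snake_case field names are hypothetical; only the ';' separated groups and the OR accumulation come from the description.

```python
from dataclasses import dataclass


@dataclass
class ProcessDetection:
    process_names: str  # decrypted string, names separated by ';'
    dw_bitmask: int     # bit contributed when any name of the group is running


def compute_detection_mask(groups, running):
    """OR together the bitmask of every group with at least one running process."""
    running_lower = {name.lower() for name in running}
    total = 0
    for group in groups:
        names = group.process_names.lower().split(";")
        if any(name in running_lower for name in names):
            total |= group.dw_bitmask
    return total


# Bit values below are illustrative only.
groups = [
    ProcessDetection("ccSvcHst.exe", 0x1),
    ProcessDetection("avgcsrvx.exe;avgsvcx.exe;avgcsrva.exe", 0x2),
    ProcessDetection("MsMpEng.exe", 0x4),
]
mask = compute_detection_mask(groups, ["explorer.exe", "MsMpEng.exe", "avgsvcx.exe"])
```

Later functions would then test individual bits of the resulting value, for example `mask & 0x4`, to decide whether a product-specific code path applies.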
Once everything above has been done successfully, certain checks determine which of the following variations is used to execute the "second stager" function:
- Code injection through entrypoint function hooking in a newly created suspended process.
- Service creation and registration of malicious status handlers.
- Direct call to the second stager function.
Let's inspect some of these methods, and how they are being used.
Based on the bitfield checks, Qakbot decides which processes to create for the code injection of the next important function.
All the decrypted relative paths are passed to ExpandEnvironmentStringsW to get the full path of the binary to be executed. Three of them are returned through a WCHAR** pointer (WCHAR*[ ]), depending on the processes detected during the first process detection phase.
Once the process has been created with CreateProcess, using the CREATE_SUSPENDED flag, it uses NtCreateSection to create a section with the size of the current DLL Optional Header SizeOfImage, which is the size of the PE when it is mapped in memory.
NtMapViewOfSection is then used with this section handle to map the new section both in the current process address space and in the remote address space.
After these steps finish successfully, VirtualAllocEx and WriteProcessMemory are used to allocate and write a copy of the initialization struct inside the newly created target process, changing its member m_dll_module_handle to the address of the mapped section in the external process.
Additionally, to deal with relocations, the DLL is first copied to the view of the section mapped in the current process, using SizeOfImage from the Optional Header. After that, the relocations are applied to the DLL using the addresses of both views, effectively fixing the DLL for usage in the external process.
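The relocation fix-up described above boils down to adding the difference between the two base addresses to every absolute pointer in the copied image. The following Python sketch shows that arithmetic under simplifying assumptions: 32-bit pointers only, and a plain list of offsets instead of parsing the real IMAGE_BASE_RELOCATION blocks. The function name and the sample addresses are made up for illustration.

```python
import struct


def apply_relocations(image: bytearray, reloc_rvas, old_base: int, new_base: int) -> None:
    """Rebase every 32-bit absolute pointer listed in reloc_rvas, in place."""
    delta = (new_base - old_base) & 0xFFFFFFFF
    for rva in reloc_rvas:
        (value,) = struct.unpack_from("<I", image, rva)
        struct.pack_into("<I", image, rva, (value + delta) & 0xFFFFFFFF)


# Toy "image": one pointer at offset 4 that refers to old_base + 0x1234.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x10001234)

# old_base plays the role of the address the DLL was built for,
# new_base the address of the view mapped in the remote process.
apply_relocations(image, [4], old_base=0x10000000, new_base=0x00400000)
fixed = struct.unpack_from("<I", image, 4)[0]
```

Because the patching happens on the local view of a shared section, the remote view sees the already rebased image without any further write into the target process.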
After this process has finished successfully, Qakbot proceeds to hook the entrypoint function. It uses the thread context structure, specifically the EAX register, which contains the entrypoint address of the external process, and patches the bytes there with NtProtectVirtualMemory and NtWriteVirtualMemory. Eventually it resumes the main thread, with the hooked entrypoint, in the remote process (the hooking function is inside the mapped DLL).
On the other hand, if the DLL has system privileges, it will proceed to start a service control dispatcher with StartServiceCtrlDispatcherA. The dispatched service procedure then calls RegisterServiceCtrlHandlerA and sets the current state of the service.
The most interesting detail is that, once the SERVICE_RUNNING status is set, the execution continues to the next important function.
All of these executions lead to the Second Stager function itself, which contains a lot of functionality that we will be describing next.
The screenshots above show precisely how the mapping operations are done for the section that contains the current stager DLL.
To be completely precise, the HookFunction (shown below) is executed in the context of the remotely mapped DLL thanks to the calculation shown in the BytesHook[1] operation (shown above).
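To get a feeling for the kind of calculation involved in the hook bytes, here is a small Python sketch that builds a 5-byte relative jump. That the hook is an `E9` (jmp rel32) instruction is an assumption on my side, as are the function name and the sample addresses; the idea is only that the displacement stored after the opcode is computed from the remote entrypoint and the address of the hook function inside the remotely mapped view.

```python
import struct


def build_jmp_rel32(patch_address: int, target_address: int) -> bytes:
    """Encode 'jmp target' placed at patch_address as E9 + 32-bit displacement."""
    # The displacement is relative to the end of the 5-byte instruction.
    displacement = (target_address - (patch_address + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", displacement)


# Hypothetical addresses: remote entrypoint and HookFunction in the remote view.
hook = build_jmp_rel32(patch_address=0x00401000, target_address=0x10002000)
```

The target has to be expressed in terms of the remote view's base address, which is why the address of the mapped section in the external process is needed at this point.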
Additionally, one of the most important things to remember is the m_detected_flag member in the initialization struct.
It matters for later operations: both ServiceProc and the normal call to this function set this member to 1, while for the hook case the value has to be 2.
It is important to mention that the event signaled in the HookFunction is checked while spawning and hooking processes. If the event is found, Qakbot stops trying to spawn new target processes within its 3 attempts. Additionally, the HookFunction resolves the IAT for this mapped DLL, so all necessary functionality can be executed properly.
Inside the Execution::FirstStager function image, this function is labeled as Injection::ManualMapDllAndHookEntrypoint().
Let's understand some of the most important functionality that can be seen in this procedure.
It is not immediately clear how a Qakbot instance decides what to execute under each set of conditions.
For this purpose, in the second stage function, the DLL checks what to execute next based on the config stored in the current system, which involves events, mutexes and registry values.
The execution state is set mainly from the following 4 options:
First instance (no container executable found in registry)
Same container executable name as registry.
Mutex based on computer info already exists (initialized in the execution state 3)
Event already signaled (mainly done inside an exception handler function that we will see later in section)
All these conditions determine which parts of the code are going to be executed. For now, we will focus only on execution state 0, since this is the one that relates to the first execution flow.
Additionally, I consider it important to describe that, before the execution state is set, if the member m_detected_flag in the initialization struct is not equal to 1 (the hook entrypoint case), it will proceed to generate a new in-memory buffer of the current container executable, which will be used for later operations.
For each case, the results (assuming everything works out as intended) are:
- Execution state 0: Proceed to the spreader component, then check whether member m_detected_flag != 1 (hook entrypoint case). If this is the case, it deletes the schtasks/run key persistence set in one of the generated regsvr32 execution instances. It then executes the Execution::ThirdStager function, additionally registering an exception handler.
- Execution state 1: Check whether member m_detected_flag != 1 (hook entrypoint case). If this is the case, it deletes the schtasks/run key persistence from previously generated regsvr32 execution instances. After this, it proceeds to Execution::ThirdStager().
- Execution state 2: Returns 0 (EXIT_SUCCESS).
- Execution state 3: Proceeds to signal event, where the GUID uses the computer info hash as seed.
Then proceeds to create the mutex that is relevant to achieve execution state 2.
Replaces container executable if the value stored in the registry config is not the same, storing the current one and then deleting the file from the previous value.
After that, it will execute Execution::ThirdStager, additionally registering an exception handler.
You can see all these details graphically, in the image at the start of this section, showing the Execution::SecondStager() function.
Qakbot at this point initializes a structure that will be used for storing and retrieving important information from the Windows Registry. This is very important to remember because it is used for retrieving specific configuration that can be used by different instances of this stager.
During the constructor phase, the member HashKeyToDecryptSubkey is used to encrypt the subkeyValue to be created. The HashKeyToDecryptSubkey value comes from a CRC32 based on the computer information.
This hash value is applied to each BYTE in the encryption/decryption process: starting from i = 0, the hash byte at position i & 3 is used, which enforces always looping over the 4 bytes of the hash.
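The indexing with i & 3 can be sketched as follows in Python. The per-byte operation is assumed to be an XOR, and the hash is assumed to be laid out in little-endian order; neither detail is stated explicitly above, and the function name and key value are made up for the example.

```python
import struct


def xor_with_hash(data: bytes, hash_key: int) -> bytes:
    """Combine each byte with one of the 4 hash bytes, selected by i & 3."""
    key_bytes = struct.pack("<I", hash_key & 0xFFFFFFFF)  # assumed little-endian
    return bytes(b ^ key_bytes[i & 3] for i, b in enumerate(data))


# Hypothetical key value; in the stager it would derive from the computer info.
encrypted = xor_with_hash(b"subkey", 0xAABBCCDD)
decrypted = xor_with_hash(encrypted, 0xAABBCCDD)
```

Under the XOR assumption, the same routine serves for both directions, which matches the fact that a single hash member is used to encrypt and decrypt the subkey value.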
When certain information needs to be stored, the structure EncryptedConfig is used. A specific index is passed to the function to create a SHA1 hash, and this hash is used as salt for generating the encryption key. The entire EncryptedConfig buffer is then encrypted with the RC4 algorithm before being stored.
Once encrypted, it is stored as a registry value, where EncryptedSubkeyValue, HashKeyToDecryptSubkey and dwLengthSubkeyValue, from the ContainerSubkeyToStore, are all used. RegOpenKeyExA and RegSetValueExA are used for this purpose.
At this point, the structure that I labeled as ResourceDecryption below, will be initialized. Everybody knows where the config is usually stored and how it usually is decrypted in the first instance, but what is the layout of the structure that uses it?
Simple, it's the ExtractedElements structures shown below, which basically store the index element and the string of the element.
If the decrypted config stores as an example: "10 = obama165", indexElement would be 10 (as an integer) and stringElement will contain obama165, for the current element struct.
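As a rough illustration of that layout, the following Python sketch parses a decrypted config into elements. It assumes one "index = value" entry per line; the class and function names are mine, and the second entry in the sample is a made-up placeholder.

```python
from dataclasses import dataclass


@dataclass
class ExtractedElement:
    index_element: int   # numeric index on the left of '='
    string_element: str  # raw string on the right of '='


def parse_config(text: str):
    """Split a decrypted config into ExtractedElement entries, skipping bad lines."""
    elements = []
    for line in text.splitlines():
        index_part, sep, value_part = line.partition("=")
        index_part = index_part.strip()
        if not sep or not index_part.isdigit():
            continue
        elements.append(ExtractedElement(int(index_part), value_part.strip()))
    return elements


# "3 = example-value" is a hypothetical entry added only to show multiple elements.
elements = parse_config("10 = obama165\n3 = example-value")
```

With this in place, looking up a given index is a simple scan over the returned list.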
I do not want to focus on all the possible methods that are used for decrypting and decompressing the payloads depending on each specific case (BriefLZ can be still used and size of 0x28 is also checked for the resource buffer to do an entirely different decryption operation, RC4 is also done in this case).
I want to focus on how the config is extracted for the main usual case, which is what most people are interested in:
A key is decrypted at runtime with one of the CHAR string decryption wrappers, and this key is required for decryption.
This key (20 bytes) is processed using the SHA-1 algorithm, and the buffer is then decrypted with the RC4 algorithm. SHA-1 is additionally used for integrity checks. The core payload is finally obtained and parsed for use by the Qakbot stager, mainly through the ExtractedElements structure.
To reimplement it, you just need to extract this key statically (or dynamically) and then replicate what the function executes. I will show below how it looks so you get a feeling for it when you are reversing it.
Above you can see the function in charge of both decrypting the payload and checking its integrity. In my IDB it is labeled Decryption::DecryptBufferAndCheckIntegrity, as you can see in the image.
I recommend the following resource for more description related to this specific decryption part: [https://darkopcodes.wordpress.com/2020/06/07/malware-analysis-qakbot-part-2/]
This resource describes a little better how old the code is, and how it has not really changed that much. However, it will most likely change in the future, so it is important that you reverse it on your own to get a grasp of it.
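For readers who want to reimplement the flow described in this section, here is a minimal Python sketch. It assumes the RC4 key is the SHA-1 digest of the decrypted key string, and that the decrypted buffer starts with a 20-byte SHA-1 digest of the payload that follows; that buffer layout, the function names and the sample key string are assumptions, so verify them against the sample you are reversing.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_resource(resource: bytes, key_string: bytes) -> bytes:
    """Decrypt a resource buffer and verify it (assumed layout: digest + payload)."""
    rc4_key = hashlib.sha1(key_string).digest()
    decrypted = rc4(rc4_key, resource)
    digest, payload = decrypted[:20], decrypted[20:]
    if hashlib.sha1(payload).digest() != digest:
        raise ValueError("integrity check failed")
    return payload


# Round trip with a made-up key string, building the blob the way the sketch expects.
key_string = b"hypothetical-key-string"
payload = b"10 = obama165"
blob = rc4(hashlib.sha1(key_string).digest(), hashlib.sha1(payload).digest() + payload)
recovered = decrypt_resource(blob, key_string)
```

The integrity check is also handy when extracting configs in bulk, since a failed check tells you right away that the key or the decryption path does not match the sample.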
The Qakbot stager will try to detect aswhookx.dll and aswhooka.dll in the process where it is loaded if a related process has been found in the detection bitmask described before.
At this point I consider it important to describe the function Storage::GenerateNewDllandRegistryPaths, because it is used in other components such as the network spreader.
First of all, a random dll name will be generated using different operations that use the current account name and CRC32 hashing. This random dll name will be concatenated with a newly created folder name, to eventually generate a working path for usage.
Having in mind the picture above, you can see that the folder name depends on the random dll name generated previously. This is shown in the function Random::GenerateRandom16LengthString. This random folder will be created in the same function with CreateDirectoryW, if it does not exist.
Additionally, if certain processes are found through the bitmask member inside the initialization struct and the privileges are sufficient, the folder is added to the following HKLM registry keys:
1. SOFTWARE\Microsoft\Microsoft Antimalware\Exclusions\Paths
2. SOFTWARE\Microsoft\Windows Defender\Exclusions\Paths
After these steps have finished, a ContainerSubkeyToStore constructor is called, with some of the information initialized in this function passed as arguments: mainly the profile path (obtained with ProfileImagePath at the start of the function), the full path of the random DLL and the CRC32 hash generated from the computer info.
Using this structure, some configuration is stored in the registry, which includes decrypted c2 config information in the resources for both the current stager and .cfg file (if found), random dll path, and time.
At this point, depending on the member m_detected_flag described before, it can work out some of the relevant stager files in different ways. This also depends on the current SID of the current user (m_checks_sid_option).
There are two main ways this is done by Qakbot:
The first one, which is mainly done for the hook entrypoint case, involves using some in-memory buffer of a file, to create and write to the current random dll.
The additional details of how this is generated for each case are given in a previous chapter of this paper but, essentially, for this case the buffer used is the current container executable.
It is important to mention that this function is very important for regsvr32 execution, because it can generate new execution instances through persistence, which is covered elsewhere in this paper.
The second one is much more complex, and involves invoking different function pointers, which are called in pairs, until one of them succeeds.
Each one receives the random DLL path and the path of the current executable that has loaded the DLL. I will describe all the methods individually, in no particular order:
1. Generate a bat file with the contents:
wmic process call create 'expand <container executable> <random dll path>'
then create a process with the bat file. After that, it tries to delete the bat file.
2. Create a memory buffer of the container executable and write it to the target DLL, effectively generating a new copy.
3. Use a vbs file, which is executed through cscript.exe using ShellExecuteW. The main content of this file is the following VB code:
4. Use CopyFileW to copy the container executable to the target DLL.
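The "try methods until one succeeds" pattern used for these copy routines can be sketched in Python as below. Only two of the methods are modeled (the in-memory buffer copy and a plain file copy), the pairing of function pointers is not reproduced, and all names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def copy_via_buffer(src: Path, dst: Path) -> bool:
    """Read the whole source into memory, then write it to the destination."""
    try:
        dst.write_bytes(src.read_bytes())
        return True
    except OSError:
        return False


def copy_via_copyfile(src: Path, dst: Path) -> bool:
    """Plain file copy, standing in for the CopyFileW method."""
    try:
        shutil.copyfile(src, dst)
        return True
    except OSError:
        return False


def drop_copy(src: Path, dst: Path, methods) -> Optional[str]:
    """Call each method in turn; return the name of the first one that succeeds."""
    for method in methods:
        if method(src, dst):
            return method.__name__
    return None


def always_fails(src: Path, dst: Path) -> bool:
    return False


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "container.bin"
    dst = Path(tmp) / "random.dll"
    src.write_bytes(b"MZ-demo")
    used = drop_copy(src, dst, [always_fails, copy_via_buffer, copy_via_copyfile])
    copied = dst.read_bytes()
```

Spotting this dispatch loop in the disassembly is usually the quickest way to enumerate all the copy methods at once, since every candidate shows up as an entry in the function pointer table.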
It is important to point out that one entry is added to a dynamic array of structures after this whole process has finished correctly, increasing a counter. We do not really care about the layout of this struct because it is not explicitly used, but we will point out the usage of the counter, which is labeled as indexCryptoStruct, in the next section.
These are some more details that can be helpful for any reverse engineer who wants to dig further into the main Qakbot DLL stager.
My general advice is to look at the function entries described before. Some of these functions contain the methods we need to be aware of for doing proper C2 emulation, including the infamous sysinfo struct that is constantly changing.
The best way to deal with the network communications could be doing some binary rewriting (especially to spot which parts of the functions create the JSON and encrypt it).
I recommend looking at the Kaspersky report on this malware [(link here)] for a general description of how this is done. It describes the communication very well, and there are even more sources, so just look them up and reverse engineer the proper methods as well if you are interested in creating your own emulator.
Inside one of the c2 handlers of the stager, we can describe the structures that contain the IPs.
The main IP config structure includes both registry and resource retrieved C2 servers.
This custom_ip struct is used in one function with inet_ntoa (which was in the IAT of the binary, at least until March 2022).
Of course, since there are multiple IPs stored for comms, an array of structs of type IpConfig is used.
For successful extraction, you mainly need to reimplement the operation related to the ports in a Python script.
This structure is used in one method inside one function entry of the methods already described above, so it is important that from there you reverse engineer the rest.
Additionally, the communications are done through JSON so if you want an additional challenge you can reverse the different structures/objects employed for building it.
We saw earlier in the second stage and third stage function that Qakbot writes files to disk from a buffer of memory, but from where and how is it done?
There are two main buffers used but both of them use the following structure.
The first buffer, which is commonly used by all the previous functions, such as Execution::GeneratePayloadsAndRegsvrInstanceDependingCheck, is copied from the container executable, while the second one is obtained through one of the function_entry methods already described. (You will notice both structures when you xref it.)
It is important to mention how this fileStruct structure is used for the container executable buffer, in the HookEntrypoint case before the proper execution state is selected in the second stage function.
First of all, the first 1024 bytes will be read from the current executable container, and then it will proceed to read armstream.dll from the system32 directory.
Once this has finished, if the size is more than 1024, it will proceed to copy the next 1024 bytes, counted from the start of armstream.dll, to this new payload buffer.
Next, it will proceed to write the container executable to disk until it reaches the maximum size of 4096.
Because it is written to disk this way, it makes sense that the container executable is corrupted and needs to be rewritten in other instances.
This function is used in both the SecondStage and ThirdStage functions, so you will notice it twice when reversing the stager.
The main check is done through GetKeyboardLayoutList(), whose results are compared against specific IDs.
The main keyboard layout IDs detected this way are found in CIS countries, and these are: