As i referred on my previous post, i have started a Reverse Engineering/Malware Analysis journey. One of the topics i find most interesting about malware is packing. A packer is an algorithm that manipulates a simple binary and adds one or more of the following features (this is not an exhaustive list):
- cryptors and/or compressors: e.g. to hide internal strings/shellcode
- anti-debugging: e.g. to detect if the malware is being debugged
- anti-vm: e.g. to detect if the malware is running inside a Virtual Machine
- anti-dumping: e.g. to prevent analysts from dumping a malware by destroying binary headers/code.
Packed malware has, typically, a small amount of imports (such as LoadLibrary and GetProcAddress) which are used to resolve dependencies in runtime. The process by which a packed malware resolves its dependencies and restores its own malicious payload is called unpacking.
Enough 101. This is my first post about malware analysis and i hope to write more in the future when i find some interesting specimen. As you may expect, this malware is packed. Unpacking does not require a massive post if i do it blindly but, if i want to understand what is going on, that requires deeper analysis and more time.
For the more experienced: you will not see me using fancy tools/plugins for the first set of posts since, for beginners, it is best to use as little automation as possible in order to understand the unpacking patterns and tricks.
I will describe the analysis process of the unpacked malware on a future post. I am going to split this analysis in four stages since the unpacking process is done in that manner.
Family name: DoFoil a.k.a Smoke Loader
Packing algorithm: custom
Tools: OllyDbg + OllyDump, IDA Pro
Environment: VMware with 64-bit Windows 7
I always start by opening the malware with IDA and check the imports:
Does not look packed but it also does not look too harmful either. Some Certificate manipulation, file enumeration, registry manipulation. Let us look at the entrypoint:
The LocalFileTimeToFileTime and lstrcmp are omnipresent across the packed sample and appear to have no real usage (garbage code). IDA did not decode the chunks of bytes you see. loc_405B25 is called if the lower word for the stack pointer is lower or equal to 0xFE00. At the time of writing i am still not sure what is the reason for this verification. I would guess the lower word for the stack address varies across Windows versions, which would allow for a simple verification of the OS version. If you know what is this verification, please, feel free to drop a comment below. If i find out in the future, i will let you know. loc_405B25 is depicted below:
This pattern is repeated over and over again (like a Russian doll) with slight changes on some instructions. This seemed like a dead end. However, if the condition is not verified, sub_405065 is called. This happened on the test environment and on a Windows XP. I did not use any plugins or special configurations to avoid anti-analysis techniques so i assume this first verification is not such.
The function sub_405065 contains code used to load libraries. The dlls and functions to be imported and resolved are slightly “obfuscated”. For instance, an ‘a’ was prepended to the hardcoded string sycfilt.dll before calling LoadLibrary to load asycfilt.dll. Failing to load a given dll would lead to a jump to tiny routine that would set the ebp to the return of LoadLibrary (zero since it failed) and then return, which would crash the process later (stack cannot be based on 0x0 address). I have called failover to this tiny function but i later realized that this is not a failover mechanism (it does not recover).
asycfilt.dll, according to the table of exports, seems like a pretty useless dll to load and the next loaded dll (attempted load), gsycfilt.dll is even more strange since it does not appear to exist. For the latter, the failover function is called if the dll is successfully loaded. My guess is that the existence of the dll is a sign of infection and it signals the malware for the fact that the computer is already infected (just guessing). asycfilt.dll does not appear to be used later. My guess once more is that this dll is present only on some versions of Windows, being a means to test the version of the OS.
Next, the malware resolves ReadProcessMemory (after loading kernel32.dll) by deobfuscating wwwProcessMemory with some easy but geeky arithmetics. The malware then proceeds to get the relocation table address and size for asycfilt.dll and checks the size against an interval of possible sizes (first figure below). The address of VirtualAlloc is then resolved. At the top of the middle figure you can see a fancy call to VirtualAlloc with some arithmetics to hide its arguments. Obfuscation aside, the call is
The last parameter can’t provide a more obvious conclusion: the malware is about to unpack some shellcode and write it to that section. The loop you see below the VirtualAlloc call is exactly that. The malware is looping across dwords, starting on 0x408E44 and applying a bunch of operations. I will not delve into this since the operations are straight forward and add no meaning to this post. At last (last picture), LoadLibrary address is pushed onto the stack and the execution jumps to the beginning of the decoded shellcode.
At this point, i strongly advise you to take snapshot of the VM because, an anti-VM/debugging technique on the next code and you have to redo everything. I like to look at IDA while i am debugging so, i need to dump the contents of the allocated section. Once you are at the beginning of the allocated section, go to memory view and double click the beginning of the section containing the code. Select the bytes -> righ-click -> Backup -> Save data to file. Open the .mem file with IDA in Binary mode.
The shellcode uses LoadLibrary to get a handler for kernel32.dll. The name of the library is hardcoded but IDA interprets it as code at first. Below you can see part of the shellcode code. On the left is the interpretation of IDA, on the right is “my interpretation”.
The same behaviour is observed a few lines below:
The usage for these chunks becomes clear when they are looped through and passed on to the function below:
I have commented everything so you can understand what happens. In order to reach these conclusions, i advise you to open kernel32.dll on IDA pro and rebase it to be on the same address as in the debugger. From then on you must leverage IDA (with kernel32.dll + the code above open) as well as OllyDBG to reach my conclusions. Simply put, the malware loops through kernel32.dll exported names, computes a checksum for the strings representing those names (green rectangle depicts the algorithm) and compares the checksum with the one provided and hardcoded. Once it finds a match, it gets the ordinal for that function and uses it to index the exported functions table to get the address of that function. The mov before popa puts esi on eax (previously saved using pusha). Then, the location containing the hardcoded checksum is overwritten with the function address (stored on eax). The malware resolves the addresses for the following functions:
Once the resolution is performed, the malware checks the preamble for ReadFile function against 8Bh (mov?). Then, it proceeds to resolve some more functions (the names for those are hardcoded). We have an ending similar to the first stage: allocation of RWE memory and copy of shellcode to that address. The execution then jumps to that code:
We are done here. Time for VM snapshot.
The third stage is a bit long. The new shellcode leverages some of the resolved functions to replace a big chunk of the malware binary loaded at first with the malware to be executed on the last stage. The picture below on the left shows the first chunk of the shellcode obtained from the second stage. As you can see, the shellcode obtains a pointer for a memory location inside the .text section of the loaded binary. Some of the data in that location copied and decoded. The right picture shows the decoded contents.
Yes, the malware contained a binary inside itself. You can dump that binary using the process i described previously and open it with IDA. It is a well formed binary and IDA will not complain. IDA recognises the start function but nothing else. That start function is part of the last stage of the unpacking process.
The shellcode then proceeds to overwrite the necessary sections with the contents of the sections in the embedded binary (only .text is overwritten) and header adjustments are performed to comply with the embedded binary specifications. I will not delve into a thorough explanation of what is happening. The pictures below should suffice. The third stage is mostly overwriting shellcode with shellcode and make sure nothing blows.
The last jmp is used to jump to the entrypoint of the embedded executable, now part of the old malware .text section. If you have OllyDump, you will be able to dump the process easily once the EIP is at the beginning of the newly-decoded shellcode. IDA will accept the binary without complaining. Whoever wrote all this code deserves the slow clap.
Mind that, even though this is seems like a standalone executable, do not try to run it by itself since it has memory references that were adjusted by the unpacking algorithm. In my case, if the new executable is loaded at an address that is different from 0x00400000, the process will crash. The high degree of dependencies accross shellcode sections makes the analysis of this sample pretty cumbersome.
We have reached the final stage of unpacking. OllyDbg won’t show you meaningful instructions when you do the Right-click->Analysis->Analyse code trick. As such, you will have to use IDA and rebase the code at the same address as the malware in memory. In my case, the malware image base is on 0x00400000. The first chunk of code is just a routine that xors an encoded segment of code with 0x313B6535. Then, the malware calls that decoded segment. See the pictures below:
You could either decode the shellcode with IDA scripting or let the malware do it itself and then dump the process again. The shellcode is now recognisable by OllyDBG if you use the code analysis trick. The shellcode that is executed next is quite interesting. First, it leverages IsBeingDebugged flag and NtGlobalFlags to detect debugger activity. If you have anti-anti-debugging plugins, you should be ok to jump directly to the end. Otherwise, i recommend, either patching or setting breakpoints on the lines with calls and setting the eax registry to zero (no debugger).
Once you pass this phase you may now ask. Are we there yet? Well…no. Another chunk of code separates us from the real deal. This malware has a tricky MO. It contains tons of shellcode that is unpacked in runtime and jumps from memory section to memory section. During this last part, the malware starts by unpacking a chunk of shellcode to an allocated section of memory (wow!). Part of that shellcode contains the unpacked payload that we want. It also contains parameters to set up the latter properly. Let use see some IDA, shall we?
This is the first part of the unpacked shellcode. ebx is still fs:[30h]. As such, the last line appears to mean: fs:[30h]->PEB_LDR_DATA->InLoadOrder->SizeOfImage + fs:[30h]. Drop a comment below if you know what is happening here. Then we have the called function:
From top to bottom, left to right:
1.The running shellcode performs a first decoding of another chunk of shellcode using xor operation with key 0x313B6535. The decoding is in place.
2.A second decoding is performed and the contents of the shellcode are copied to an allocated memory section. Another memory allocation is performed which will hold the unpacked payload that we want. The unpacked payload is copied from the previously referred shellcode section.
3 and 4. The unpacked payload memory references are adjusted relative to the beginning of the binary in memory. Something tells me that the binary will be referenced later. Then, the malware jumps to the payload. Before that, it deletes all the code previously described as can be seen below.
Before jumping to the final shellcode, the shellcode pushes three values on the stack: a pointer to a string “22222” inside the loaded binary memory space, a pointer to the beginning of the memory section containing part of the old shellcode and zero. Before jumping or at the beginning of the new shellcode, i recommend you make another snapshot. The real analysis starts now but it will not be part of this article since most of you may be lost or asleep by now.
At this point the malware is unpacked and will execute the main payload. However, it will not make it easy on you. There is no point in using OllyDump to dump the binary because the code is in another section. You will have to dump the code using the procedure i described before. However, you should not dump at the beginning of the code. Take a look at the pictures below:
As you can see, the malware performs the address resolutions for some functions. The resolution appears to be similar to what we have seen before: computing a checksum on the name of the functions that are part of the dlls in memory and compare that checksum with an hardcoded one. Once you run this function, you can dump the memory section containing the shellcode.
If you use just a standalone debugger or if you use IDA with one of its debuggers, you can skip this paragraph. If, as me, you switch between IDA and OllyDBG, or use another combination of disassembler and debugger, you will have to rename the function offset names to something meaningful. The way i see it, you can do this in one of two ways:
- Go through the offsets table and for each offset go to (in OllyDBG) View->Executable modules. Find out where is that offset in the list of dlls, right-click the dll->View names. Search for the offset and you have the name. Change the name given by IDA to that offset. Now rinse and repeat.
- Find the name of the function using the procedure on 1. Create an IDA script that goes through the table and renames the offsets to reflect your findings.
I intend to make another post for the analysis of the unpacked malware on another post but for now, we end here.
If you are familiar with WinDBG, you know that i could have suffered less if i had used it. This happens because WinDBG is able to show you OS internal structures and offsets which would have helped a lot in this case. I have not used it because i am more used to ImmunityDBG or OllyDBG but i hope to change that soon.
Some of the unpacking steps could have been skipped and the article would have been shorter. However, it is important to be comfortable about all these layers of unpacking. The analysis of this malware is far from trivial since it has a bunch of shellcode and i may have left you pretty confused. Also, the IDA views i posted with comments are meant to help you understand what is going on, assuming you understand my comments. Feel free to leave some feedback about the overall structure and contents.
Stay safe 😉