63 Problems But Malware Ain’t One: 882aef202a56008ad20a61c8960eb830

Hello paranoids

 As promised i am back to reverse stuff. This time, and following the previous sequence of posts, i have decided to pick an Android malware for analysis. Without further ado, let us begin.

Malware Characteristics

MD5: 882aef202a56008ad20a61c8960eb830
Family name: Ginmaster (GingerMaster)
Obfuscation/Packing: Yes/No

 GingerMaster was the first Android malware to use the GingerBreak exploit, which affects Gingerbread (Android 2.3). The malware can download, install and launch APKs without the user's permission using low-level tools such as pm (package manager), sh (shell) and am (activity manager). Its class and function names are obfuscated with short chains of random characters. Contacted URLs and other relevant strings are in plaintext.

 As always, i have uploaded the IDA database to my GitHub repository.

Analysis Environment

Tools: Android Studio (2.3.2) and SDK tools (e.g. adb, aapt), IDA Pro 6.4.130111 (32-bit), JD-GUI 1.4.0, dex2jar, apktool
Environment: VMware with 64-bit Windows 7
Emulator and Libraries: Nexus 5X, Gingerbread (2.3.3 with Google APIs, API Level 10)

Objectives

The objective of this post is to understand how the malware works, i.e.:

  • Files created
  • URLs contacted
  • Exfiltrated information
  • Services created and what they do
  • Usage of GingerBreak, an exploit used to obtain root on Android 2.3 (Gingerbread)

Preliminary Analysis

 Before diving into tools and code, it is worth checking the activity generated by the malware when executing. We start by creating an emulator and using adb to deploy the application. I would like to see how the application works and the traffic it generates (e.g. using Wireshark). However, it is cumbersome to track the source of the traffic (i.e. malicious application vs. legitimate emulator behaviour).

 client[.]mustmobile[.]com is a domain widely used by the malware and, since there is no DNS resolution for it, no further traffic will be seen. As for host indicators, the Android Device Monitor is of great help:

[Images: SD card contents and Logcat output]

 The first picture shows the contents of the SD card. Since i already know that the malware uses the SD card to store files and perform the exploitation, we expect to see files there. install.sh, installsoft.sh, runme.sh and gbfm.sh are the most important ones. It also uses SQLite to store configurations and lists of packages for download (databases folder).

 The malware logs heavily, and Logcat (integrated into Android Studio) is useful for reading those logs. As you can see, GameService has been spawned (picture below). The command:

adb shell dumpsys activity services

is helpful to get a dump of the activities and services. We can see there that GameService is running and Web has bound to it (refer to service binding for more information).
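To avoid reading the whole dump by eye, the output can be filtered for the package. The following is a minimal Python sketch, assuming adb is on the PATH and that service entries show up on lines containing "ServiceRecord{" (this may differ across Android versions). The helper names and the sample line are hypothetical and only illustrate the filtering:

```python
import subprocess

PACKAGE = "com.igamepower.appmaster"

def service_lines(dump_text, package=PACKAGE):
    """Return the ServiceRecord lines of a dumpsys dump that mention the package."""
    return [line.strip() for line in dump_text.splitlines()
            if "ServiceRecord{" in line and package in line]

def dump_services():
    """Run `adb shell dumpsys activity services` and return its text output."""
    result = subprocess.run(["adb", "shell", "dumpsys", "activity", "services"],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical excerpt (not taken from the real dump), only to show the filter.
sample = ("  * ServiceRecord{40a1b2c3 com.igamepower.appmaster/.GameService}\n"
          "    intent={...}\n")
print(service_lines(sample))
```

With an emulator attached, service_lines(dump_services()) would do the same against the live dump.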

Package Structure Analysis

I will assume you read the previous articles and that you are familiar with the usage of the tools i referred there.

As soon as you start looking at the application package you will realise it does not have a lot to look at. Once you convert the dex file to a jar, you will see about 106 classes under the same package com.igamepower.appmaster. You can spot some degree of obfuscation by noticing:

  • random name classes (e.g. af.class, e.class, f.class)
  • mix of Java and smali code. This is indicative that the conversion from dex to Java failed due to anti-reversing techniques
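The first indicator can be checked quickly on the jar produced by dex2jar. A minimal Python sketch, assuming that "random name" means a simple class name of one or two lowercase letters (that threshold is my assumption, not something taken from the sample):

```python
import re
import zipfile

SHORT_NAME = re.compile(r"^[a-z]{1,2}$")

def short_class_names(names):
    """Return simple class names made of one or two lowercase letters (e.g. af, e, f)."""
    hits = []
    for name in names:
        if not name.endswith(".class"):
            continue
        simple = name.rsplit("/", 1)[-1][:-len(".class")]
        if SHORT_NAME.match(simple):
            hits.append(simple)
    return sorted(hits)

def short_class_names_in_jar(jar_path):
    """Apply the same check to every entry of a jar file."""
    with zipfile.ZipFile(jar_path) as jar:
        return short_class_names(jar.namelist())

entries = ["com/igamepower/appmaster/af.class",
           "com/igamepower/appmaster/GameService.class",
           "com/igamepower/appmaster/e.class"]
print(short_class_names(entries))
```

A high ratio of such names against the total number of classes is a hint of name obfuscation, nothing more.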

[Image: mix of Java and smali in the decompiled code]

 While the number of classes is high, the absence of multiple packages makes the analysis easier. The first version of Ginmaster i looked at (MD5 17ca4c91367ba4b91bdb6e0b77aaafb6) while having many more packages, was successfully decompiled into Java. I think for learning purposes, analysis of SMALI is a much more interesting exercise.

Looking at the decoded Manifest we can see the following permissions:

Permission string: what it grants (according to docs)

  • READ_PHONE_STATE: phone number, cellular network information, status of ongoing calls.
  • READ_LOGS: allows the application to read low-level logs.
  • ACCESS_CACHE_FILESYSTEM, DELETE_CACHE_FILES: self-explanatory.
  • WRITE_SECURE_SETTINGS: ADB status, WIFI, parental control.
  • ACCESS_NETWORK_STATE: query network information (e.g. check if connected).
  • INTERNET: allows the creation of sockets.
  • WRITE_EXTERNAL_STORAGE: self-explanatory.
  • MOUNT_UNMOUNT_FILESYSTEMS: self-explanatory. The malware requires access to external SD cards.
  • READ_OWNER_DATA, WRITE_OWNER_DATA: user email, name, etc.
  • WRITE_SETTINGS: allows the modification of device settings.
  • INSTALL_SHORTCUT, UNINSTALL_SHORTCUT: allows the creation of a shortcut on the Launcher (the board with applications).
  • RECEIVE_BOOT_COMPLETED: allows the application to receive a notification when the device finishes booting.
  • RESTART_PACKAGES: allows the app to end background processes of other apps.

In terms of string resources we have little to none and they are in Chinese. Feel free to use Google Translate to check the translation.

In terms of activities/services/receivers, we have:

Type and name: intent filter(s)

  • Activity Myhall: action MAIN, category DEFAULT
  • Activity HomeActivity: none
  • Activity SortActivity1: action MAIN
  • Activity SortActivity2: action MAIN
  • Activity SearchActivity: action MAIN
  • Activity ManagerActivity: action MAIN
  • Activity GameInfo: action MAIN
  • Activity TableClass: action MAIN
  • Activity Web: action MAIN, category LAUNCHER
  • Activity GameAlertDialog: none
  • Activity TestView: action MAIN
  • Service GameService: action MAIN, category LAUNCHER
  • Receiver GameBootReceiver: action BOOT_COMPLETED
  • Activity DevelopmentSettings (not found in the package): action APPLICATION_DEVELOPMENT_SETTINGS

 The purpose of the intent filters declared for GameService is not clear to me. MAIN and LAUNCHER are associated with the activity responsible for the first screen, and services cannot be launched directly through Launcher icons. Looking at the Java code for Web, i have seen GameService being launched explicitly, as i will describe in the next section. I am assuming this was a mistake by the authors that turned out not to crash the application.

 The DEFAULT category on Myhall marks that activity as a candidate to receive implicit intents for the MAIN action, i.e. if another application (or even this one) creates an implicit intent with the action MAIN, every activity that declares the MAIN action and the DEFAULT category is a candidate to handle it.

 GameBootReceiver is a broadcast receiver. Broadcast receivers are used by applications to receive broadcasted intents from other applications (e.g. OS reporting battery status). In this case, BOOT_COMPLETED is an action used on broadcasts to notify applications of a complete system boot (once it is finished).

 Finally, DevelopmentSettings has APPLICATION_DEVELOPMENT_SETTINGS action which tells us the class is able to mess with application development settings on Android. This activity was not on the APK.
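The oddity on GameService can also be spotted programmatically on the manifest decoded by apktool. A minimal Python sketch follows; the sample manifest is a trimmed, hypothetical reconstruction (component names written as ".Web" and ".GameService" are assumptions about how they appear in the real file):

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def components(manifest_xml):
    """Yield (type, name, actions, categories) for each declared component."""
    app = ET.fromstring(manifest_xml).find("application")
    if app is None:
        return
    for node in app:
        if node.tag not in ("activity", "service", "receiver"):
            continue
        actions = [a.get(ANDROID_NS + "name", "").rsplit(".", 1)[-1]
                   for a in node.iter("action")]
        categories = [c.get(ANDROID_NS + "name", "").rsplit(".", 1)[-1]
                      for c in node.iter("category")]
        yield node.tag, node.get(ANDROID_NS + "name"), actions, categories

def odd_services(manifest_xml):
    """Services declaring the LAUNCHER category, which only makes sense for activities."""
    return [name for kind, name, _, cats in components(manifest_xml)
            if kind == "service" and "LAUNCHER" in cats]

sample = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <activity android:name=".Web">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
        <category android:name="android.intent.category.LAUNCHER"/>
      </intent-filter>
    </activity>
    <service android:name=".GameService">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
        <category android:name="android.intent.category.LAUNCHER"/>
      </intent-filter>
    </service>
  </application>
</manifest>"""
print(odd_services(sample))
```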

 I will not go over the assets folder just now because i will come back to it later. Enough high-level analysis, let us dive into details.

Java SMALI Code Analysis

 Unfortunately, this is one of the cases where the analysis of Java will confuse you more than actually help. We need to look at SMALI. We can use IDA to disassemble the .dex file. As previously referred the first Activity to be launched is Web. It starts by collecting device information and some APK details:

  • IMSI
  • SIM serial number
  • Phone number
  • Network type
  • CPU serial (ignored and set to IMSI if CPU serial is 0000000000000000)
  • Pixels (width, height)
  • Version code of the APK
  • Service channel 1004 (stored as a resource on the APK)
  • Current Time

 These details are posted to the URL we have seen before: http://client[.]mustmobile[.]com/mt.php. Another URL (http://client[.]mustmobile[.]com/request/update.do) is beaconed with the same details and the response, after being parsed, is written to /data/data/com.igamepower.appmaster/files/cache/igamepower_file/8888. The purpose of this file is not clear since it is not opened anywhere else. However, the processed response is passed to the ac handler with code 0x1, which leads to the download of a package named com.igamepower.appmaster if the version of the current malware is lower than the one hosted on the server (malware update):

[Image: package version check and download of the updated version]

 The Web class is also responsible for spawning the GameService service. This service launches a couple of threads to contact the URLs on the following table. The aq thread is responsible for performing the requests while the an handler processes them based on codes. All beacons contain at least the collected data i have previously referred. All the URLs are for resources hosted on http://client.mustmobile[.]com.

Resource: data sent, purpose, handler code and processing

report/first_run.do
  • Data Sent: Sends device details only.
  • Purpose: Beaconed when the application runs for the first time.
  • Code: 0x1 (default case).
  • Processing: Nothing is done.

report/uninstall_success.do, report/install_success.do
  • Data Sent: Package information.
  • Purpose: Beaconed when packages are installed or removed. Information about those packages is sent (e.g. name).
  • Code: 0x0 (default case).
  • Processing: Nothing is done.

request/config.do
  • Data Sent: Key-value pair action:config.
  • Purpose: Updates configuration parameters.
  • Code: 0x3EA.
  • Processing: Shared Preferences are updated with fields such as get_list_limit, get_config_limit and server_domain.

request/push.do
  • Data Sent: Sends the SharedPreferences field soft_last_id.
  • Purpose: Informs the server of the last id associated with the last software on the downloaded lists of software.
  • Code: 0x3EC.
  • Processing: Seems to be used to pull a list of software. The downloaded metadata is converted into shortcuts that, when clicked, redirect to the malware activities that display information about them.

report/install_list.do
  • Data Sent: The result of “select * from game_package where status=1” is sent. status is set to 1 when there is a package installation.
  • Purpose: Likely to inform the remote server of the list of packages installed by the malware.
  • Code: 0x1 (default case).
  • Processing: Nothing is done.

request/alert.do
  • Data Sent: Sends to the server a field from SharedPreferences: alert_last_id.
  • Purpose: This seems like a means to pull notifications.
  • Code: 0x3EB.
  • Processing: A notification is shown through GameAlertDialog.
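The handler logic in the table boils down to a dispatch on the message code. A rough Python sketch of that idea, using only the codes listed above; the descriptions are my summaries and the structure is an illustration, not the decompiled handler:

```python
# Codes taken from the table; everything else falls into the default case.
HANDLERS = {
    0x3EA: "update SharedPreferences (request/config.do)",
    0x3EB: "show a notification through GameAlertDialog (request/alert.do)",
    0x3EC: "turn the pulled software list into shortcuts (request/push.do)",
}

def describe(code):
    """Return what happens for a given handler code."""
    return HANDLERS.get(code, "default case: nothing is done")

for code in (0x1, 0x3EA, 0x3EB, 0x3EC):
    print(hex(code), code, describe(code))
```

In decimal the three interesting codes are 1002, 1003 and 1004, which is handy when grepping smali for constants.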

 It is GameService which deploys GameBootReceiver. GameBootReceiver is responsible for launching GameService upon boot. What is puzzling about this malware is that the onReceive implementation has code to deal with installed and removed packages (i.e. the intent actions PACKAGE_ADDED and PACKAGE_REMOVED). However, the manifest states that this receiver only handles BOOT_COMPLETED.

 More network activity may be generated by MyHall. MyHall is a TabActivity which creates multiple tabs within itself that when clicked launch one of the following activities:

  • HomeActivity: An activity and a Thread. The thread component is launched within the onCreate method. This thread beacons http://client.mustmobile[.]com/request/index.do and appears to be used to pull a list of applications to be displayed on the HomeActivity activity (check bv.a with code 0x3E8).
  • SearchActivity: The user can use this one to search for applications. The queries are posted to http://client.mustmobile[.]com/client.php?action=softlist&type=search&word=.
  • ManagerActivity: Displays packages installed on the phone and shows more to be installed. It also allows the packages to be launched and deleted directly from the activity. 

 GameInfo queries http://client.mustmobile[.]com/client.php?action=soft&soft_id= to get information about a given application to be displayed to the user. SortActivity2 uses the URL http://client.mustmobile[.]com/client.php?action=softlist to get a list of software.

 There is a URL http://apk.mustmobile[.]com/apk/20110705/19225910801.apk that appears on TestView.onClick:

[Image: Baidu]

 According to OSINT, bdmobile.android.app is Baidu. When a certain button (identified by 0x7f0b0061) on this TestView is clicked, bdmobile.android.app is downloaded and deployed. What puzzles me here is that the activity is not created by any class within the APK so, apparently, there is no way for this branch to be executed.

 So far we have overviewed network indicators and some host indicators (i.e. files created). We now know what details are beaconed to the server. GingerMaster has an interesting feature that we have not overviewed yet.

GingerBreak Exploit

 GingerMaster was the first malware to leverage the GingerBreak exploit to obtain root permissions on the host. GingerBreak affects Android 2.3 Gingerbread. For it to work, the device must have an SD card inserted and USB debugging must be enabled, which explains what i am about to go over. Going back to the GameService.onCreate routine we see:

Multiple png files are moved from the assets folder to /data/data/com.igamepower.appmaster/files/ with .sh extensions. Then, the following command is executed:

chmod 775 /data/data/com.igamepower.appmaster/files/gbfm.sh /data/data/com.igamepower.appmaster/files/install.sh /data/data/com.igamepower.appmaster/files/installsoft.sh /data/data/com.igamepower.appmaster/files/runme.sh
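For readers who do not have octal modes memorised, 775 can be expanded with Python's standard stat module. This is only an illustration of the mode itself, not code from the sample:

```python
import stat

mode = 0o775
# stat.filemode expects the file-type bits as well, so S_IFREG (regular file) is added.
symbolic = stat.filemode(stat.S_IFREG | mode)
print(symbolic)
```

In other words, the owner and the group can read, write and execute the four files, while everyone else can read and execute them.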

When looking at the assets folder inside the APK, we can see four PNG files:

File name (MD5): type and functionality

  • gbfm.png (fa355f01ec16bcc09fa0a2341f0ceb40): ELF. GingerBreak exploit.
  • install.png (725bee6d16deb8eb0f4e869fa412a71b): Bash script. Basically moves /system/bin/sh around.
  • installsoft.png (25bcbf1d0a3297c8b93e3999aa750974): Bash script. Installs the APK passed as argument.
  • runme.png (3674d33c271a0c3c8f06c6ff7276e2b8): ELF. /system/bin/sh (see below).
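Since the .png extension is misleading, the real type of each asset is better checked through its first bytes. A minimal Python sketch, assuming the scripts start with a "#!" line (which may not hold for every script):

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
ELF_MAGIC = b"\x7fELF"

def sniff(data):
    """Guess the real type of a file from its first bytes."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(ELF_MAGIC):
        return "elf"
    if data.startswith(b"#!"):
        return "script"
    return "unknown"

def sniff_file(path):
    """Read the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(8))

print(sniff(b"\x7fELF\x01\x01\x01\x00"))
```

Running sniff_file over the files extracted from the assets folder should agree with the types listed above.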

runme.png is a very simple ELF program (fancy some ARM?):

[Image: ARM disassembly of runme]

 As far as i understand, the malware uses GingerBreak to install APKs on the host without requiring a specific permission on the Manifest or PlayStore using low level tools such as sh, am and pm. If the malware updates itself, it also leverages the exploit to escalate privileges.

 Three classes play a vital role in the deployment of APKs:

GameService: More specifically the a method. This method is used to download and deploy the APKs using f thread. Below you can see the functions that call a (called DownloadsAndDeploysAPKs on the diagram):

[Image: functions that call DownloadsAndDeploysAPKs]

Downloads are therefore performed as a consequence of the events:

  • TestView.onClick: As previously referred, this seems to download Baidu. However, i see no traces of TestView being used on the program.
  • aa.handleMessage: Associated with the update of the application when http://client[.]mustmobile[.]com/request/update.do is beaconed. aa is launched from Myhall.onCreate.
  • ao.onClick: Likely associated with the request for an application made on the GameInfo activity, i.e., user checks application information and clicks button to download.
  • bg.onClick: Used on SortActivity2 which displays lists of applications. Likely similar to the previous event.
  • ca.handleMessage: Associated with the update of the application when http://client[.]mustmobile[.]com/request/update.do is beaconed. ca is launched from Web.onCreate.
  • cb.onClick: Associated with packages installed from ManagerActivity.

cj: Thread spawned on GameService.onCreate. Its purpose is to trick the user into enabling USB debugging (if not enabled) by showing notifications and launching Application Development Settings. It also checks whether there is an external SD card (remember this?). Once the debugging is enabled, the exploit is launched using the routine e from cj. gbfm.sh is executed. Once the exploit is finished, install.sh is also executed. Other functions within cj, such as c and are worth mentioning because they are related to the execution of the scripts.

f: This thread is responsible for performing the download of the APK, the writing and the installation using installsoft.sh. Once the APK is downloaded, the thread checks for the ROOT status. ROOT_STATE_GINGER and ROOT_STATE_PERFECT are processed similarly as can be seen below:

ROOT_STATE_SU is processed with:

 The code, and therefore the purpose, is similar (deploy the package using /data/data/com.igamepower.appmaster/files/installsoft.sh). What changes is the interpreter used:

  • ROOT_STATE_PERFECT: /system/xbin/appmaster/sh
  • ROOT_STATE_GINGER: /data/data/com.igamepower.appmaster/sh
  • ROOT_STATE_SU: /system/bin/su -c

 ROOT_STATE_SU indicates that the application is already running as root and, therefore, it only needs to use the standard /system/bin/su. ROOT_STATE_PERFECT is set when the system is rooted and the file /system/xbin/appmaster/sh is readable (install.sh executed completely). If install.sh fails to create /system/xbin/appmaster/sh, ROOT_STATE_GINGER is the root state. The reason for the malware copying /system/bin/sh to /data/data/com.igamepower.appmaster/files/ and /system/xbin/appmaster is not clear to me.
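The relation between root state and interpreter can be summarised as a lookup table. A small Python sketch for note-taking purposes; the paths come from the list above, while the way the arguments are laid out after the interpreter is an assumption of mine (i did not reconstruct the exact command line):

```python
INSTALL_SCRIPT = "/data/data/com.igamepower.appmaster/files/installsoft.sh"

# Interpreter prefix per root state, as listed above.
INTERPRETERS = {
    "ROOT_STATE_PERFECT": ["/system/xbin/appmaster/sh"],
    "ROOT_STATE_GINGER": ["/data/data/com.igamepower.appmaster/sh"],
    "ROOT_STATE_SU": ["/system/bin/su", "-c"],
}

def install_command(root_state, apk_path):
    """Return interpreter + script + APK path (argument layout is assumed)."""
    return INTERPRETERS[root_state] + [INSTALL_SCRIPT, apk_path]

# Hypothetical APK path, just to print one line per state.
for state in INTERPRETERS:
    print(state, install_command(state, "/sdcard/example.apk"))
```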

Final Notes

 Since the only issue you have to deal with is obfuscated classes, you can analyse this malware without stepping through the code. Also, since the domains used by the malware are already down, you would not be able to see anything meaningful. The malware pulls lots of configurations and without them you are left with a malfunctioning state machine. Analysis of logs using Logcat can still be useful to understand what activities and services are launched for this specific malware since it uses the Logging capabilities a lot.

 Even though this was my first attempt at reversing Android malware, i have felt that reading SMALI is much simpler than reading x86 which is why i have relied solely on static analysis and a bit of dynamic analysis by using an emulator. As far as i know, it is not possible to change the instruction pointer and execute selective chunks of code as you can do on x86. We can however take small chunks of SMALI, compile them and execute them.  

Stay safe 😉

63 Problems But Malware Ain’t One: 8ca23d7bdf520c3e7ac538c1ceb7b555

Hello paranoids

Recovered from my previous post? No? Great! My overall objectives for the previous post were to:

  • Show you how to unpack a malware
  • Unpacking constructions (e.g. anti-debugging, shellcode, dynamic resolution of dependencies)

On this post, i intend to:

  • Go over some network/host tracks left by the malware
  • Malware supported commands and features

The IDA database resulting from this analysis will be added to my GitHub repository here so you can check out the comments i left there. You may find that some functions changed name when compared to the pictures i provide. However, if you understand what i am saying here, the comments and function names on the database should be clear. Every URL on this article will be defanged using [] to surround one of the dots.
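For anyone who wants to defang indicators in their own notes, here is a minimal Python sketch. It assumes the convention is to wrap the last dot of the host name, and the helper names are mine; example.com is used as a harmless placeholder:

```python
def defang(url):
    """Wrap the last dot of the host in [] so the URL is no longer clickable."""
    scheme, sep, rest = url.partition("://")
    if not sep:                      # no scheme present, treat everything as host/path
        scheme, rest = "", url
    host, slash, path = rest.partition("/")
    head, dot, tail = host.rpartition(".")
    if dot:
        host = f"{head}[.]{tail}"
    return f"{scheme}{sep}{host}{slash}{path}"

def refang(url):
    """Undo defang() before feeding the indicator to tooling."""
    return url.replace("[.]", ".")

print(defang("http://www.example.com/index.php"))
```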

Characteristics

MD5: 8ca23d7bdf520c3e7ac538c1ceb7b555
Family name: DoFoil a.k.a Smoke Loader (unpacked shellcode)
Packing algorithm: custom

Analysis Environment

Tools: OllyDbg, IDA Pro
Environment: VMware with 64-bit Windows 7

Functions

If you read the previous post, you should remember how i got the functions used by this piece of malware. The malware does not load new DLLs or call standard functions in any shady way. As such, this list is accurate:

atoi
CharLowerA
CloseHandle
CoCreateInstance
CoInitialize
CoUninitialize
ConvertStringSecurityDescriptorToSecurityDescriptorA
CopyFileA
CopyFileW
CreateDirectoryW
CreateEventA
CreateFileA
CreateFileMappingA
CreateFileW
CreateMutexA
CreateProcessInternalA
CreateProcessInternalW
CreateRemoteThread
CreateThread
CreateToolhelp32Snapshot
CryptAcquireContext
CryptCreateHash
CryptDestroyHash
CryptGetHashParam
CryptHashData
CryptReleaseContext
DeleteFileA
DeleteFileW
ExitProcess
ExpandEnvironmentStringsW
FreeLibrary
GetComputerNameA
GetCurrentProcessId
GetCurrentThreadId
GetFileAttributesExA
GetFileSize
GetForegroundWindow
GetModuleFileNameA
GetModuleFileNameW
GetModuleHandleA
GetProcAddress
GetProcessHeap
GetSystemDirectory
GetTempFileNameA
GetTempFileNameW
GetTempPathA
GetTempPathW
GetTokenInformation
GetVersion
GetVolumeInformationA
LdrGetDllHandle
LdrProcessRelocationBlock
LoadLibrary
LoadLibraryW
lstrcmpW
lstrcatA
lstrcatW
lstrcmpA
lstrlenA
lstrlenW
MapViewOfFile
MultiByteToWideChar
ObtainUserAgentString
OpenFileMappingA
OpenProcess
OpenProcessToken
OpenThread
Process32First
Process32Next
ReadFile
ReadProcessMemory
RegCloseKey
RegCreateKeyA
RegEnumKeyA
RegEnumValueW
RegNotifyChangeKeyValue
RegOpenKeyA
RegOpenKeyExA
RegQueryValueExA
RegSetValueExW
ResumeThread
RtlAllocateHeap
RtlReAllocateHeap
RtlAddVectoredExceptionHandler
RtlComputeCrc32
RtlFreeHeap
RtlGetLastWin32Error
RtlGetVersion
RtlMoveMemory
RtlRemoveVectoredExceptionHandler
RtlZeroMemory
SHGetFolderPathW
SetFileAttributes
SetFileAttributesA
SetFileTime
SetKernelObjectSecurity
ShellExecuteW
Sleep
SuspendThread
Thread32First
Thread32Next
VirtualAlloc
VirtualFree
VirtualProtect
VirtualQuery
VirtualQueryEx
WaitForSingleObjectEx
WinHTTPCloseHandle
WinHTTPConnect
WinHTTPCrackUrl
WinHTTPGetProxyForURL
WinHTTPOpen
WinHTTPOpenRequest
WinHTTPReadData
WinHTTPReceiveResponse
WinHTTPSendRequest
WinHTTPSetOption
WinHttpGetIEProxyConfigForCurrentUser
WriteFile
WriteProcessMemory
wsprintfW
wsprintfA
ZwCreateSection
ZwMapViewOfSection
ZwQueryInformationProcess
ZwQueueApcThread
ZwUnmapViewOfSection

According to this list, we can infer the following capabilities:

  • Networking: WinHTTPConnect, WinHTTPReadData
  • File system manipulation: DeleteFileA, WriteFile
  • Registry manipulation: RegCreateKeyA, RegOpenKeyA
  • File mappings : CreateFileMappingA, MapViewOfFile
  • Processes/Thread enumeration: CreateToolhelp32Snapshot,  Process32First, Thread32First
  • Hashing and integrity computations: CryptCreateHash, RtlComputeCrc32
  • COM: CoInitialize, CoUninitialize
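This kind of inference is easy to automate for triage. A minimal Python sketch that uses exactly the marker functions listed above; the table is illustrative and far from complete:

```python
CAPABILITIES = {
    "Networking": {"WinHTTPConnect", "WinHTTPReadData"},
    "File system manipulation": {"DeleteFileA", "WriteFile"},
    "Registry manipulation": {"RegCreateKeyA", "RegOpenKeyA"},
    "File mappings": {"CreateFileMappingA", "MapViewOfFile"},
    "Process/thread enumeration": {"CreateToolhelp32Snapshot", "Process32First", "Thread32First"},
    "Hashing and integrity": {"CryptCreateHash", "RtlComputeCrc32"},
    "COM": {"CoInitialize", "CoUninitialize"},
}

def infer_capabilities(resolved_functions):
    """Return the capability labels whose marker functions appear in the list."""
    found = set(resolved_functions)
    return sorted(label for label, markers in CAPABILITIES.items() if markers & found)

print(infer_capabilities(["WriteFile", "CoInitialize", "Sleep"]))
```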

Some of the functions listed above were not used by the sample i had, so an analysis based on hardcoded addresses for standard functions is not enough.

String Decoding

URLs, HTTP parameters, registry keys, file names, folder names and so on are kept encoded internally. The strings are decoded by three functions:

URLs (decoded by “url_encoder_decoder”):

Beacon format strings, registry related data (decoded by “string_encoder_decoder_1”):

  • “%d#%s#%s#%d.%d#%d#%d#%d#%d#%d”
  • “%d#%s#%s#%d.%d#%d#%d#%d#%d#%s”
  • “http://www.microsoft[.]com/”
  • “Software\Microsoft\Internet Explorer”
  • “Software”
  • “Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run”
  • “Software\Microsoft\Windows\CurrentVersion\Run”
  • “Microsoft One Drive”
  • “Software\Microsoft\Windows\CurrentVersion\Uninstall”
  • “sample”
  • “System\CurrentControlSet\Services\Disk\Enum”
  • “advapi32.dll”
  • “Location:”
  • 2015
  • “plugin_size”
  • “explorer.exe”
  • “%s%08X%08X”
  • “%08X”
  • “Work”
  • “user32”
  • “shell32”
  • “advapi32”
  • “urlmon”
  • “ole32”
  • “winhttp”
  • “HelpLink”
  • “URLInfoAbout”
  • “sbiedll”
  • “dbghelp”
  • “qemu”
  • “virtual”
  • “vmware”
  • “xen”
  • “ffffcce24”
  • “svcVersion”
  • “Version”
  • “Version”

Path format string, paths, shell commands and HTTP parameters (decoded by “string_encoder_decoder_2”):

  • “%s\%s”
  • “%s%s”
  • “regsvr32 /s %s”
  • “%s\%s.lnk”
  • “%APPDATA%\Microsoft”
  • “%TEMP%”
  • “%ComSpec%”
  • “.exe”
  • “.dll”
  • “/c start %s && exit”
  • “:Zone.Identifier”
  • “GET”
  • “POST”
  • “Content-Type: application/x-www-form-urlencoded”
  • “runas”

Both “string_encoder_decoder_1” and “string_encoder_decoder_2” call “encodes_decodes_string_using_rc4” with different four-byte keys. Needless to say, the last two sets of strings are internally encoded using RC4.
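As a refresher, RC4 is symmetric, so one routine is enough for both encoding and decoding. Below is a textbook RC4 in Python; the four-byte key is a placeholder of mine, not one of the keys used by the sample:

```python
def rc4(key, data):
    """Textbook RC4: the same call encodes and decodes."""
    # Key-scheduling algorithm (KSA)
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

placeholder_key = b"\x01\x02\x03\x04"   # NOT the sample's key
plain = b"Software\\Microsoft\\Windows\\CurrentVersion\\Run"
encoded = rc4(placeholder_key, plain)
print(rc4(placeholder_key, encoded) == plain)
```

With the real keys pulled from the IDA database, the same routine can be used to decode the strings offline.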

Preamble and Process Hollowing

The malware starts by resolving the dependencies i have referred previously on this article and then proceeds to check the Windows version. If the operating system is Windows Vista or above, the malware leverages the Windows Integrity Mechanism. You can find lots of resources online explaining this mechanism. The Windows Integrity Mechanism is similar to SELinux Mandatory Access Control, where an object of lower integrity cannot interfere with an object of higher integrity. The notion of separation between non-privileged users and privileged users (e.g. administrators) has existed in Windows versions previous to Vista (e.g. a user process cannot read/write files belonging to an administrator). However, starting with Windows Vista, even when you have an account with administrative privileges, you get a prompt (our beloved UAC) every time you attempt to execute a binary that requires elevated privileges. This is the mechanism in action, which assigns a default medium integrity level to the applications launched by authenticated users.

The malware checks the level of integrity it runs at and then creates an empty DACL with the SE_DACL_PROTECTED attribute enabled. This prevents any inheritance of security descriptor information from the parent process.

After the Sleep loop we have an if that checks whether the second argument for the main function is zero. If you remember from the previous post this argument is an address (250000h in my case):

[Image: main function]

The reason for this check is only understood once you look at the left branch of the function. On the left branch we have:

Anti-analysis:

The method “checks_file_name_volume_loaded_modules_and_registry” (see picture below):

  • Checks whether the file name contains the word “sample”
  • Checks whether the volume serial is 0CD1A40h or 70144646h (volume serials for sandboxes)
  • Checks if sbiedll.dll (sandboxie) or dbghelp.dll (Windows DbgHelp library) have been loaded in memory
  • Queries “System\CurrentControlSet\Services\Disk\Enum\0” for the name of the primary volume and checks if the name contains: “qemu”, “virtual”, “vmware” or “xen”

If any of the above conditions is met, the malware enters an everlasting sleep.
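The four checks can be condensed into a single predicate, which is handy when deciding whether an analysis VM would trip them. A Python sketch under some assumptions: comparisons are case-insensitive and module names are compared without the .dll extension (neither detail was verified against the sample), and the function name is mine:

```python
import os

SANDBOX_SERIALS = {0x0CD1A40, 0x70144646}
SUSPICIOUS_MODULES = {"sbiedll", "dbghelp"}
VM_MARKERS = ("qemu", "virtual", "vmware", "xen")

def looks_like_analysis_box(file_name, volume_serial, loaded_modules, disk_name):
    """Mirror the four checks described above; True means everlasting sleep."""
    modules = {os.path.splitext(m.lower())[0] for m in loaded_modules}
    return ("sample" in file_name.lower()
            or volume_serial in SANDBOX_SERIALS
            or bool(modules & SUSPICIOUS_MODULES)
            or any(marker in disk_name.lower() for marker in VM_MARKERS))

# Hypothetical values for a default VMware guest.
print(looks_like_analysis_box("invoice.exe", 0x1234ABCD,
                              ["kernel32.dll"], "VMware Virtual disk"))
```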

Privilege elevation:

The malware then tries to elevate privileges (shell_execute) using a trick that involves ShellExecuteEx with a verb “runas”. The user will be prompted with the typical UAC box to authorise the elevation. This will only occur if the integrity of the process is below medium.

On “spawns_new_process_and_replaces_it_with_this”, the malware spawns explorer.exe and the malicious code from this malware is loaded into it. This technique is typically called “Process Hollowing”: a non-malicious process is spawned in a suspended state and its content is overwritten with malicious code that is then executed when the process is resumed. The current process then exits. I am sure you will never look at your explorer.exe process the same way.

The last point should make the right branch clearer. The right branch is executed by the newly spawned process. The picture below depicts the last chunk of the main function.

Analysing the right branch

Due to the complexity of this malware, i will focus on points that i consider essential and describe the overall operation of the binary. Once the process hollowing is finished, the malicious code (now executing inside explorer.exe) runs the code on the right side of the branch. The malware beacons http://www.microsoft[.]com/ to check for network connectivity and, if the number of bytes read is less than ten or the connection and subsequent request fail, the malware sleeps and retries later. Once connectivity is confirmed, the malware creates a mutex whose name has the following structure (padded_volume_serial is the volume serial padded with zeroes to eight digits):

[MD5([computer_name]45386319[padded_volume_serial])][padded_volume_serial]
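To predict the mutex name for a given host (e.g. to write a detection rule), the structure can be reproduced as follows. This Python sketch follows the structure literally and makes several assumptions that i did not verify: the serial is rendered as uppercase hex (suggested by the “%08X” format strings), 45386319 is concatenated as text, the input is ASCII and the MD5 digest is rendered as uppercase hex:

```python
import hashlib

def mutex_name(computer_name, volume_serial):
    """[MD5(computer_name + 45386319 + padded_serial)][padded_serial], under the stated assumptions."""
    padded_serial = f"{volume_serial:08X}"
    seed = f"{computer_name}45386319{padded_serial}"
    digest = hashlib.md5(seed.encode("ascii")).hexdigest().upper()
    return digest + padded_serial

# Hypothetical host values.
print(mutex_name("WIN7-PC", 0xBEEF))
```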

If the malware cannot create the mutex because it already exists (host already infected), the malware posts (before exiting):

2015#[mutex_name]#22222#[windows_major_version].[windows_minor_version]#[service_pack_major_version]#[integrity_level]#10001#13#0

to the remote URL:

http://hsbc-auth-2[.]ru/smk/index.php

integrity_level is 1 if integrity is below medium or 0 otherwise.
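The beacon above maps directly onto the first decoded format string (“%d#%s#%s#%d.%d#%d#%d#%d#%d#%d”). A small Python sketch that rebuilds it; the function and parameter names are mine, and the meaning of the constants (2015, 22222, 10001, 13) is not asserted here:

```python
BEACON_FORMAT = "%d#%s#%s#%d.%d#%d#%d#%d#%d#%d"

def already_infected_beacon(mutex, major, minor, sp_major, below_medium):
    """Rebuild the beacon posted when the mutex already exists."""
    integrity_level = 1 if below_medium else 0
    return BEACON_FORMAT % (2015, mutex, "22222", major, minor,
                            sp_major, integrity_level, 10001, 13, 0)

# Hypothetical values: Windows 7 SP1 (6.1), medium integrity or above.
print(already_infected_beacon("MUTEXNAME", 6, 1, 1, below_medium=False))
```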

The malware then exits. If, on the other hand it proceeds (mutex is successfully created), it tries to achieve persistence.

Persistence Mechanism

It attempts to resolve the string “%APPDATA%\Microsoft” and then, if ExpandEnvironmentStringsW failed for the former, “%TEMP%”. The chosen folder will host a copy of the binary with a name resembling the following structure:

[a-z]{8}.exe

The [a-z]{8} part is generated from the first 8 bytes of the mutex name by another encoding function, which maps the bytes to a limited set of characters (a-z). The malware then checks the following keys:

  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run
  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run

In order to understand this step, i am going to provide an example. Let us say the host has a software X installed which must be executed on startup. This can be done by putting the main software binary on Windows startup folder or by creating a subkey under one or both of the previous registry keys. As an example:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run\X

This subkey would contain as one of the values the path for the binary belonging to X software (e.g. X.exe). Malware typically leverages the first and/or second keys to achieve persistence across host boots. This malware attempts to create a subkey under the first key. If this attempt was not successful, the malware attempts to create a subkey under the second key.

In order to be stealthier, it reuses the name of a subkey belonging to an already persistent application (taken from any of the Run keys). If the malware fails to get any name from the Run keys, it picks an application name from:

“HKEY_CURRENT_USER\Software\”

This would still be stealthy since this key contains the names of software installed on the machine. If the malware fails to get a name from this last key, it uses the name “Microsoft One Drive” (who is going to suspect Microsoft?).

If the malware fails to set any of the Run keys, it creates a .lnk file on the Windows startup folder for the binary that, as previously referred, will be either on “%APPDATA%\Microsoft” or “%TEMP%”. The malware then copies itself to one of these directories and spawns a thread that keeps the malware persistent (i have seen this behaviour before when i analysed an Andromeda downloader). The malware then spawns a thread to beacon the server and deal with commands from the latter. As a curiosity, the timestamps (MACE) of the file copy are changed to the ones of “advapi32.dll”. This kind of timestamp tampering is called timestomping and it is used to confuse analysts who do not expect a recently dropped file to carry MACE timestamps as old as the ones belonging to standard files such as Windows DLLs.
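From the defender's side, this particular trick leaves a cheap indicator: a file outside the system directory whose timestamps match those of advapi32.dll. A minimal Python sketch of that comparison; it only looks at modification and access times (creation and MFT entry times need other tooling) and the helper name is mine:

```python
import os

def same_timestamps(path, reference, tolerance=0.0):
    """True if path carries the same modification and access times as reference."""
    a, b = os.stat(path), os.stat(reference)
    return (abs(a.st_mtime - b.st_mtime) <= tolerance
            and abs(a.st_atime - b.st_atime) <= tolerance)
```

On an infected host, the reference would be advapi32.dll in the system directory and the candidates would be the executables found under “%APPDATA%\Microsoft” and “%TEMP%”.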

Client-Server Interactions

The function responsible for this part is depicted below:

I am going to start with “beacons_server_and_processes_requests” which is the core function used to interact with the remote server. The function “load_dll_in_memory_and_call_export” will be described later.

beacons_server_and_processes_requests

If you remember what i referred previously, the malware beacons one URL before exiting if there is an ongoing infection. Once the malware is executing properly, it cycles across URLs using the URL generator below (“generates_beacon_url” on IDA database):

[Image: domain_generator]

The malware stores the URLs encoded. The data stored at 295020h is used as an index to choose the encoded URL. “encoder_decoder” is the function used to decode URLs (called “url_encoder_decoder” on the IDA database). A search for the address 295020h on IDA gives us:

[Image: domain_generator_seed]

The address is firstly accessed on “executed_by_spawned_process” where it is initialised. Besides “generates_beacon_domain”, only “beacons_server_and_processes_command” accesses the address (at least using the address directly). The variable stored on this address is changed and (consequently) the URLs are cycled by this last function under certain conditions (more about this later).

Then we have the function “opens_file_and_decodes_it”:

This function attempts to open the file “[a-z]{8}[a-z]{8}” which is present on either “%TEMP%\” or “%APPDATA%\Microsoft\” according to where the binary was copied to. As you can see, the binary reads 15 bytes starting at a non-zero offset inside the file. This function returns (on eax) either zero (no file) or a pointer to a hex array containing the 15 bytes read from it. These 15 bytes will be part of the beacon sent to the remote server to get a command. I assume this is a means to identify the victim. Another option would be that this array is a means to authenticate the malware. However, this would represent a weak authentication since on the first interaction the malware would have no means to authenticate itself (the file does not exist).

“beacons_server_and_processes_command” is then called. This function is the biggest on the whole malware and, as such, i will not post pictures of it. I will go through the parts that i consider important. As said on the previous paragraph, the malware requests a command by beaconing the server. The structure of the beacon is the following:

2015#[mutex_name]#22222#[windows_major_version].[windows_minor_version]#[service_pack_major_version]#[integrity_level]#10001#0#[15_bytes_hex_array]
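As a rough illustration of the layout above, the beacon could be assembled like this in Python. The field order and the constants “2015”, “22222”, “10001” and “0” come from the structure shown; the function and parameter names are made up, and using an empty string when the 15-byte file is missing is only an assumption.

```python
def build_beacon(mutex_name, win_major, win_minor, sp_major,
                 integrity_level, id_hex=""):
    """Join the beacon fields with '#', following the observed layout.

    id_hex stands for the 15 bytes read from the plugin file, hex encoded.
    What the sample sends when the file is absent was not verified, so the
    empty default is an assumption.
    """
    fields = [
        "2015",
        mutex_name,
        "22222",
        "{}.{}".format(win_major, win_minor),
        str(sp_major),
        str(integrity_level),
        "10001",
        "0",
        id_hex,
    ]
    return "#".join(fields)


def parse_beacon(beacon):
    """Split a beacon back into its nine fields (handy for network signatures)."""
    return beacon.split("#")
```

Splitting captured traffic with `parse_beacon` makes it easy to spot the fixed fields, which are good candidates for network detection rules.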

Once the command is received, we have the typical if-else construction to process the command. The server response may contain a dll which is stored on either “%TEMP%\” or “%APPDATA%\Microsoft\” with the name “[a-z]{8}[a-z]{8}” (sounds familiar?). In Dofoil terminology this is called a plugin which will be later loaded when the function “load_dll_in_memory_and_call_export” is called. This function simply maps the dll in memory, decodes it and calls one of its exported functions (called “Work”). The MACE timestamps for the new file are also tampered to be the same as the ones belonging to “advapi32.dll”.

As for the supported features, besides the typical ExitProcess (server orders the malware to terminate process) and the deletion of the malicious binary (used together in this case), as well as the plugin functionality, the malware supports four features which are related to the file type embedded on the server response (yes, there is another binary).  If the server embeds an executable, the malware will (depending on another response field) take one of the following actions:

  • Write the executable to disk and execute it using CreateProcessInternal
  • Map the binary to an allocated region of memory and call its entrypoint

If on the other hand, the embedded file is a dll, the malware will perform one of the following actions:

  • Write the dll to disk and load it using LoadLibrary
  • Write the dll to disk and register it as a global dll using the command “regsvr32 /s”
  • Copy the dll into an allocated region of memory and call its entrypoint

Files that are written to disk are written using the following procedure:

  1. Use GetTempPathW and GetTempFileNameW to create a temporary file.
  2. Delete the created file
  3. Use the fullpath of the temporary folder, append “.exe” or “.dll” according to the embedded file and create a new file.
  4. Write the binary (.dll or .exe) to the file.
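The four steps above can be sketched in Python as follows. `tempfile.mkstemp` stands in for the GetTempPathW/GetTempFileNameW pair, and appending the extension to the temporary file name is an assumption about step 3; the function name `drop_payload` is hypothetical.

```python
import os
import tempfile


def drop_payload(payload, is_dll, directory=None):
    """Mimic the drop procedure: temp file, delete it, re-create with extension."""
    # Step 1: let the OS pick a unique temporary file name.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    # Step 2: delete the placeholder, keeping only its unique name.
    os.remove(tmp_path)
    # Step 3: append the extension that matches the embedded file
    # (assumption: the extension is added to the temp file name).
    final_path = tmp_path + (".dll" if is_dll else ".exe")
    # Step 4: write the binary.
    with open(final_path, "wb") as handle:
        handle.write(payload)
    return final_path
```

From a defender's point of view, the short-lived extensionless temp file followed by a same-named “.exe” or “.dll” is a pattern worth looking for in file system monitoring logs.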

I have also noticed that the server may send some flags to the binary to be used after the process launching feature is used. One of the flags tells the malware to exit while the other tells the malware to delete the file written to disk (the downloaded binary). I assume these mechanisms are used to update the malware using an approach similar to the following:

  1. New version is downloaded to Temp folder
  2. New version is spawned by the current active malware.
  3. Active malware either exits or just deletes file on temporary folder.

Between the two options, my guess is that it is a matter of leaving traces or not. When the system boots, the old malware will not be spawned again and the new sample is executed instead.

URL Cycling

Looking at the disassembled function “beacons_server_and_processes_command”, the address 295020h (dictates the beaconed URL) is incremented when:

  • The creation of the first beacon (used to get the command) fails on wsprintf (number of bytes written is zero).
  • The first character of the server response is ‘<’.
  • The first dword of the server’s response has a value greater than the size of the data sent on the beacon (the formatted string shown earlier). Mind that when i refer to the “size of the data sent on the beacon” i mean the number of bytes output by wsprintf, which is used to create the beacon. I assume this first dword tells the malware the number of bytes received by the server. If the count reported by the server does not fit what was sent (here, more bytes than the beacon contained), this represents erratic behaviour. The URL is changed and the routine returns.
  • The CRC check on the downloaded data fails. I did not mention this CRC feature before, but the malware seems to use the current URL string as a means to check the integrity of the response using RtlComputeCrc32.
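The conditions listed above can be summarised in a small Python sketch. It assumes the first dword of the response is little-endian (the sample is x86) and takes the result of the CRC check as a boolean, since the exact way the URL string feeds RtlComputeCrc32 was not worked out here. Wrapping the index around the number of URLs is also an assumption; both function names are made up.

```python
import struct


def should_cycle_url(beacon_len, response, crc_ok):
    """Return True when the malware would move on to the next URL."""
    if beacon_len == 0:              # wsprintf wrote nothing
        return True
    if response[:1] == b"<":         # looks like an HTML page, not a command
        return True
    if len(response) >= 4:
        (reported,) = struct.unpack_from("<I", response)
        if reported > beacon_len:    # server reports more than was sent
            return True
    return not crc_ok                # integrity check on the response failed


def next_url_index(index, url_count):
    """Advance the value kept at 295020h (wrap-around is assumed)."""
    return (index + 1) % url_count
```

The ‘<’ check is probably there to catch sinkholes, parking pages and error pages, which tend to answer with HTML instead of the expected binary structure.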

Final thoughts

Dofoil/Smoke Loader is a nasty piece of malware. While the host related activity is relatively simple to analyse, the structure of the requests and responses is cumbersome. This makes sense since Dofoil is not a complete malware but simply a modular vessel capable of deploying new malware, injecting into processes and mapping malicious binaries into memory (binaries sent by the server). The nasty functionality comes from downloaded binaries and/or plugins.

I have relied heavily on static analysis and selective execution of sections of the malware code. My objective was to extract strings and determine the overall behaviour of the malware (e.g. modified registry keys, created files, beacon structures). The malware had lots of runtime checks and complex constructions. As such, selective execution of code is preferable to waiting for breakpoints to be hit. Signing out!

Stay safe 😉

Unpacker for Hire: 8ca23d7bdf520c3e7ac538c1ceb7b555

Hello paranoids

As i referred on my previous post, i have started a Reverse Engineering/Malware Analysis journey. One of the topics i find most interesting about malware is packing. A packer is an algorithm that manipulates a simple binary and adds one or more of the following features (this is not an exhaustive list):

  • cryptors and/or compressors: e.g. to hide internal strings/shellcode
  • anti-debugging: e.g. to detect if the malware is being debugged
  • anti-vm: e.g. to detect if the malware is running inside a Virtual Machine
  • anti-dumping: e.g. to prevent analysts from dumping a malware by destroying binary headers/code.

Packed malware has, typically, a small number of imports (such as LoadLibrary and GetProcAddress) which are used to resolve dependencies at runtime. The process by which a packed malware resolves its dependencies and restores its own malicious payload is called unpacking.

Enough 101. I am starting a series of posts called “Unpacker for Hire: [Malware MD5]” on which i will go through the process of unpacking some malware samples i find in the wild. This will be the first post of that series. Depending on feedback, the posts may lose or gain more detail. Unpacking does not require a massive post if i do it blindly but, if i want to understand what is going on, that requires deeper analysis and more time.

For the more experienced: you will not see me using fancy tools/plugins for the first set of posts since, for beginners, it is best to use as little automation as possible in order to understand the unpacking patterns and tricks.

I have recently discovered that the names you give to the pictures you upload appear when you hover your mouse. Some of the names i provided on my laptop are a bit off so please ignore those and stick to the picture descriptions. I also can’t seem to enforce different sizes for the headers of each section. Any word of advice for this poor noob is more than welcome.

I will describe the analysis process of the unpacked malware in a future post and i may come back and edit this post to reflect my findings. I am going to split this analysis in four stages since the unpacking process is done in that manner.

Characteristics

MD5: 8ca23d7bdf520c3e7ac538c1ceb7b555
Family name: DoFoil a.k.a Smoke Loader
Packing algorithm: custom

Analysis Environment

Tools: OllyDbg + OllyDump, IDA Pro
Environment: VMware with 64-bit Windows 7

First stage

I always start by opening the malware with IDA and check the imports:

IDA view: malware imports


Does not look packed but it also does not look too harmful either. Some Certificate manipulation, file enumeration, registry manipulation. Let us look at the entrypoint:

IDA view: entrypoint for malware


The calls to LocalFileTimeToFileTime and lstrcmp are omnipresent across the packed sample and appear to have no real usage (garbage code). IDA did not decode the chunks of bytes you see. loc_405B25 is called if the lower word of the stack pointer is lower than or equal to 0xFE00. At the time of writing i am still not sure what the reason for this verification is. I would guess the lower word of the stack address varies across Windows versions, which would allow for a simple verification of the OS version. If you know what this verification is for, please feel free to drop a comment below. If i find out in the future, i will let you know. loc_405B25 is depicted below:

Call to loc_405B25


This pattern is repeated over and over again (like a Russian doll) with slight changes on some instructions. This seemed like a dead end. However, if the condition is not verified, sub_405065 is called. This happened on the test environment and on Windows XP. I did not use any plugins or special configurations to avoid anti-analysis techniques so i assume this first verification is not an anti-analysis check.
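For reference, the stack pointer check described above boils down to the comparison below. This is a sketch that assumes an unsigned comparison on the low 16 bits of esp; the function name is made up and the example value in the note is just a typical 32-bit entry-point stack pointer, not one recorded from this sample.

```python
def takes_garbage_branch(esp):
    """True when the low word of esp is <= 0xFE00 (loc_405B25 is called)."""
    return (esp & 0xFFFF) <= 0xFE00
```

With a stack pointer such as 0x0012FFC4 the low word is 0xFFC4, the condition is false and execution continues on sub_405065, which matches what was observed on the test environment.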

The function sub_405065 contains code used to load libraries. The dlls and functions to be imported and resolved are slightly “obfuscated”. For instance, an ‘a’ was prepended to the hardcoded string sycfilt.dll before calling LoadLibrary to load asycfilt.dll. Failing to load a given dll would lead to a jump to a tiny routine that would set ebp to the return value of LoadLibrary (zero since it failed) and then return, which would crash the process later (the stack cannot be based on address 0x0). I have called this tiny function failover but i later realized that this is not a failover mechanism (it does not recover).

asycfilt.dll, according to the table of exports, seems like a pretty useless dll to load and the next loaded dll (attempted load), gsycfilt.dll, is even stranger since it does not appear to exist. For the latter, the failover function is called if the dll is successfully loaded. My guess is that the existence of the dll is a sign of infection and tells the malware that the computer is already infected (just guessing). asycfilt.dll does not appear to be used later. My guess once more is that this dll is present only on some versions of Windows, being a means to test the version of the OS.

Next, the malware resolves ReadProcessMemory (after loading kernel32.dll) by deobfuscating wwwProcessMemory with some easy but geeky arithmetic. The malware then proceeds to get the relocation table address and size for asycfilt.dll and checks the size against an interval of possible sizes (first figure below). The address of VirtualAlloc is then resolved. At the top of the middle figure you can see a fancy call to VirtualAlloc with some arithmetic to hide its arguments. Obfuscation aside, the call is

VirtualAlloc(NULL, 1672, MEM_COMMIT, PAGE_EXECUTE_READWRITE)

The last parameter makes the conclusion obvious: the malware is about to unpack some shellcode and write it to that section. The loop you see below the VirtualAlloc call is exactly that. The malware is looping across dwords, starting on 0x408E44 and applying a bunch of operations. I will not delve into this since the operations are straightforward and add no meaning to this post. At last (last picture), the LoadLibrary address is pushed onto the stack and the execution jumps to the beginning of the decoded shellcode.

At this point, i strongly advise you to take a snapshot of the VM because, if there is an anti-VM/debugging technique in the next code, you will have to redo everything. I like to look at IDA while i am debugging so i need to dump the contents of the allocated section. Once you are at the beginning of the allocated section, go to the memory view and double click the beginning of the section containing the code. Select the bytes -> right-click -> Backup -> Save data to file. Open the .mem file with IDA in Binary mode.

OllyDBG view: dumping malicious section


Second stage

The shellcode uses LoadLibrary to get a handle for kernel32.dll. The name of the library is hardcoded but IDA interprets it as code at first. Below you can see part of the shellcode. On the left is the interpretation of IDA, on the right is “my interpretation”.

The same behaviour is observed a few lines below:

The usage for these chunks becomes clear when they are looped through and passed on to the function below:


IDA view: code for resolution of shellcode dependencies

I have commented everything so you can understand what happens. In order to reach these conclusions, i advise you to open kernel32.dll on IDA Pro and rebase it to be on the same address as in the debugger. From then on you must leverage IDA (with kernel32.dll + the code above open) as well as OllyDBG to reach my conclusions. Simply put, the malware loops through the names exported by kernel32.dll, computes a checksum for each name (green rectangle depicts the algorithm) and compares it with a hardcoded checksum. Once it finds a match, it gets the ordinal for that function and uses it to index the exported functions table to get the address of that function. The mov before popa puts esi on eax (previously saved using pusha). Then, the location containing the hardcoded checksum is overwritten with the function address (stored on eax). The malware resolves the addresses for the following functions:

  • LoadLibraryA
  • GetModuleHandleA
  • GetProcAddress
  • VirtualProtect
  • VirtualAlloc
  • VirtualFree
  • CloseHandle
  • CreateToolhelp32Snapshot
  • GetModuleFileName
  • CreateFileA
  • SetFilePointer
  • ReadFile
  • GetCurrentProcessId
  • Module32First
  • Module32Next
  • GetProcessHeap
  • WaitForSingleObject
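The resolution by checksum described above can be modelled in a few lines of Python. Keep in mind that the checksum used here (rotate right by 13, then add each byte) is only a stand-in commonly seen in shellcode: the sample's real algorithm is the one in the green rectangle of the picture and was not transcribed. The three lists mimic the name, ordinal and function tables of a PE export directory, and all names in the sketch are made up.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_checksum(name):
    """Hypothetical checksum over an export name (ROR13 + add)."""
    checksum = 0
    for byte in name:
        checksum = (ror32(checksum, 13) + byte) & 0xFFFFFFFF
    return checksum


def resolve_by_checksum(names, name_ordinals, functions, wanted):
    """Walk the export names, match the checksum, index by ordinal."""
    for index, name in enumerate(names):
        if name_checksum(name) == wanted:
            return functions[name_ordinals[index]]
    return None  # no export matched the hardcoded checksum
```

The benefit for the malware author is that no API name appears in the shellcode, only 32-bit constants. The benefit for you is that, once the algorithm is known, the same function can be run over every export of kernel32.dll to build a checksum-to-name lookup table.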

Once the resolution is performed, the malware checks the first byte of the ReadFile function against 8Bh (a mov opcode, as in the usual “mov edi, edi” prologue), presumably to detect hooks or breakpoints. Then, it proceeds to resolve some more functions (the names for those are hardcoded). We have an ending similar to the first stage: allocation of RWE memory and copy of shellcode to that address. The execution then jumps to that code:

IDA view: copying shellcode to allocated memory


We are done here. Time for VM snapshot.

Third stage

The third stage is a bit long. The new shellcode leverages some of the resolved functions to replace a big chunk of the malware binary loaded at first with the malware to be executed on the last stage. The picture below on the left shows the first chunk of the shellcode obtained from the second stage. As you can see, the shellcode obtains a pointer to a memory location inside the .text section of the loaded binary. Some of the data in that location is copied and decoded. The right picture shows the decoded contents.

Yes, the malware contained a binary inside itself. You can dump that binary using the process i described previously and open it with IDA. It is a well formed binary and IDA will not complain. IDA recognises the start function but nothing else. That start function is part of the last stage of the unpacking process.

The shellcode then proceeds to overwrite the necessary sections with the contents of the sections in the embedded binary (only .text is overwritten) and header adjustments are performed to comply with the embedded binary specifications. I will not delve into a thorough explanation of what is happening. The pictures below should suffice. The third stage is mostly overwriting shellcode with shellcode and making sure nothing blows up.

The last jmp is used to jump to the entrypoint of the embedded executable, now part of the old malware .text section. If you have OllyDump, you will be able to dump the process easily once the EIP is at the beginning of the newly-decoded shellcode. IDA will accept the binary without complaining. Whoever wrote all this code deserves the slow clap.

Mind that, even though this seems like a standalone executable, you should not try to run it by itself since it has memory references that were adjusted by the unpacking algorithm. In my case, if the new executable is loaded at an address that is different from 0x00400000, the process will crash. The high degree of dependencies across shellcode sections makes the analysis of this sample pretty cumbersome.

Fourth stage

We have reached the final stage of unpacking. OllyDbg won’t show you meaningful instructions when you do the Right-click->Analysis->Analyse code trick. As such, you will have to use IDA and rebase the code at the same address as the malware in memory. In my case, the malware image base is at 0x00400000. The first chunk of code is just a routine that xors an encoded segment of code with 0x313B6535. Then, the malware calls that decoded segment. See the pictures below:

You could either decode the shellcode with IDA scripting or let the malware do it itself and then dump the process again. The shellcode is now recognisable by OllyDBG if you use the code analysis trick. The shellcode that is executed next is quite interesting. First, it leverages the BeingDebugged flag and NtGlobalFlag (both in the PEB) to detect debugger activity. If you have anti-anti-debugging plugins, you should be ok to jump directly to the end. Otherwise, i recommend either patching or setting breakpoints on the lines with calls and setting the eax register to zero (no debugger).
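If you prefer to decode the segment offline instead of letting the malware do it, the xor routine is easy to reproduce on a dumped buffer. The sketch below assumes the key 0x313B6535 is applied dword by dword in little-endian order and that any trailing bytes (fewer than four) are left untouched; both points should be checked against the disassembly. The function name is made up.

```python
import struct

XOR_KEY = 0x313B6535  # key observed in the decoding routine


def xor_dwords(data, key=XOR_KEY):
    """XOR every complete little-endian dword of data with key."""
    decoded = bytearray(data)
    usable = len(data) - (len(data) % 4)
    for offset in range(0, usable, 4):
        (dword,) = struct.unpack_from("<I", data, offset)
        struct.pack_into("<I", decoded, offset, dword ^ key)
    return bytes(decoded)
```

Since xor is its own inverse, running `xor_dwords` twice over the same buffer gives back the original bytes, which is a quick sanity check before loading the result in IDA.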

IDA view: anti-debugging techniques
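The two checks can be reproduced on a dumped 32-bit PEB with the sketch below. The offsets are the usual ones for x86 Windows (BeingDebugged at 0x02, NtGlobalFlag at 0x68), and 0x70 is the combination of heap debugging flags set when a process is created under a debugger. Whether the sample tests exactly these bits was not verified here, and the function name is made up.

```python
import struct

BEING_DEBUGGED_OFFSET = 0x02   # PEB.BeingDebugged (x86)
NT_GLOBAL_FLAG_OFFSET = 0x68   # PEB.NtGlobalFlag (x86)
DEBUG_HEAP_FLAGS = 0x70        # tail check | free check | validate parameters


def debugger_indicators(peb):
    """Return (being_debugged, debug_heap_flags_set) for a raw x86 PEB dump."""
    being_debugged = peb[BEING_DEBUGGED_OFFSET] != 0
    (nt_global_flag,) = struct.unpack_from("<I", peb, NT_GLOBAL_FLAG_OFFSET)
    return being_debugged, (nt_global_flag & DEBUG_HEAP_FLAGS) != 0
```

This is what anti-anti-debugging plugins take care of: they zero both fields so that checks like these see a clean PEB.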


Once you pass this phase you may ask: are we there yet? Well…no. Another chunk of code separates us from the real deal. This malware has a tricky MO. It contains tons of shellcode that is unpacked at runtime and jumps from memory section to memory section. During this last part, the malware starts by unpacking a chunk of shellcode to an allocated section of memory (wow!). Part of that shellcode contains the unpacked payload that we want. It also contains parameters to set up the latter properly. Let us see some IDA, shall we?

IDA view: beginning of shellcode


This is the first part of the unpacked shellcode. ebx is still fs:[30h]. As such, the last line appears to mean: fs:[30h]->PEB_LDR_DATA->InLoadOrder->SizeOfImage + fs:[30h]. Drop a comment below if you know what is happening here. Then we have the called function:

From top to bottom, left to right:

1. The running shellcode performs a first decoding of another chunk of shellcode using a xor operation with key 0x313B6535. The decoding is in place.

2. A second decoding is performed and the contents of the shellcode are copied to an allocated memory section. Another memory allocation is performed which will hold the unpacked payload that we want. The unpacked payload is copied from the previously referred shellcode section.

3 and 4. The unpacked payload memory references are adjusted relative to the beginning of the binary in memory. Something tells me that the binary will be referenced later. Then, the malware jumps to the payload. Before that, it deletes all the code previously described as can be seen below.

OllyDBG view: memory state before jumping to malicious payload


Before jumping to the final shellcode, the shellcode pushes three values on the stack: a pointer to a string “22222” inside the loaded binary memory space, a pointer to the beginning of the memory section containing part of the old shellcode and zero. Before jumping or at the beginning of the new shellcode, i recommend you make another snapshot. The real analysis starts now but it will not be part of this article since most of you may be lost or asleep by now.

What’s next?

At this point the malware is unpacked and will execute the main payload. However, it will not make it easy on you. There is no point in using OllyDump to dump the binary because the code is in another section. You will have to dump the code using the procedure i described before. However, you should not dump at the beginning of the code. Take a look at the pictures below:

As you can see, the malware resolves the addresses of some functions. The resolution appears to be similar to what we have seen before: computing a checksum on the names of the functions exported by the dlls in memory and comparing that checksum with a hardcoded one. Once you run this function, you can dump the memory section containing the shellcode.

If you use just a standalone debugger or if you use IDA with one of its debuggers, you can skip this paragraph. If, like me, you switch between IDA and OllyDBG, or use another combination of disassembler and debugger, you will have to rename the function offset names to something meaningful. The way i see it, you can do this in one of two ways:

  • Go through the offsets table and for each offset go to (in OllyDBG) View->Executable modules. Find out where that offset is in the list of dlls, right-click the dll->View names. Search for the offset and you have the name. Change the name given by IDA to that offset. Now rinse and repeat.
  • Find the name of the function using the procedure in the first bullet. Create an IDA script that goes through the table and renames the offsets to reflect your findings.
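The lookup part of the second option can be prototyped outside IDA. The sketch below is plain Python with made-up names and inputs: it takes the resolved pointers dumped from the table and the export addresses collected from the debugger, and returns a mapping from table slot address to a readable name. Inside IDA, each entry of the result would then be passed to the rename function of your IDAPython version (MakeName in the IDAPython shipped with IDA 6.x).

```python
def build_rename_map(table_base, resolved_addresses, exports_by_module,
                     pointer_size=4):
    """Map each slot of the resolved table to 'module_function'.

    table_base: address of the first slot of the table in the dump.
    resolved_addresses: the pointers stored in the table, in order.
    exports_by_module: {module name: {export name: address}} taken
    from the debugger (View names in OllyDBG).
    """
    address_to_name = {}
    for module, exports in exports_by_module.items():
        for name, address in exports.items():
            address_to_name[address] = "{}_{}".format(module, name)

    renames = {}
    for index, address in enumerate(resolved_addresses):
        name = address_to_name.get(address)
        if name is not None:  # leave unknown pointers untouched
            renames[table_base + index * pointer_size] = name
    return renames
```

Slots that do not match any known export are left out on purpose, so you can see at a glance which pointers still need manual work.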

I intend to cover the analysis of the unpacked malware in another post but for now, we end here.

Final thoughts

If you are familiar with WinDBG, you know that i could have suffered less if i had used it. This happens because WinDBG is able to show you OS internal structures and offsets which would have helped a lot in this case. I have not used it because i am more used to ImmunityDBG or OllyDBG but i hope to change that soon.

Some of the unpacking steps could have been skipped and the article would have been shorter. However, it is important to be comfortable with all these layers of unpacking. The analysis of this malware is far from trivial since it has a bunch of shellcode and i may have left you pretty confused. Also, the IDA views i posted with comments are meant to help you understand what is going on, assuming you understand my comments. Feel free to leave some feedback about the overall structure and contents.

Stay safe 😉