63 Problems But Malware Ain’t One: 882aef202a56008ad20a61c8960eb830

Hello paranoids

 As promised i am back to reverse stuff. This time, and following the previous sequence of posts, i have decided to pick an Android malware for analysis. Without further ado, let us begin.

Malware Characteristics

MD5: 882aef202a56008ad20a61c8960eb830
Family name: Ginmaster (GingerMaster)
Obfuscation/Packing: Yes/No

 GingerMaster was the first Android malware to use the GingerBreak exploit, which affects Gingerbread (Android 2.3). The malware is capable of downloading, installing and launching APKs without the user's permission using low-level tools like pm (package manager), sh (shell) and am (activity manager). Its class and function names are obfuscated with short chains of random characters. Contacted URLs and other relevant strings are in plaintext.

 As always, i have uploaded the IDA database to my GitHub repository.

Analysis Environment

Tools: Android Studio (2.3.2) and SDK tools (e.g. adb, aapt), IDA Pro 6.4.130111-32-bit, JD-GUI 1.4.0, dex2jar, apktool
Environment: VMware with 64-bit Windows 7
Emulator and Libraries: Nexus 5X, Gingerbread (2.3.3 with Google APIs, API Level 10)

Objectives

The objective of this post is to understand how the malware works, i.e.:

  • Files created
  • URLs contacted
  • Exfiltrated information
  • Services created and what they do
  • Usage of GingerBreak, an exploit used to obtain root on Android 2.3 (Gingerbread)

Preliminary Analysis

 Before diving into tools and code, it is worth checking the activity generated by the malware when executing. We start by creating an emulator and using adb to deploy the application. I would like to see how the application works and the traffic it generates (e.g. using Wireshark). The downside is that it is cumbersome to track the source of the traffic (i.e. malicious application vs. legitimate emulator behaviour).

 client[.]mustmobile[.]com is a domain widely used by the malware and since there is no DNS resolution for it, no further traffic will be seen. As for host indicators, the Android Device Manager is of great help:

SDCardLogCat

 The first picture shows the contents of the SD card. Since I already know that the malware uses the SD card to store files and perform the exploitation, we expect to see files there. install.sh, installsoft.sh, runme.sh and gbfm.sh are of the utmost importance. It also uses SQLite to store configurations and lists of packages for download (databases folder).

 The malware writes far too much on the logs and Logcat (integrated on Android Studio) is useful to get them. As you can see GameService has been spawned (picture below). The command:

adb shell dumpsys activity services

is helpful to get a dump of the activities and services. We can see there that GameService is running and Web has bound to it (refer to service binding for more information).
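
To avoid reading the whole dump by eye, the output can be filtered for the service of interest. Below is a minimal sketch in Python, assuming adb is on the PATH and that each service shows up on a line containing ServiceRecord followed by its component name. The helper names are mine and the sample text only illustrates that assumed shape; it was not captured from the malware.

```python
import subprocess


def find_services(dumpsys_output, needle):
    """Return the ServiceRecord lines that mention `needle`."""
    return [
        line.strip()
        for line in dumpsys_output.splitlines()
        if "ServiceRecord" in line and needle in line
    ]


def dump_services(serial="emulator-5554"):
    """Run the command from the post against one device (requires adb)."""
    result = subprocess.run(
        ["adb", "-s", serial, "shell", "dumpsys", "activity", "services"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Illustrative text only (assumed shape of the dump).
SAMPLE = """\
ACTIVITY MANAGER SERVICES (dumpsys activity services)
  * ServiceRecord{0000000 com.igamepower.appmaster/.GameService}
  * ServiceRecord{0000001 com.example.other/.SomeService}
"""
```

With an emulator running, find_services(dump_services(), "GameService") should return a non-empty list if the service has been spawned.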

Package Structure Analysis

I will assume you read the previous articles and that you are familiar with the usage of the tools i referred there.

As soon as you start looking at the application package you will realise it does not have a lot to look at. Once you convert the dex file to a jar, you will see about 106 classes under the same package com.igamepower.appmaster. You can spot some degree of obfuscation by noticing:

  • random name classes (e.g. af.class, e.class, f.class)
  • mix of Java and smali code. This is indicative that the conversion from dex to Java failed due to anti-reversing techniques
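
The first indicator can be checked quickly against the jar produced by dex2jar. The following is a rough sketch, assuming that a class whose simple name is one or two lowercase letters was renamed by an obfuscator. That threshold is my own heuristic, and the in-memory jar only holds placeholder entries so the example is self-contained.

```python
import io
import re
import zipfile

# Heuristic (assumption): 1-2 lowercase letters means "probably renamed".
SHORT_NAME = re.compile(r"^[a-z]{1,2}$")


def short_named_classes(jar_file):
    """Return the simple names of top-level classes matching the heuristic."""
    flagged = []
    with zipfile.ZipFile(jar_file) as jar:
        for entry in jar.namelist():
            if not entry.endswith(".class"):
                continue
            simple = entry.rsplit("/", 1)[-1][: -len(".class")]
            if "$" in simple:
                continue  # skip inner and anonymous classes
            if SHORT_NAME.match(simple):
                flagged.append(simple)
    return sorted(flagged)


def demo_jar():
    """Build a tiny in-memory jar with empty placeholder entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name in ("af", "e", "f", "Web", "GameService"):
            jar.writestr(f"com/igamepower/appmaster/{name}.class", b"")
    buf.seek(0)
    return buf
```

For the real sample you would pass the path of the converted jar instead of demo_jar().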

JavaAndSmali

 While the number of classes is high, the absence of multiple packages makes the analysis easier. The first version of Ginmaster i looked at (MD5 17ca4c91367ba4b91bdb6e0b77aaafb6) while having many more packages, was successfully decompiled into Java. I think for learning purposes, analysis of SMALI is a much more interesting exercise.

Looking at the decoded Manifest we can see the following permissions:

Permission String: Permission (according to docs)

  • READ_PHONE_STATE: Phone number, cellular network information and status of ongoing calls.
  • READ_LOGS: Allows the application to read low-level logs.
  • ACCESS_CACHE_FILESYSTEM, DELETE_CACHE_FILES: Self-explanatory.
  • WRITE_SECURE_SETTINGS: ADB status, WIFI and parental control.
  • ACCESS_NETWORK_STATE: Query network information (e.g. check if connected).
  • INTERNET: Allows the creation of sockets.
  • WRITE_EXTERNAL_STORAGE: Self-explanatory.
  • MOUNT_UNMOUNT_FILESYSTEMS: Self-explanatory. The malware requires access to external SD cards.
  • READ_OWNER_DATA, WRITE_OWNER_DATA: User email, name, etc.
  • WRITE_SETTINGS: Allows the modification of device settings.
  • INSTALL_SHORTCUT, UNINSTALL_SHORTCUT: Allows the creation of a shortcut on the Launcher (the board with applications).
  • RECEIVE_BOOT_COMPLETED: Allows the application to receive a notification when the device finishes booting.
  • RESTART_PACKAGES: Allows the app to end background processes of other apps.
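
If you prefer not to read the decoded manifest by hand, the permission list can be pulled out with a few lines of Python. This is only a sketch: it assumes the manifest has already been decoded to plain XML (e.g. by apktool), and the sample manifest below is a trimmed, hand-written stand-in rather than the real file.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"


def requested_permissions(manifest_xml):
    """Return the short names of every <uses-permission> in the manifest."""
    root = ET.fromstring(manifest_xml)
    names = []
    for node in root.iter("uses-permission"):
        full_name = node.get(NAME_ATTR, "")
        names.append(full_name.rsplit(".", 1)[-1])
    return names


# Hand-written stand-in for the decoded manifest (assumption).
SAMPLE_MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.igamepower.appmaster">
  <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
  <uses-permission android:name="android.permission.READ_LOGS"/>
  <uses-permission android:name="com.android.launcher.permission.INSTALL_SHORTCUT"/>
</manifest>
"""
```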

In terms of string resources we have little to none and they are in Chinese. Feel free to use Google Translate to check the translation.

In terms of activities/services/receivers, we have:

Type and Name: Intent Filter(s)

  • Activity Myhall: action:MAIN, category:DEFAULT
  • Activity HomeActivity: (none)
  • Activity SortActivity1: action:MAIN
  • Activity SortActivity2: action:MAIN
  • Activity SearchActivity: action:MAIN
  • Activity ManagerActivity: action:MAIN
  • Activity GameInfo: action:MAIN
  • Activity TableClass: action:MAIN
  • Activity Web: action:MAIN, category:LAUNCHER
  • Activity GameAlertDialog: (none)
  • Activity TestView: action:MAIN
  • Service GameService: action:MAIN, category:LAUNCHER
  • Receiver GameBootReceiver: action:BOOT_COMPLETED
  • Activity DevelopmentSettings (not found on the package): action:APPLICATION_DEVELOPMENT_SETTINGS

 The usage of the intent filters you see for GameService is not clear to me.  MAIN and LAUNCHER are associated with the activity responsible for the first screen. Services cannot be launched directly through Launcher icons. Upon looking at the Java code for Web, i have seen the service GameService being explicitly launched as i will refer on the next section. I am assuming this was a mistake of the authors that turned out not to crash the application.

 The usage of the DEFAULT category on Myhall is intended to mark that activity as a potential candidate to receive implicit intents for the MAIN action, i.e. if another application, or even this one, creates an implicit intent with the action MAIN, activities declaring the MAIN action and the DEFAULT category are candidates to receive it.

 GameBootReceiver is a broadcast receiver. Broadcast receivers are used by applications to receive broadcasted intents from other applications (e.g. OS reporting battery status). In this case, BOOT_COMPLETED is an action used on broadcasts to notify applications of a complete system boot (once it is finished).

 Finally, DevelopmentSettings has APPLICATION_DEVELOPMENT_SETTINGS action which tells us the class is able to mess with application development settings on Android. This activity was not on the APK.

 I will not go over the assets folder just now because I will refer to it later. Enough high-level analysis, let us dive into details.

Java SMALI Code Analysis

 Unfortunately, this is one of the cases where the analysis of Java will confuse you more than actually help. We need to look at SMALI. We can use IDA to disassemble the .dex file. As previously referred the first Activity to be launched is Web. It starts by collecting device information and some APK details:

  • IMSI
  • SIM serial number
  • Phone number
  • Network type
  • CPU serial (ignored and set to IMSI if CPU serial is 0000000000000000)
  • Pixels (width, height)
  • Version code of the APK
  • Service channel 1004 (stored as a resource on the APK)
  • Current Time
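
The CPU serial fallback is the only piece of logic in that list, so here is how I read it, sketched in Python. The function and key names are hypothetical (the real parameter names are not listed in this post); only the all-zero serial and the channel value 1004 come from the analysis.

```python
ZERO_SERIAL = "0" * 16  # the all-zero CPU serial mentioned above


def device_id(cpu_serial, imsi):
    """Return the CPU serial, or the IMSI when the serial is all zeros."""
    return imsi if cpu_serial == ZERO_SERIAL else cpu_serial


def build_beacon(cpu_serial, imsi, version_code, channel="1004"):
    """Assemble a beacon payload; the key names are hypothetical."""
    return {
        "imsi": imsi,
        "cpu": device_id(cpu_serial, imsi),
        "version_code": version_code,
        "channel": channel,
    }
```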

 These details are posted to the URL we have seen before: http://client[.]mustmobile[.]com/mt.php. Another URL (http://client[.]mustmobile[.]com/request/update.do) is beaconed with the same details and the response, once parsed, is written to /data/data/com.igamepower.appmaster/files/cache/igamepower_file/8888. The purpose of this file is not clear since it is not opened anywhere else. However, the processed response is passed to the ac handler with code 0x1, which leads to the download of a package named com.igamepower.appmaster if the version of the current malware is lower than the one hosted on the server (malware update):

CheckPackageVersionDownloadUpdatedVersion

 The Web class is also responsible for spawning the GameService service. This service launches a couple of threads to contact the URLs on the following table. The aq thread is responsible for performing the requests while the an handler processes them based on codes. All beacons contain at least the collected data I have previously referred. All the URLs are for resources hosted on http://client.mustmobile[.]com.

Resource: Data Sent, Purpose, Handler Code and Processing

  • report/first_run.do
    • Data Sent: Sends device details only.
    • Purpose: Beaconed when the application runs for the first time.
    • Code: 0x1 (default case).
    • Processing: Nothing is done.
  • report/uninstall_success.do, report/install_success.do
    • Data Sent: Package information.
    • Purpose: Beaconed when packages are installed or removed. Information about those packages is sent (e.g. name).
    • Code: 0x0 (default case).
    • Processing: Nothing is done.
  • request/config.do
    • Data Sent: Key-value pair action:config.
    • Purpose: Updates configuration parameters.
    • Code: 0x3EA.
    • Processing: Shared Preferences are updated with fields such as:
      • get_list_limit
      • get_config_limit
      • server_domain
  • request/push.do
    • Data Sent: Sends SharedPreferences field soft_last_id.
    • Purpose: Informs the server of the last id associated with the last software on the downloaded lists of software.
    • Code: 0x3EC.
    • Processing: Seems to be used to pull a list of software. The downloaded metadata is converted into shortcuts that when clicked redirect to the malware activities that display information about them.
  • report/install_list.do
    • Data Sent: Result of “select * from game_package where status=1” is sent. status is set to 1 when there is a package installation.
    • Purpose: Likely to inform the remote server of the list of packages installed by the malware.
    • Code: 0x1 (default case).
    • Processing: Nothing is done.
  • request/alert.do
    • Data Sent: Sends to the server a field from SharedPreferences: alert_last_id.
    • Purpose: This seems like a means to pull notifications.
    • Code: 0x3EB.
    • Processing: A notification is shown through GameAlertDialog.
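
To make the code-based processing easier to picture, here is a small Python model of the handler as I understand it. Only the three codes and the fact that other codes do nothing come from the analysis; the state layout, the function names and the payload shapes are assumptions for illustration.

```python
CONFIG, ALERT, PUSH = 0x3EA, 0x3EB, 0x3EC


def new_state():
    """Stand-in for SharedPreferences, pending alerts and shortcuts."""
    return {"prefs": {}, "alerts": [], "shortcuts": []}


def handle(code, payload, state):
    """Route a parsed response by handler code; other codes are ignored."""
    if code == CONFIG:
        state["prefs"].update(payload)      # request/config.do
    elif code == ALERT:
        state["alerts"].append(payload)     # request/alert.do
    elif code == PUSH:
        state["shortcuts"].extend(payload)  # request/push.do
    return state
```

The default branch (codes 0x0 and 0x1 in the table) leaves the state untouched, which matches the "Nothing is done" entries.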

 It is GameService which deploys GameBootReceiver. GameBootReceiver is responsible for launching GameService upon boot. What is puzzling about this malware is that the onReceive implementation has code to deal with installed and removed packages (i.e. the broadcast actions PACKAGE_ADDED and PACKAGE_REMOVED). However, the manifest states that this receiver only listens for BOOT_COMPLETED.

 More network activity may be generated by MyHall. MyHall is a TabActivity which creates multiple tabs within itself that when clicked launch one of the following activities:

  • HomeActivity: An activity and a Thread. The thread component is launched within the onCreate method. This thread beacons http://client.mustmobile[.]com/request/index.do and appears to be used to pull a list of applications to be displayed on the HomeActivity activity (check bv.a with code 0x3E8).
  • SearchActivity: The user can use this one to search for applications. The queries are posted to http://client.mustmobile[.]com/client.php?action=softlist&type=search&word=.
  • ManagerActivity: Displays packages installed on the phone and shows more to be installed. It also allows the packages to be launched and deleted directly from the activity. 

 GameInfo queries http://client.mustmobile[.]com/client.php?action=soft&soft_id= to get information about a given application to be displayed to the user. SortActivity2 uses the URL http://client.mustmobile[.]com/client.php?action=softlist to get a list of software.

 There is a URL http://apk.mustmobile[.]com/apk/20110705/19225910801.apk that appears on TestView.onClick:
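
All the client.php requests share the same shape, an action plus optional parameters. A short Python sketch of how such query strings are assembled follows; the parameter names are the ones visible in the URLs above, while the helper itself and the example values are mine.

```python
from urllib.parse import urlencode


def client_query(action, **params):
    """Build the client.php query string for a given action."""
    return "client.php?" + urlencode({"action": action, **params})
```

For example, client_query("softlist", type="search", word="angry birds") gives the search request and client_query("soft", soft_id=42) gives the GameInfo one.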

Baidu

 According to OSINT, bdmobile.android.app is Baidu. When a certain button (identified by 0x7f0b0061) on this TestView is clicked, bdmobile.android.app is downloaded and deployed. What puzzles me here is that the activity is not created by any class within the APK so, apparently, there is no way for this branch to be executed.

 So far we have overviewed network indicators and some host indicators (i.e. files created). We now know what details are beaconed to the server. GingerMaster has an interesting feature that we have not overviewed yet.

GingerBreak Exploit

 GingerMaster was the first malware leveraging GingerBreak exploit to obtain root permissions on the host. GingerBreak affects Android 2.3 Gingerbread. For it to work, the device must have an SD card inserted and USB debugging must be enabled which explains what i am about to overview. Going back to GameService.onCreate routine we see:

Multiple png files are moved from the assets folder to /data/data/com.igamepower.appmaster/files/ with .sh extensions. Then, the following command is executed:

chmod 775 /data/data/com.igamepower.appmaster/files/gbfm.sh /data/data/com.igamepower.appmaster/files/install.sh /data/data/com.igamepower.appmaster/files/installsoft.sh /data/data/com.igamepower.appmaster/files/runme.sh

When looking at the assets folder inside the APK, we can see four PNG files:

File Name, MD5, Type: Functionality

  • gbfm.png, fa355f01ec16bcc09fa0a2341f0ceb40, ELF: GingerBreak exploit.
  • install.png, 725bee6d16deb8eb0f4e869fa412a71b, Bash script: Basically moves /system/bin/sh around.
  • installsoft.png, 25bcbf1d0a3297c8b93e3999aa750974, Bash script: Installs the APK passed as argument.
  • runme.png, 3674d33c271a0c3c8f06c6ff7276e2b8, ELF: /system/bin/sh (see below).
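
A file extension proves nothing, so the real type of each asset is worth confirming from its first bytes. The sketch below, in Python, checks for the PNG and ELF signatures and for a shebang. The shebang test is an assumption on my side, since a shell script is not required to start with one.

```python
import hashlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
ELF_MAGIC = b"\x7fELF"


def sniff(data):
    """Guess the type of a file from its leading bytes."""
    if data.startswith(PNG_MAGIC):
        return "PNG"
    if data.startswith(ELF_MAGIC):
        return "ELF"
    if data.startswith(b"#!"):
        return "script"
    return "unknown"


def md5_hex(data):
    """MD5 of the content, to compare against the hashes listed above."""
    return hashlib.md5(data).hexdigest()
```

Reading each file from the assets folder with open(path, "rb").read() and passing the bytes to both functions is enough to rebuild the table.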

runme.png is a very simple ELF program (fancy some ARM?):

runme 

 As far as i understand, the malware uses GingerBreak to install APKs on the host without requiring a specific permission on the Manifest or PlayStore using low level tools such as sh, am and pm. If the malware updates itself, it also leverages the exploit to escalate privileges.

 Three classes play a vital role on the deployment of APKs:

GameService: More specifically the a method. This method is used to download and deploy the APKs using f thread. Below you can see the functions that call a (called DownloadsAndDeploysAPKs on the diagram):

DownloadsAndDeploysAPKs

Downloads are therefore performed as a consequence of the events:

  • TestView.onClick: As previously referred, this seems to download Baidu. However, i see no traces of TestView being used on the program.
  • aa.handleMessage: Associated with the update of the application when http://client[.]mustmobile[.]com/request/update.do is beaconed. aa is launched from Myhall.onCreate.
  • ao.onClick: Likely associated with the request for an application made on the GameInfo activity, i.e., user checks application information and clicks button to download.
  • bg.onClick: Used on SortActivity2 which displays lists of applications. Likely similar to the previous event.
  • ca.handleMessage: Associated with the update of the application when http://client[.]mustmobile[.]com/request/update.do is beaconed. ca is launched from Web.onCreate.
  • cb.onClick: Associated with packages installed from ManagerActivity.

cj: Thread spawned on GameService.onCreate. Its purpose is to trick the user into enabling USB debugging (if not enabled) by showing notifications and launching Application Development Settings. It also checks whether there is an external SD card (remember this?). Once the debugging is enabled, the exploit is launched using the routine e from cj. gbfm.sh is executed. Once the exploit is finished, install.sh is also executed. Other functions within cj, such as c and are worth mentioning because they are related to the execution of the scripts.

f: This thread is responsible for performing the download of the APK, the writing and the installation using installsoft.sh. Once the APK is downloaded, the thread checks for the ROOT status. ROOT_STATE_GINGER and ROOT_STATE_PERFECT are processed similarly as can be seen below:

ROOT_STATE_SU is processed with:

 The code and therefore the purpose is similar (deploy package using /data/data/com.igamepower.appmaster/files/installsoft.sh). What changes is the interpreter used:

  • ROOT_STATE_PERFECT: /system/xbin/appmaster/sh
  • ROOT_STATE_GINGER: /data/data/com.igamepower.appmaster/sh
  • ROOT_STATE_SU: /system/bin/su -c

  ROOT_STATE_SU indicates that the application is already running as root and, therefore, it only needs to use the standard /system/bin/su. ROOT_STATE_PERFECT is set when the system is rooted and the file /system/xbin/appmaster/sh is readable (install.sh executed completely). If install.sh fails to create /system/xbin/appmaster/sh, ROOT_STATE_GINGER is the root state. The reason for the malware copying /system/bin/sh to /data/data/com.igamepower.appmaster/files/ and /system/xbin/appmaster is not clear to me.
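
The relation between root state and interpreter can be summarised as a lookup. The Python sketch below uses the paths listed above; how exactly the arguments are laid out on the command line is my assumption, not something taken from the smali.

```python
INSTALL_SCRIPT = "/data/data/com.igamepower.appmaster/files/installsoft.sh"

INTERPRETERS = {
    "ROOT_STATE_PERFECT": ["/system/xbin/appmaster/sh"],
    "ROOT_STATE_GINGER": ["/data/data/com.igamepower.appmaster/sh"],
    "ROOT_STATE_SU": ["/system/bin/su", "-c"],
}


def install_command(root_state, apk_path):
    """Return the argv used to install `apk_path`, or None if not rooted."""
    prefix = INTERPRETERS.get(root_state)
    if prefix is None:
        return None
    if root_state == "ROOT_STATE_SU":
        # su -c takes the whole command as a single string (assumed layout)
        return prefix + [f"{INSTALL_SCRIPT} {apk_path}"]
    return prefix + [INSTALL_SCRIPT, apk_path]
```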

Final Notes

 Since the only issue you have to deal with is obfuscated classes, you can analyse this malware without stepping through the code. Also, since the domains used by the malware are already down, you would not be able to see anything meaningful. The malware pulls lots of configurations and without them you are left with a malfunctioning state machine. Analysis of logs using Logcat can still be useful to understand what activities and services are launched for this specific malware since it uses the Logging capabilities a lot.

 Even though this was my first attempt at reversing Android malware, i have felt that reading SMALI is much simpler than reading x86 which is why i have relied solely on static analysis and a bit of dynamic analysis by using an emulator. As far as i know, it is not possible to change the instruction pointer and execute selective chunks of code as you can do on x86. We can however take small chunks of SMALI, compile them and execute them.  

Stay safe 😉

Android Reversing Part 3: Tampering with Android Applications

Hello paranoids

 So, after all those theory-related posts, it is time to actually do something. On this post, i will tamper with a simple application for Android. Let us begin:

The Test Application

 As referred on the post about tools, there is a website from which you can download APKs, APKMirror. When I first wrote these lines I was trying to tamper with a Facebook application (MD5 f96435e9a1c951ed181fb02dd61c4b2c) downloaded from APKMirror. However, when I looked at the smali and decompiled sources I was like:

jackie_chan

 The application was protected with Proguard which for a noob as myself is way above my paygrade (for now). So i have dug out my weak Android expertise and put together a simple application (my GitHub) with a weak authentication (hardcoded credentials). The specifications for the emulator are:

  • Android 6.0
  • Marshmallow, API 23
  • x86

Preparing  the Environment

 The development environment I used was Android Studio 2.3 (with SDK tools). All the coding and reversing was done within a VM using VMware Fusion Professional Version 8.5.6 (5234762). Don’t forget to enable Intel VT-x/EPT support since emulators require this. While this is not a development blog, I will leave some considerations here for future reference and in case someone gets stuck.

Creating an Emulator and Deploying the Application

The easiest way i could find to do this was to create a dummy project on Android Studio (2.3) and then create the emulator. In terms of environment variables i have defined:

ANDROID_SDK_ROOT -> C:\Users\[YOUR_USERNAME]\AppData\Local\Android\sdk

Assuming you are at the window of the project, you should see a button with “AVD Manager”.

  1. Click the icon
  2. Create Virtual Device
  3. You can choose any device. Choices only matter if you need the application to run with a good resolution (AFAIK).
  4. On the next screen you should be prompted to choose the Android image. You may need to click “download” to obtain the image. In this case i am choosing Marshmallow, API 23 for x86.
  5. Then you are prompted to choose a name for the emulator. I will use “test” and i am leaving everything as default (i.e. just put a name and click “Finish”)
  6. Now you should see a list of the devices you have configured. Click on the green arrow to launch the emulator.

If everything goes smoothly, you should see the emulator. Also, use adb to check for active devices:

 adb devices -l

You may see:

ADB Device Listing

 If this is the case, you will not be able to deploy the application. The workaround for this is going to (on the emulator):

  1. Enable developer options (Settings->About.. and tap 7 times on the build number)
  2. Go to “Developer Options”-> “Revoke USB debugging authorisations”
  3. Kill the adb server with “adb kill-server” and then restart it with “adb start-server”
  4. Then enable and disable debugging. You should see a prompt to authorise USB debugging. Authorise it. When you list devices you should now see:
Correct Devices List
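
The state of each device can also be checked programmatically. The Python sketch below parses the text printed by adb devices and reports every serial whose state is not device (for instance unauthorized or offline). The sample text is hand-written to show the general layout and the helper names are mine.

```python
def parse_devices(output):
    """Map each serial to its state from `adb devices` output."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # skip blanks, the header and daemon start-up messages
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def not_ready(devices):
    """Serials that cannot be used for deployment yet."""
    return sorted(s for s, state in devices.items() if state != "device")


# Hand-written example of the layout (assumption).
SAMPLE = """\
List of devices attached
emulator-5554          unauthorized
emulator-5556          device product:sdk model:test
"""
```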


Now, the application can be deployed with (-g grants all runtime permissions):

adb -s emulator-5554 install -g [PATH_TO_APK]\App.apk

 You should see “Success” as the last message. If you now check the emulator, the application should be on the list of applications.

The Base Application

The application is a simple interface that takes a username and a password as input and tests the parameters against internal values:

 

What’s that? I have mad Android Skills? Well, thank you. In order to make this more realistic, I will sign the application. For this you need the Java Development Kit (JDK). We first generate the keystore and the key (password legitlegit):

 keytool -genkey -v -keystore hackme_testapp_legit.p12 -alias hackme_testapp_legit -keyalg RSA -keysize 2048 -storetype pkcs12 -validity 365

 Once you fill in some details about your “company” you will have a keystore on your desktop containing a certificate valid for a year (365 days). The alias for the key in this case is hackme_testapp_legit. In order to create a release APK, go to (on Android Studio): Build->Generate Signed APK. Fill in the information about the keystore and click Next. On the last screen, mark V1 (Jar Signature) and V2 (Full APK Signature). Build type should be release. Click Finish. Your APK is built and should be on the application folder with the name [application name]-release.apk.

 Once more, in order to make things more realistic, we can generate another keystore and key for the rogue apk. Same command as before but you can call both the keystore and the alias hackme_testapp_evil.p12 and hackme_testapp_evil (password evilevil).

Changing the Application

So, what do i want to do here?
 
  1. I want to change used images and strings (resource manipulation)
  2. Find the credentials within the decompiled code (Java) and smali code
  3. Patch the application so we can have a successful login regardless of the password
  4. Demonstrate how to debug the application
 
 
Why all this? Well, this is a basic application and my objective is to show you how to use the tools I have described in the post about tools.
 
 

Getting the “original” application

First, we need to use (from dex2jar scripts):
 
d2j-dex2jar.bat app-release.apk -o app-release.jar
 
 
to get a jar containing the .class files belonging to the application. Now, we need to get the resources:
 
apktool d app-release.apk -o app-release
 
 

Manipulating Resources

 Strings displayed on Android activities are classified as resources. As such, they can be found inside the res folder. If you use Windows search within the res folder you will find the strings on res\values\strings.xml. Let us change “HackMe” to “Hacked” and  “Successful Login” to “Ain’t nobody got time fo failure”. As for the images, they may be on mipmap-something or drawable-something folders, or both. The first are typically for launcher icons while the second are for activity images. Since i suck at Android, i ended up using launcher images for activity images so you will find everything on mipmap folders. No science here, just replace what you want with what you want but keep in mind the names should be the same. Also, for mipmaps, png images are used so, adjust your potential JPEGS. I am disregarding image ratios here with the argument “what is the worst that can happen?”. We have pimped the application. I will rebuild the application once i am done with the next part.
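
If you would rather script the string changes than edit the file by hand, something like the following Python sketch works on the decoded strings.xml. The resource names in the sample (app_name, login_ok) are hypothetical, since I am matching on the displayed text rather than on the real names.

```python
import xml.etree.ElementTree as ET


def replace_strings(strings_xml, replacements):
    """Swap the text of every <string> whose value is in `replacements`."""
    root = ET.fromstring(strings_xml)
    for node in root.iter("string"):
        if node.text in replacements:
            node.text = replacements[node.text]
    return ET.tostring(root, encoding="unicode")


# Hypothetical excerpt of res/values/strings.xml.
SAMPLE_STRINGS = """\
<resources>
    <string name="app_name">HackMe</string>
    <string name="login_ok">Successful Login</string>
</resources>
"""
```

Calling replace_strings(SAMPLE_STRINGS, {"HackMe": "Hacked"}) returns the modified XML, which can then be written back before rebuilding with apktool.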

Find the Password and Disrupting Workflow

 So now we need to find the credentials. We can either look at the decompiled code (i.e. Java) or smali. Since this is a small application leveraging no anti-reversing framework, it is a piece of cake. Open JD-GUI and drag-and-drop the Jar we generated previously:

Decompiled packages


 In Android terms, the main application should be a subfolder within com. This happens because, the terminology to fully qualify an Android application package is [reversed company domain].[application name]. In this case, the company domain would be hackme[.]com and the application would be testapp.

As you can see, and if you click through those class files, only three files are of interest:

  1. HackMeMain.class
  2. SuccessLogin.class
  3. FailureLogin.class

 These files contain classes that inherit from AppCompatActivity and, given the fact that we have three screens on the application, this is highly indicative that each of those files represents the Activity for one screen: the main screen, the success screen and the failure screen, respectively. Also, according to the decoded AndroidManifest.xml file, there are indeed three activities. Let us look at the HackMeMain.class file:

HackMe credentials' check


 Whenever we click on the Login button, we need an event listener to respond to this. That is what is happening within the setOnClickListener function. Within that function there is a definition of an anonymous class with a single function, onClick. There is also an if statement checking the contents of two EditText views. As a side note, JD-GUI was unable to properly recover the names of both fields. They should be username and password as opposed to localEditText1 and localEditText2. This is acceptable since during compilation some data is lost for multiple reasons (e.g. optimisation). So we have two string checks, against “ADOGETORULETHEMALL” and “VERYZICRET” (username and password respectively). These are the credentials, so we have another objective accomplished. Now, we want to patch the application to ignore the verification of the credentials and show the success screen all the time. We can do this in one of two ways:
  • Recreating the application:
    • Use JD-GUI to export the Java files (File->Save All Sources)
    • Create a dummy Android project with Android Studio and then replace all the resources, manifest, packages and code with the ones we managed to obtain.
    • Modify the Java code
    • Use Android Studio to create and sign a new APK
  • Modifying the smali code and rebuilding the application

 The first approach is more time consuming and should be avoided if the smali is readable. If you look within the folder created by apktool, you will see a smali folder. This folder contains the smali code associated with the application. We can look at smali using IDA Pro but you must provide the .dex file. You can obtain this by opening the original APK with an archive browser (e.g. WinRAR, 7-Zip) and pulling out the .dex file. You can also open the APK file with IDA (with Administrator privileges) and choose the .dex file. When looking at IDA, you have to:

    1. Search for the methods starting with the class name you want: HackMeMain
    2. Choose the method you want: HackMeMain_onCreate@VL
onCreate function


 
 I have “oranged” the relevant part. As you can see, an instance of HackMeMain$2 is created and moved to v3. The second line is just the constructor being called. On the last line, setOnClickListener is called on the object associated with the button view (obtained a few lines earlier), with the HackMeMain$2 instance as argument. So, we must now look at HackMeMain$2_onClick since the credentials’ verification is performed within that method.
 
And without further ado:
 
credentials' check
 
 So basically, the output of String.equals is checked against zero. If it is zero, it jumps to the failure condition and the due screen is shown (roughly speaking). If you are familiar with Java, the equals method returns true or false which internally are represented as 1 or 0, respectively. In order to patch the program we can simply force a jump to the success condition. Once you locate the file, you can pretty much break the check (almost) wherever you want:
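
One possible way of breaking the check, sketched in Python so it can be repeated on a fresh apktool output: comment out every if-eqz that tests the result of String.equals, so execution always falls through to the success path. This assumes the branch target is the failure path and that the branch follows the call directly (possibly after move-result). The smali sample is a simplified, hand-written stand-in for the real onClick method.

```python
import re

EQUALS_CALL = "Ljava/lang/String;->equals(Ljava/lang/Object;)Z"
BRANCH = re.compile(r"^\s*if-eqz\s+(\w+),\s*(:\w+)\s*$")


def drop_equals_branches(smali_text):
    """Comment out if-eqz branches that test a String.equals result."""
    patched = []
    armed = False  # True between an equals call and its branch
    for line in smali_text.splitlines():
        stripped = line.strip()
        if EQUALS_CALL in stripped:
            armed = True
        elif armed and stripped.startswith("move-result"):
            pass  # the result is being fetched, stay armed
        elif armed and BRANCH.match(line):
            line = "    # patched: " + stripped
            armed = False
        elif stripped:
            armed = False  # any other instruction ends the pattern
        patched.append(line)
    return "\n".join(patched) + "\n"


# Simplified, hand-written stand-in for the real method body.
SAMPLE_SMALI = """\
    invoke-virtual {v0, v1}, Ljava/lang/String;->equals(Ljava/lang/Object;)Z

    move-result v0

    if-eqz v0, :cond_0

    # success path
    return-void

    :cond_0
    # failure path
    return-void
"""
```

Since # starts a comment in smali, the patched file still assembles and the unused label is harmless.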
 
Now, we recompile:
apktool.bat b [APPLICATION_FOLDER] -o ModifiedApp.apk

And we sign using our malicious key:

 jarsigner -verbose -sigalg SHA1withRSA -digestalg SHA1 -keystore hackme_testapp_evil.p12 ModifiedApp.apk hackme_testapp_evil 

Then, deploy:

 

adb -s emulator-5554 install -g [PATH_TO_MODIFIED_APK]\ModifiedApp.apk
 

 

Debugging

 Debugging in this case, where we can import the decompiled application to Android Studio, is relatively simple. Also, in this case, the credentials are hardcoded so if you can read Java code, you should be able to understand the verification mechanism without going through a debugging process. In any case, let us assume that the algorithm is much more complex and you would like to step through the verification using a debugger. Before debugging the application you can deploy using either adb or Android Studio. Mind that if the application is protected using anti-reversing frameworks like ProGuard, you will not be able to deploy the application through Android Studio. For those cases, you need adb. Once it is deployed you can attach the debugger through Android Studio. The ProGuard case will make more sense when I dive into such topic. For now, let us keep it simple. With the project on Android Studio:

  • Click on the grey bar at a location aligned with the instruction where you want execution to stop:

Android Breakpoint

  • Run->Run ‘[application name]’ and select the device. Press OK
  • Confirm on the emulator that the application was launched
  • On top, click “Attach debugger to Android Process” and select the application. Keep the debugger type as Auto. According to the documentation, if you don’t have native code like C or C++, the debugger will only debug Java code, otherwise, it will switch accordingly (called Dual mode). You should see this:

Debugger is attached

  • Insert the username and the password and click Login. You should see:

Breakpoint Hit

 From this point on, just jump around using the typical instructions: Step Over, Step Into, Step Out, etc. The Variables window allows you to inspect the fields of the Java objects that are within the context of the analysed code.

Final Notes

 The example I have presented here was very simplistic. In the absence of an obfuscator/anti-reversing mechanism, you can disrupt the workflow of an application by modifying the smali code or by creating a whole new application with Android Studio and modifying it to taste. Keep in mind you must sign the application before deploying it. If ProGuard had been enabled on Android Studio, the reversing process would have been much less smooth. I will dive into that subject on another post later. In any case, this should help you understand the basics. From here, it is a matter of scaling/increasing effort and time to achieve your purpose.

Stay safe 😉

Android Reversing Part 2: Tools

Hello paranoids

 After reading my previous article, you should be ready to read this one. On this article, i will go through the most known tools to reverse Android APKs. I will split the article in multiple sections and provide tools according to a given motivation (no sense in providing tools without a use case). I will only provide information regarding free tools. Bear in mind that i have not used all of the tools yet since for the purposes of my learning i have not found them to be relevant. You can think of this post as a means to kickstart your Android reversing career.

Getting the APKs

First of all, we need to get the APKs. For now i am mostly interested in malware so i can get it while working 🙂 or through services such as VT (virustotal.com). Now, assuming we need to break or tamper with some legitimate application, there are multiple alternatives:

Using a real phone

Android Debug Bridge (adb): comes with Android Studio or SDK tools for Android. This tool allows you to communicate with an external device or Android emulator. I will describe it in more detail in a while but for now it suffices to know that you use it to get files from the device:

adb pull /path/to/apk/in/device/or/emulator /path/in/your/computer
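
If you do not know where the APK lives on the device, the package manager can tell you. The following is a minimal Python sketch, assuming adb is on your PATH and that `pm path` prints lines of the form `package:/path/to/file.apk`. The helper names and the package name are hypothetical:

```python
import subprocess

def parse_pm_path(output):
    """Extract APK paths from the text printed by `pm path <package>`."""
    prefix = "package:"
    return [line.strip()[len(prefix):]
            for line in output.splitlines()
            if line.strip().startswith(prefix)]

def pull_apks(package, dest="."):
    """Ask the device where the package is installed and pull every APK."""
    out = subprocess.run(["adb", "shell", "pm", "path", package],
                         capture_output=True, text=True, check=True).stdout
    for remote in parse_pm_path(out):
        subprocess.run(["adb", "pull", remote, dest], check=True)

# Example (needs a connected device or emulator):
# pull_apks("com.example.app")
```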

File managers: Just search AppStore for APK extractors. The problem with these is that you have to install an app to extract another app. I would go for the adb so you get used to it (trust me you will need it).

Without a real phone

Automated Analysis

Disassembling

Getting the “Original Code” and resources

 The usage of “Original Code” requires some explaining. When it comes to compiled and interpreted languages such as Java, C#, the compilation process causes data to be lost (e.g. comments, lines of code). Decompilation tools show you an interpretation of the bytecode which may or may not be the same as the original code. Bear in mind that compilers tend to optimise the code you write. The more advanced and high-level the language the more optimisations are performed leading to code looking less similar to the original.

  • dex2jar (sourceforge.net/projects/dex2jar/): Dex2jar is a set of scripts with multiple capabilities (e.g. APK checksumming, disassembling) but the interesting script is the one used to convert .dex files to .jar files. From there,  JD-GUI can be used to look at decompiled  bytecode.
  • JD-GUI (jd.benow.ca/): Decompiler for Jar files. You can use it to obtain a readable representation of what may have been the original code. You can also choose to look at bytecode. You can use JD-GUI to export the decompiled .class files to Java files keeping the application hierarchy of packages. This is useful to then re-create the application on an IDE (e.g. Android Studio, Eclipse).
  • apktool (ibotpeaches.github.io/Apktool/):  Set of utilities (e.g. decoding of XML files for resources, decompilation to smali files). This tool can also be used to rebuild an APK from a folder containing decompiled resources and smali files. 
  • aapt: This is a tool to decode ARSC files and comes with Android SDK tools. I have found that apktool fails to do so.
  • AXMLPrinter2 (code.google.com/archive/p/android4me/downloads): You can use this tool to decode XML artifacts inside the APK (e.g. AndroidManifest.xml).
  • Androguard (github.com/androguard/androguard): Python framework to mess with APK files. The features of the framework are similar to the ones i have previously referred.

Debugging

When it comes to debugging, you may either step through Java code or Smali. In any case, you can use the same tools:

 

OS Distributions

 Santoku (santoku-linux.com/download/) is the typical Linux distribution for Android analysis. It is basically a swiss army knife  to hack the crap out of Android devices and applications.

 

Final Notes

 It is said that you are only as good as the tools you use. This post was meant to show you some tools you can use as an Android reverser. I have overviewed automated tools, disassemblers, decompilers, decoders as well as OS distributions. I hope you find this material useful for your hacks.

 

Stay safe 😉

Android Reversing Part 1: Internals

Hello paranoids

A few weeks ago i have taken on the challenge of learning how to reverse Android APKs. I have developed a bit for Android in the past but i have never delved that much into the Android world. However, as a reverser, i am always curious about the internals of other “binaries”.  As such, i am creating a small series of posts:

  1. Android Reversing Part 1: Internals
  2. Android Reversing Part 2: Tools
  3. Android Reversing Part 3: Tampering with Android Applications

Also, following the normal flow of my blog, i will reverse a real Android apk. At this point i am not sure if i will reverse a malware or just a commercial application protected with anti-reversing frameworks. Will see.

Since i am constantly learning and this is a new topic, i may update the posts over time to reflect my findings. I am starting these posts early since they provide me with a structured way of learning, by teaching. I will try to keep the posts light and straight to the point.

APK Anatomy

Let us take an example of a malware: 513fef5af719b6bb7d7760007aca2f49 a.k.a Android Autoinst. First of all, an APK is a ZIP file. You can open it with either 7-ZIP or with the Windows ZIP explorer. So, we have:

AutoInst contents

Sticking to the standard files/folders:

  • META-INF/: Contains files with SHA-1 digests of some of the files within the APK, for integrity verification purposes. It also contains the certificate of the application. I have noticed that some files inside res/ (e.g. strings files) have no digest within the files i referred.
  • res/: XML layout files and PNGs/JPEGs used by the application
  • AndroidManifest.xml: describes the application: permissions, activities (screen processors, roughly speaking), packages
  • resources.arsc: contains precompiled application resources in binary XML
  • classes.dex: compiled classes (our main target). This can be opened using IDA

Other files/folders that may be present:

  • lib/: Native libraries
  • assets/: Similar to res/ but, while the files on the latter are “interpreted” by Android (e.g. taking into account the language or screen orientation/size), these are not. This folder may be used to store text files or other resources that are read by the application code (e.g. text file containing application configurations)

Let us ignore the other files since they seem specific to some compilation/packaging process.
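
Since an APK is just a ZIP file, you can also list its contents from a script. The following is a minimal Python sketch that groups the entries by the standard files and folders discussed in this section. The function name and the grouping are illustrative choices, not part of any Android tool:

```python
import zipfile

# Folders and top-level files discussed in this section.
STANDARD_FOLDERS = ("META-INF/", "res/", "lib/", "assets/")
TOP_LEVEL_FILES = ("AndroidManifest.xml", "resources.arsc", "classes.dex")

def summarise_apk(apk):
    """Group the entries of an APK (path or file object) by standard location."""
    groups = {}
    with zipfile.ZipFile(apk) as zf:
        for name in zf.namelist():
            if name in TOP_LEVEL_FILES:
                key = name
            else:
                # Anything outside the known folders ends up under "other".
                key = next((p for p in STANDARD_FOLDERS if name.startswith(p)), "other")
            groups.setdefault(key, []).append(name)
    return groups

# Example: print(summarise_apk("some.apk").keys())
```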

Dalvik, DEX, ByteCode, Smali, Jasmin …

 While saying Android is Java is not correct (the VM, GUI programming, classes and multithreading code differ), the truth is that a Java developer can read Android code relatively easily once they understand the workflow of Android applications. In terms of the operating system, Android’s kernel is a Linux kernel. As for the execution of applications, while the JVM is used to run Java applications, Android applications are executed by the Dalvik Virtual Machine (DVM). Android applications’ source files are first compiled to .class files using the typical Java compiler and then compiled into .dex files using the dx tool (comes with the Android SDK).

 If you try to open a .dex file you will find it to be unreadable. It is like opening a binary using hexdump. So, is there an assembly equivalent for dex? (remember that assembly is just a “Human Readable” representation of opcodes). Yes!
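
Even before disassembling, the first bytes of a .dex file are easy to interpret. The following is a minimal Python sketch, assuming the header layout from the public DEX format description: an 8-byte magic, a little-endian Adler-32 checksum, a 20-byte SHA-1 signature and the file size. The function name is hypothetical:

```python
import struct
import zlib

def parse_dex_header(data):
    """Read the first fields of a DEX header (integers are little-endian)."""
    magic = data[:8]
    # The magic looks like b"dex\n035\x00": tag, 3-digit version, NUL byte.
    if magic[:4] != b"dex\n" or magic[7:8] != b"\x00":
        raise ValueError("not a DEX file")
    checksum, = struct.unpack_from("<I", data, 8)
    file_size, = struct.unpack_from("<I", data, 32)
    return {
        "version": magic[4:7].decode("ascii"),
        "checksum": checksum,
        # The Adler-32 checksum covers everything after the checksum field.
        "checksum_ok": (zlib.adler32(data[12:]) & 0xFFFFFFFF) == checksum,
        "sha1": data[12:32].hex(),
        "file_size": file_size,
    }

# Example: parse_dex_header(open("classes.dex", "rb").read())
```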

Some History

A few years ago, Jasmin (jasmin.sourceforge.net), a free open source assembler, was developed by Jon Meyer and Troy Downing. The assembler would take an ASCII representation of Java classes, methods and fields and would compile them into .class files which would then be runnable using the JVM. The assembler was developed at a time when Sun did not provide an assembler or even a standard language to represent the bytecode. As such, Jasmin became the first assembler as well as a standard for the representation of bytecode in a readable way. Nowadays, tools such as Jasper (github.com/kohsuke/jasper), a Java .class disassembler, as well as smali/baksmali (github.com/JesusFreke/smali), an assembler/disassembler for the DEX format, use Jasmin’s syntax (the disassembly of DEX files produced using smali/baksmali is typically called Smali).

Android Basics

Now, this part is a bit more boring and developer.android.com is your best friend. I am no Android developer but i am relatively familiar with the workflow of the applications. It is important to have a slight grasp of how Android projects are structured so you can look at a decompiled version of the APK.

Activities

Roughly speaking, Activities are classes that represent the logic behind the screens you see and interact with on an Android app. Activities have standard methods that are called to initialise the current screen interface. Whenever you want to jump to another screen or use an Android functionality (e.g. camera), you typically (unless this changed) need to create an Intent.

Views

Views are building blocks for Android application interfaces. Buttons, textboxes, progress bars are examples of Views. Views that trigger events (e.g. buttons) have handlers that are basically pieces of code that process interactions. Assuming a button, you define a function to handle clicks. The concept of events and handlers is pretty much used on every programming language that provides support for GUIs (e.g. C# and .NET). Views are identified through IDs which are in essence integers stored on R.java files.

Services

Code that executes in the background with no user interface. This is similar to Windows services. Services can be used to run tasks in the background decoupled from Activities. They are not the same thing as threads, and when to use one instead of the other is off topic.

Manifest and Permissions

Manifest files are interesting pieces of information for reversing purposes. By analysing them you can infer which activity is created first when you execute the application (e.g. search for android.intent.action.MAIN and android.intent.category.LAUNCHER). The manifest file may also declare permissions, which are helpful to tell you what the application requires (e.g. android.permission.SET_WALLPAPER allows the application to change the phone wallpaper).
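
Both lookups are easy to script once the manifest has been decoded to plain XML (the copy inside the APK is binary XML, so it has to go through a decoder first). The following is a minimal Python sketch that uses only the standard library. The function names are hypothetical:

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in this XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def launcher_activities(manifest_xml):
    """Return the names of activities whose intent filter has MAIN + LAUNCHER."""
    root = ET.fromstring(manifest_xml)
    found = []
    for activity in root.iter("activity"):
        for flt in activity.iter("intent-filter"):
            actions = {a.get(ANDROID_NS + "name") for a in flt.iter("action")}
            cats = {c.get(ANDROID_NS + "name") for c in flt.iter("category")}
            if ("android.intent.action.MAIN" in actions
                    and "android.intent.category.LAUNCHER" in cats):
                found.append(activity.get(ANDROID_NS + "name"))
    return found

def requested_permissions(manifest_xml):
    """Return the permissions declared through <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    return [p.get(ANDROID_NS + "name") for p in root.iter("uses-permission")]
```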

Final Notes

On this first post i intended to overview the basics of Android applications and APKs. This will be necessary for you to understand the next posts. As i said at the beginning of this post, i am still learning about Android reversing so i may update this article in the future with more information.

Stay safe 😉

63 Problems But Malware Ain’t One: 8ca23d7bdf520c3e7ac538c1ceb7b555

Hello paranoids

Recovered from my previous post? No? Great! My overall objectives for the previous post were to:

  • Show you how to unpack a malware
  • Unpacking constructions (e.g. anti-debugging, shellcode, dynamic resolution of dependencies)

On this post, i intend to:

  • Go over some network/host tracks left by the malware
  • Malware supported commands and features

The IDA database resulting from this analysis will be added to my GitHub repository here so you can check out the comments i left there. You may find that some functions changed name when compared to the pictures i provide. However, if you understand what i am saying here, the comments and function names on the database should be clear. Every URL on this article will be defanged using [] to surround one of the dots.

Characteristics

MD5: 8ca23d7bdf520c3e7ac538c1ceb7b555
Family name: DoFoil a.k.a Smoke Loader (unpacked shellcode)
Packing algorithm: custom

Analysis Environment

Tools: OllyDbg, IDA Pro
Environment: VMware with 64-bit Windows 7

Functions

If you read the previous post, you should remember how i got the functions used by this piece of malware. The malware does not load new dlls or call standard functions in any shady way. As such, this list is accurate:

atoi
CharLowerA
CloseHandle
CoCreateInstance
CoInitialize
CoUninitialize
ConvertStringSecurityDescriptorToSecurityDescriptorA
CopyFileA
CopyFileW
CreateDirectoryW
CreateEventA
CreateFileA
CreateFileMappingA
CreateFileW
CreateMutexA
CreateProcessInternalA
CreateProcessInternalW
CreateRemoteThread
CreateThread
CreateToolhelp32Snapshot
CryptAcquireContext
CryptCreateHash
CryptDestroyHash
CryptGetHashParam
CryptHashData
CryptReleaseContext
DeleteFileA
DeleteFileW
ExitProcess
ExpandEnvironmentStringsW
FreeLibrary
GetComputerNameA
GetCurrentProcessId
GetCurrentThreadId
GetFileAttributesExA
GetFileSize
GetForegroundWindow
GetModuleFileNameA
GetModuleFileNameW
GetModuleHandleA
GetProcAddress
GetProcessHeap
GetSystemDirectory
GetTempFileNameA
GetTempFileNameW
GetTempPathA
GetTempPathW
GetTokenInformation
GetVersion
GetVolumeInformationA
LdrGetDllHandle
LdrProcessRelocationBlock
LoadLibrary
LoadLibraryW
lstrcmpW
lstrcatA
lstrcatW
lstrcmpA
lstrlenA
lstrlenW
MapViewOfFile
MultiByteToWideChar
ObtainUserAgentString
OpenFileMappingA
OpenProcess
OpenProcessToken
OpenThread
Process32First
Process32Next
ReadFile
ReadProcessMemory
RegCloseKey
RegCreateKeyA
RegEnumKeyA
RegEnumValueW
RegNotifyChangeKeyValue
RegOpenKeyA
RegOpenKeyExA
RegQueryValueExA
RegSetValueExW
ResumeThread
RtlAllocateHeap
RtlReAllocateHeap
RtlAddVectoredExceptionHandler
RtlComputeCrc32
RtlFreeHeap
RtlGetLastWin32Error
RtlGetVersion
RtlMoveMemory
RtlRemoveVectoredExceptionHandler
RtlZeroMemory
SHGetFolderPathW
SetFileAttributes
SetFileAttributesA
SetFileTime
SetKernelObjectSecurity
ShellExecuteW
Sleep
SuspendThread
Thread32First
Thread32Next
VirtualAlloc
VirtualFree
VirtualProtect
VirtualQuery
VirtualQueryEx
WaitForSingleObjectEx
WinHTTPCloseHandle
WinHTTPConnect
WinHTTPCrackUrl
WinHTTPGetProxyForURL
WinHTTPOpen
WinHTTPOpenRequest
WinHTTPReadData
WinHTTPReceiveResponse
WinHTTPSendRequest
WinHTTPSetOption
WinHttpGetIEProxyConfigForCurrentUser
WriteFile
WriteProcessMemory
wsprintfW
wsprintfA
ZwCreateSection
ZwMapViewOfSection
ZwQueryInformationProcess
ZwQueueApcThread
ZwUnmapViewOfSection

According to this list, we can infer the following capabilities:

  • Networking: WinHTTPConnect, WinHTTPReadData
  • File system manipulation: DeleteFileA, WriteFile
  • Registry manipulation: RegCreateKeyA, RegOpenKeyA
  • File mappings : CreateFileMappingA, MapViewOfFile
  • Processes/Thread enumeration: CreateToolhelp32Snapshot,  Process32First, Thread32First
  • Hashing and integrity computations: CryptCreateHash, RtlComputeCrc32
  • COM: CoInitialize, CoUninitialize

Some of the functions referred previously were not used by the sample i had so, an analysis based on hardcoded addresses for standard functions is not enough.

String Decoding

URLs, HTTP parameters, registry keys, file names, folder names and so on are kept encoded internally. The strings are decoded by three functions:

URLs (decoded by “url_encoder_decoder”):

Beacon format strings, registry related data (decoded by “string_encoder_decoder_1”):

  • “%d#%s#%s#%d.%d#%d#%d#%d#%d#%d”
  • “%d#%s#%s#%d.%d#%d#%d#%d#%d#%s”
  • “http://www.microsoft[.]com/”
  • “Software\Microsoft\Internet Explorer”
  • “Software”
  • “Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run”
  • “Software\Microsoft\Windows\CurrentVersion\Run”
  • “Microsoft One Drive”
  • “Software\Microsoft\Windows\CurrentVersion\Uninstall”
  • “sample”
  • “System\CurrentControlSet\Services\Disk\Enum”
  • “advapi32.dll”
  • “Location:”
  • 2015
  • “plugin_size”
  • “explorer.exe”
  • “%s%08X%08X”
  • “%08X”
  • “Work”
  • “user32”
  • “shell32”
  • “advapi32”
  • “urlmon”
  • “ole32”
  • “winhttp”
  • “HelpLink”
  • “URLInfoAbout”
  • “sbiedll”
  • “dbghelp”
  • “qemu”
  • “virtual”
  • “vmware”
  • “xen”
  • “ffffcce24”
  • “svcVersion”
  • “Version”
  • “Version”

Path format string, paths, shell commands and HTTP parameters (decoded by “string_encoder_decoder_2”):

  • “%s\%s”
  • “%s%s”
  • “regsvr32 /s %s”
  • “%s\%s.lnk”
  • “%APPDATA%\Microsoft”
  • “%TEMP%”
  • “%ComSpec%”
  • “.exe”
  • “.dll”
  • “/c start %s && exit”
  • “:Zone.Identifier”
  • “GET”
  • “POST”
  • “Content-Type: application/x-www-form-urlencoded”
  • “runas”

Both “string_encoder_decoder_1” and “string_encoder_decoder_2” call “encodes_decodes_string_using_rc4”, each with a different four-byte key. In other words, the last two sets of strings are internally encoded using RC4.
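
For reference, RC4 is a symmetric stream cipher, which is why a single function can serve as both encoder and decoder. The following is a minimal Python sketch of textbook RC4. The four-byte key in the usage note is made up for illustration and is not one of the keys used by the sample:

```python
def rc4(key, data):
    """Textbook RC4: the same call encrypts and decrypts."""
    # Key-scheduling algorithm (KSA): permute the state using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA): XOR the keystream with data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)
```

Calling `rc4(b"\x01\x02\x03\x04", rc4(b"\x01\x02\x03\x04", b"explorer.exe"))` gives back `b"explorer.exe"`, since applying the cipher twice with the same key restores the input.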

Preamble and Process Hollowing

The malware starts by resolving the dependencies i have referred previously on this article and then proceeds to check the Windows version. If the operating system is Windows Vista or above, the malware leverages the Windows Integrity Mechanism. You can find lots of resources online explaining this mechanism. The Windows Integrity Mechanism is similar to SELinux Mandatory Access Control, where an object of lower integrity can’t interfere with an object of higher integrity. The notion of separation between non-privileged users and privileged users (e.g. administrators) has existed in Windows versions previous to Vista (e.g. a user process cannot read/write files belonging to an administrator). However, starting on Windows Vista, even when you have an account with administrative privileges, you get a prompt (our beloved UAC) every time you attempt to execute a binary that requires elevated privileges. This is the mechanism in action, which attributes a default medium integrity to the applications launched by authenticated users.

The malware checks the level of integrity it runs at and then creates an empty DACL with the SE_DACL_PROTECTED attribute enabled. This prevents any inheritance of security descriptor information from the parent process.

After the Sleep loop we have an if that checks whether the second argument for the main function is zero. If you remember from the previous post this argument is an address (250000h in my case):

main4

The reason for this check is only understood once you look at the left branch of the function. On the left branch we have:

Anti-analysis:

The method “checks_file_name_volume_loaded_modules_and_registry” (see picture below):

  • Checks whether the file name contains the word “sample”
  • Checks whether the volume serial is 0CD1A40h or 70144646h (volume serials for sandboxes)
  • Checks if sbiedll.dll (sandboxie) or dbghelp.dll (Windows DbgHelp library) have been loaded in memory
  • Queries “System\CurrentControlSet\Services\Disk\Enum\0” for the name of the primary volume and checks if the name contains: “qemu”, “virtual”, “vmware” or “xen”

If any of the above conditions is met, the malware enters an everlasting sleep.
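
The four checks can be summarised as a pure function over values that the malware collects from the host. The following is a minimal Python sketch and not a reimplementation of the sample's routine. The function and parameter names are hypothetical, and the case-insensitive comparison is an assumption:

```python
SANDBOX_SERIALS = {0x0CD1A40, 0x70144646}
WATCHED_MODULES = {"sbiedll.dll", "dbghelp.dll"}
VM_MARKERS = ("qemu", "virtual", "vmware", "xen")

def looks_like_analysis_box(file_name, volume_serial, loaded_modules, disk_name):
    """Return True if any of the four checks listed above matches."""
    # 1. File name contains the word "sample".
    if "sample" in file_name.lower():
        return True
    # 2. Volume serial belongs to a known sandbox.
    if volume_serial in SANDBOX_SERIALS:
        return True
    # 3. Sandboxie or DbgHelp library loaded in memory.
    if any(m.lower() in WATCHED_MODULES for m in loaded_modules):
        return True
    # 4. Disk name (Services\Disk\Enum\0) mentions a virtualisation product.
    return any(marker in disk_name.lower() for marker in VM_MARKERS)
```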

Privilege elevation:

The malware then tries to elevate privileges (shell_execute) using a trick that involves ShellExecuteEx with a verb “runas”. The user will be prompted with the typical UAC box to authorise the elevation. This will only occur if the integrity of the process is below medium.

On “spawns_new_process_and_replaces_it_with_this”, the malware spawns explorer.exe and the malicious code from this malware is loaded into it. This technique is typically called “Process Hollowing”: a non-malicious process is spawned in a suspended state and its content is overwritten with malicious code, which is then executed when the process is resumed. The current process then exits. I am sure you will never look at your explorer.exe process the same way.

The last point should make the right branch clearer. The right branch is executed by the newly spawned process. The picture below depicts the last chunk of the main function.

Analysing the right branch

Due to the complexity of this malware, i will focus on points that i consider essential and describe the overall operation of the binary. Once the process hollowing is finished, the malicious code (now executing inside explorer.exe) runs the code on the right side of the branch. The malware beacons http://www.microsoft[.]com/ to check for network connectivity. If the connection or the subsequent request fails, or if fewer than ten bytes are read, the malware sleeps and retries later. Once connectivity is confirmed, the malware creates a mutex whose name has the following structure (padded_volume_serial is the volume serial padded with zeroes to eight digits):

[MD5([computer_name]45386319[padded_volume_serial])][padded_volume_serial]
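
A defender can precompute this name for a given host. The following is a minimal Python sketch, assuming the serial is rendered with the “%08X” format string seen among the decoded strings and that the MD5 digest is rendered as uppercase hex over ASCII input. Both renderings are assumptions, and the function name is hypothetical:

```python
import hashlib

def mutex_name(computer_name, volume_serial):
    """Build [MD5(name + "45386319" + serial)][serial] for a host."""
    padded_serial = "%08X" % volume_serial          # zero-padded to eight digits
    seed = computer_name + "45386319" + padded_serial
    digest = hashlib.md5(seed.encode("ascii")).hexdigest().upper()
    return digest + padded_serial

# Example: mutex_name("WIN7-PC", 0x001A2B3C)
```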

If the malware cannot create the mutex because it already exists (host already infected), the malware posts (before exiting):

2015#[mutex_name]#22222#[windows_major_version].[windows_minor_version]#[service_pack_major_version]#[integrity_level]#10001#13#0

to the remote URL:

http://hsbc-auth-2[.]ru/smk/index.php

integrity_level is 1 if integrity is below medium or 0 otherwise.

The malware then exits. If, on the other hand it proceeds (mutex is successfully created), it tries to achieve persistence.

Persistence Mechanism

It attempts to resolve the string “%APPDATA%\Microsoft” and then, if ExpandEnvironmentStringsW failed for the former, “%TEMP%”. The chosen folder will host a copy of the binary with a name resembling the following structure:

[a-z]{8}.exe

The eight letters are generated from the first 8 bytes of the mutex name by another encoding function, which maps the bytes to a limited set of characters (a-z). The malware then checks the following keys:

  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run
  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run

In order to understand this step, i am going to provide an example. Let us say the host has a software X installed which must be executed on startup. This can be done by putting the main software binary on Windows startup folder or by creating a subkey under one or both of the previous registry keys. As an example:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run\X

This subkey would contain as one of the values the path for the binary belonging to X software (e.g. X.exe). Malware typically leverages the first and/or second keys to achieve persistence across host boots. This malware attempts to create a subkey under the first key. If this attempt was not successful, the malware attempts to create a subkey under the second key.

In order to be stealthier, it checks the Run keys, picks the name of a subkey belonging to an already persistent application and uses it. If the malware fails to get any name from the Run keys, it picks an application name from:

“HKEY_CURRENT_USER\Software\”

This would still be stealthy since this key contains the names of the software installed on the machine. If the malware fails to get a name from this last key, it uses the name “Microsoft One Drive” (who is going to suspect Microsoft?).

If the malware fails to set any of the Run keys, it creates a .lnk file on the Windows startup folder for the binary that, as previously referred, will be either on “%APPDATA%\Microsoft” or “%TEMP%”. The malware then copies itself to one of these directories and spawns a thread that keeps the malware persistent (i have seen this behaviour before when i analysed an Andromeda downloader). The malware then spawns a thread to beacon the server and deal with commands from the latter. As a curiosity, the timestamps (MACE) of the file copy are changed to the ones of “advapi32.dll”. This tampering with timestamps is called timestomping and it is used to confuse analysts, who would not expect a recently dropped file to carry MACE timestamps as old as the ones belonging to standard files such as Windows dlls.
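
To make the timestomping idea concrete, the following is a minimal Python sketch that copies the access and modification times of a reference file onto another file. It is only an approximation of what the sample does with SetFileTime: `os.utime` does not touch the creation time, and the function name is hypothetical:

```python
import os

def copy_timestamps(reference, target):
    """Give `target` the access/modification times of `reference`."""
    st = os.stat(reference)
    # Nanosecond precision avoids rounding the copied values.
    os.utime(target, ns=(st.st_atime_ns, st.st_mtime_ns))
```

When hunting for this trick, a file under “%TEMP%” or “%APPDATA%\Microsoft” whose timestamps match those of a system dll is worth a second look.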

Client-Server Interactions

The function responsible for this part is depicted below:

I am going to start with “beacons_server_and_processes_requests” which is the core function used to interact with the remote server. The function “load_dll_in_memory_and_call_export” will be described later.

beacons_server_and_processes_requests

If you remember what i referred previously, the malware beacons one URL before exiting if there is an ongoing infection. Once the malware is executing properly, it cycles across URLs using the URL generator below (“generates_beacon_url” on IDA database):

domain_generator

The malware stores the URLs encoded. The data stored on 295020h is used as an index to choose the encoded URL. “encoder_decoder” is the function used to decode URLs (called “url_encoder_decoder” on the IDA database). A search for the address 295020h on IDA gives us:

domain_generator_seed

The address is firstly accessed on “executed_by_spawned_process” where it is initialised. Besides “generates_beacon_domain”, only “beacons_server_and_processes_command” accesses the address (at least using the address directly). The variable stored on this address is changed and (consequently) the URLs are cycled by this last function under certain conditions (more about this later).

Then we have the function “opens_file_and_decodes_it”:

This function attempts to open the file “[a-z]{8}[a-z]{8}” which is present on either “%TEMP%\” or “%APPDATA%\Microsoft\” according to where the binary was copied to. As you can see, the binary reads 15 bytes starting at a non-zero offset inside the file. This function returns (on eax) either zero (no file) or a pointer to a hex array containing the 15 bytes read from it. These 15 bytes will be part of the beacon sent to the remote server to get a command. I assume this is a means to identify the victim. Another option would be that this array is used to authenticate the malware. However, this would represent a weak authentication since on the first interaction the malware would have no means to authenticate itself (the file does not exist).

“beacons_server_and_processes_command” is then called. This function is the biggest on the whole malware and, as such, i will not post pictures of it. I will go through the parts that i consider important. As said on the previous paragraph, the malware requests a command by beaconing the server. The structure of the beacon is the following:

2015#[mutex_name]#22222#[windows_major_version].[windows_minor_version]#[service_pack_major_version]#[integrity_level]#10001#0#[15_bytes_hex_array]
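
Both beacons seen so far line up with the two format strings listed in the String Decoding section. The following is a minimal Python sketch of how they could be assembled. The mapping of each format string to each beacon is my reading of the fields, and the function names are hypothetical:

```python
# Format strings taken from the decoded strings list.
INFECTED_FMT = "%d#%s#%s#%d.%d#%d#%d#%d#%d#%d"
COMMAND_FMT = "%d#%s#%s#%d.%d#%d#%d#%d#%d#%s"

def already_infected_beacon(mutex, major, minor, sp_major, integrity_level):
    """Beacon posted when the mutex already exists (ends in #13#0)."""
    return INFECTED_FMT % (2015, mutex, "22222", major, minor,
                           sp_major, integrity_level, 10001, 13, 0)

def command_beacon(mutex, major, minor, sp_major, integrity_level, file_hex):
    """Beacon used to request a command (ends in #0#[15_bytes_hex_array])."""
    return COMMAND_FMT % (2015, mutex, "22222", major, minor,
                          sp_major, integrity_level, 10001, 0, file_hex)
```

Python's `%` operator follows the same conventions as wsprintf for `%d` and `%s`, so the constants can be used unchanged.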

Once the command is received, we have the typical if-else construction to process the command. The server response may contain a dll which is stored on either “%TEMP%\” or “%APPDATA%\Microsoft\” with the name “[a-z]{8}[a-z]{8}” (sounds familiar?). In Dofoil terminology this is called a plugin which will be later loaded when the function “load_dll_in_memory_and_call_export” is called. This function simply maps the dll in memory, decodes it and calls one of its exported functions (called “Work”). The MACE timestamps for the new file are also tampered to be the same as the ones belonging to “advapi32.dll”.

As for the supported features, besides the typical ExitProcess (server orders the malware to terminate process) and the deletion of the malicious binary (used together in this case), as well as the plugin functionality, the malware supports four features which are related to the file type embedded on the server response (yes, there is another binary).  If the server embeds an executable, the malware will (depending on another response field) take one of the following actions:

  • Write the executable to disk and execute it using CreateProcessInternal
  • Map the binary to an allocated region of memory and call its entrypoint

If on the other hand, the embedded file is a dll, the malware will perform one of the following actions:

  • Write the dll to disk and load it using LoadLibrary
  • Write the dll to disk and register the former as a global dll using the command “regsvr32 /s”
  • Copy the dll into an allocated region of memory and call its entrypoint

Files that are written to disk are written using the following procedure:

  1. Use GetTempPathW and GetTempFileNameW to create a temporary file.
  2. Delete the created file
  3. Use the fullpath of the temporary folder, append “.exe” or “.dll” according to the embedded file and create a new file.
  4. Write the binary (.dll or .exe) to the file.
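
The four steps can be sketched in a few lines of Python. This is a minimal sketch, assuming the temporary file name is reused with the extension appended to it. The function name is hypothetical, and `tempfile.mkstemp` stands in for GetTempPathW plus GetTempFileNameW:

```python
import os
import tempfile

def write_with_extension(payload, extension):
    """Reserve a temporary name, then recreate it with a new extension."""
    fd, tmp_path = tempfile.mkstemp()       # step 1: create a temporary file
    os.close(fd)
    os.remove(tmp_path)                     # step 2: delete the created file
    final_path = tmp_path + extension       # step 3: append ".exe" or ".dll"
    with open(final_path, "wb") as handle:  # step 4: write the binary
        handle.write(payload)
    return final_path
```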

I have also noticed that the server may send some flags to the binary to be used after the process launching feature is used. One of the flags tells the malware to exit while the other tells the malware to delete the file written to disk (the downloaded binary). I assume these mechanisms are used to update the malware using an approach similar to the following:

  1. New version is downloaded to Temp folder
  2. New version is spawned by the current active malware.
  3. Active malware either exits or just deletes file on temporary folder.

Between the two options, my guess is that it is a matter of leaving traces or not. When the system boots, the old malware will not be spawned again and the new sample is executed instead.

URL Cycling

Looking at the disassembled function “beacons_server_and_processes_command”, the address 295020h (dictates the beaconed URL) is incremented when:

  • The  creation of the first beacon (used to get the command) fails on wsprintf (number of bytes written is zero).
  • If the first character on the server response is ‘<‘
  • If the first dword on the server’s response has a value greater than the size of the data sent on the beacon (the formatted string i have referred on the first bullet). Mind that when i refer to the “size of the data sent on the beacon” i mean the number of bytes outputted by the function wsprintf which is used to create the beacon. I assume this first dword tells the malware the number of bytes received by the server. If this number does not match what was sent, this represents an erratic behaviour. The URL is changed and the routine returns.
  • If the CRC computed on the downloaded data fails. I did not refer this CRC feature but the malware seems to use the current URL string as a means to check the integrity of the response using RtlComputeCrc32.
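
The first three conditions can be expressed as a small decision function. The following is a minimal Python sketch, assuming the first dword is little-endian and that the index wraps around once the last URL is reached. Both are assumptions, the CRC condition is left out because i did not describe its details, and the function names are hypothetical:

```python
import struct

def should_switch_url(beacon_len, response):
    """Apply the first three URL cycling conditions listed above."""
    if beacon_len == 0:                      # wsprintf wrote zero bytes
        return True
    if response[:1] == b"<":                 # response starts with '<'
        return True
    if len(response) >= 4:
        first_dword, = struct.unpack_from("<I", response, 0)
        if first_dword > beacon_len:         # dword greater than beacon size
            return True
    return False

def next_url_index(index, url_count):
    """Advance the index kept at 295020h (wrap-around is an assumption)."""
    return (index + 1) % url_count
```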

Final thoughts

Dofoil/Smoke Loader is a nasty piece of malware. While the host related activity is relatively simple to analyse, the structure of the requests and responses is cumbersome. This makes sense since Dofoil is not a complete malware but simply a modular vessel capable of deploying new malware, injecting into processes and mapping malicious binaries (sent by the server) into memory. The nasty functionality comes from downloaded binaries and/or plugins.

I have relied heavily on static analysis and selective execution of malware sections of code. My objective was to extract strings and determine the overall behaviour of the malware (e.g. modified registry keys, created files, beacon structures). The malware had lots of runtime checks and complex constructions. As such, selective execution of code is desired to avoid waiting for breakpoints to be hit. Signing out!

Stay safe 😉

Unpacker for Hire: 8ca23d7bdf520c3e7ac538c1ceb7b555

Hello paranoids

As i referred on my previous post, i have started a Reverse Engineering/Malware Analysis journey. One of the topics i find most interesting about malware is packing. A packer is an algorithm that manipulates a simple binary and adds one or more of the following features (this is not an exhaustive list):

  • cryptors and/or compressors: e.g. to hide internal strings/shellcode
  • anti-debugging: e.g. to detect if the malware is being debugged
  • anti-vm: e.g. to detect if the malware is running inside a Virtual Machine
  • anti-dumping: e.g. to prevent analysts from dumping a malware by destroying binary headers/code.

Packed malware typically has a small number of imports (such as LoadLibrary and GetProcAddress) which are used to resolve dependencies at runtime. The process by which a packed malware resolves its dependencies and restores its own malicious payload is called unpacking.

Enough 101. I am starting a series of posts called “Unpacker for Hire: [Malware MD5]” on which i will go through the process of unpacking some malware samples i find in the wild. This will be the first post of that series. Depending on feedback, the posts may lose or gain more detail. Unpacking does not require a massive post if i do it blindly but, if i want to understand what is going on, that requires deeper analysis and more time.

For the more experienced: you will not see me using fancy tools/plugins for the first set of posts since, for beginners, it is best to use as little automation as possible in order to understand the unpacking patterns and tricks.

I have recently discovered that the names you give to the pictures you upload appear when you hover your mouse. Some of the names i provided on my laptop are a bit off so please ignore those and stick to the picture descriptions. I also can’t seem to enforce different sizes for the headers of each section. Any word of advice for this poor noob is more than welcome.

I will describe the analysis process of the unpacked malware in a future post and i may come back and edit this post to reflect my findings. I am going to split this analysis into four stages since the unpacking process is done in that manner.

Characteristics

MD5: 8ca23d7bdf520c3e7ac538c1ceb7b555
Family name: DoFoil a.k.a Smoke Loader
Packing algorithm: custom

Analysis Environment

Tools: OllyDbg + OllyDump, IDA Pro
Environment: VMware with 64-bit Windows 7

First stage

I always start by opening the malware with IDA and checking the imports:

IDA view: malware imports

It does not look packed, but it does not look too harmful either: some certificate manipulation, file enumeration and registry manipulation. Let us look at the entrypoint:

IDA view: entrypoint for malware

The calls to LocalFileTimeToFileTime and lstrcmp are omnipresent across the packed sample and appear to serve no real purpose (garbage code). IDA did not decode the chunks of bytes you see. loc_405B25 is called if the lower word of the stack pointer is lower than or equal to 0xFE00. At the time of writing i am still not sure what the reason for this check is. I would guess the lower word of the stack address varies across Windows versions, which would allow for a simple check of the OS version. If you know what this check is for, please feel free to drop a comment below. If i find out in the future, i will let you know. loc_405B25 is depicted below:

Call to loc_405B25

This pattern is repeated over and over again (like a Russian doll) with slight changes to some instructions. This seemed like a dead end. However, if the condition is not met, sub_405065 is called. This happened both on the test environment and on Windows XP. I did not use any plugins or special configurations to avoid anti-analysis techniques, so i assume this first check is not an anti-analysis technique.

The function sub_405065 contains code used to load libraries. The names of the dlls and functions to be imported and resolved are slightly “obfuscated”. For instance, an ‘a’ was prepended to the hardcoded string sycfilt.dll before calling LoadLibrary to load asycfilt.dll. Failing to load a given dll leads to a jump to a tiny routine that sets ebp to the return value of LoadLibrary (zero, since it failed) and then returns, which crashes the process later (the stack cannot be based on address 0x0). I called this tiny function failover but i later realized that it is not a failover mechanism (it does not recover).
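To make the prepended-character trick a bit more concrete, here is a minimal sketch of how one could hunt for such truncated dll names in a list of extracted strings. The helper name and the list of known dlls are my own assumptions for illustration; they do not come from the sample.

```python
import string

# Assumed list of dll names known to exist on the analysis machine.
KNOWN_DLLS = {"asycfilt.dll", "kernel32.dll", "user32.dll"}

def find_prefixed_dlls(candidates, known=KNOWN_DLLS):
    """Return {candidate: [full names]} for every candidate that turns
    into a known dll name once a single lowercase letter is prepended."""
    hits = {}
    for cand in candidates:
        matches = [c + cand for c in string.ascii_lowercase
                   if (c + cand).lower() in known]
        if matches:
            hits[cand] = matches
    return hits

# "sycfilt.dll" is the hardcoded string mentioned above.
print(find_prefixed_dlls(["sycfilt.dll", "foo.dll"]))
```

Note that a lookup like this only finds names that exist in the list, so it would not explain a name such as gsycfilt.dll, which does not appear to exist at all.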

asycfilt.dll, according to its table of exports, seems like a pretty useless dll to load, and the next dll the malware attempts to load, gsycfilt.dll, is even stranger since it does not appear to exist. For the latter, the failover function is called if the dll is successfully loaded. My guess is that the existence of the dll is a sign of infection and it tells the malware that the computer is already infected (just guessing). asycfilt.dll does not appear to be used later. My guess once more is that this dll is present only on some versions of Windows, being a means to test the version of the OS.

Next, the malware resolves ReadProcessMemory (after loading kernel32.dll) by deobfuscating wwwProcessMemory with some easy but geeky arithmetic. The malware then proceeds to get the relocation table address and size for asycfilt.dll and checks the size against an interval of possible sizes (first figure below). The address of VirtualAlloc is then resolved. At the top of the middle figure you can see a fancy call to VirtualAlloc with some arithmetic to hide its arguments. Obfuscation aside, the call is

VirtualAlloc(NULL, 1672, MEM_COMMIT, PAGE_EXECUTE_READWRITE)

The last parameter could not make the conclusion more obvious: the malware is about to unpack some shellcode and write it to that region. The loop you see below the VirtualAlloc call does exactly that. The malware loops across dwords, starting at 0x408E44, and applies a bunch of operations. I will not delve into this since the operations are straightforward and add nothing to this post. At last (last picture), the LoadLibrary address is pushed onto the stack and the execution jumps to the beginning of the decoded shellcode.
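When the arguments of a call like the VirtualAlloc one above are hidden behind arithmetic, i find it handy to turn the raw values seen in the debugger back into their symbolic names. The snippet below is only a sketch of that idea: the function names are made up, and only a handful of the constants from the Windows headers are listed.

```python
# Values taken from the Windows SDK headers (only a small subset).
MEM_FLAGS = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_FLAGS = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}

def describe_virtualalloc(address, size, alloc_type, protect):
    """Render raw VirtualAlloc arguments with symbolic flag names."""
    alloc = "|".join(name for bit, name in MEM_FLAGS.items()
                     if alloc_type & bit) or hex(alloc_type)
    prot = PAGE_FLAGS.get(protect, hex(protect))
    addr = "NULL" if address == 0 else hex(address)
    return f"VirtualAlloc({addr}, {size}, {alloc}, {prot})"

def is_rwx(protect):
    """True when the page protection is read + write + execute."""
    return protect == 0x40

# Raw values as they would appear on the stack for the call above.
print(describe_virtualalloc(0, 1672, 0x1000, 0x40))
print(is_rwx(0x40))
```

A region that is writable and executable at the same time is rarely needed by benign code, which is why it stands out so much here.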

At this point, i strongly advise you to take a snapshot of the VM because, if you hit an anti-VM/anti-debugging technique in the next chunk of code, you will have to redo everything. I like to look at IDA while i am debugging so i need to dump the contents of the allocated region. Once you are at the beginning of the allocated region, go to the memory view and double-click the beginning of the section containing the code. Select the bytes -> right-click -> Backup -> Save data to file. Open the .mem file with IDA in Binary mode.

OllyDBG view: dumping malicious section

Second stage

The shellcode uses LoadLibrary to get a handle for kernel32.dll. The name of the library is hardcoded but IDA interprets it as code at first. Below you can see part of the shellcode. On the left is the interpretation of IDA, on the right is “my interpretation”.

The same behaviour is observed a few lines below:

The usage for these chunks becomes clear when they are looped through and passed on to the function below:

IDA view: code for resolution of shellcode dependencies

I have commented everything so you can understand what happens. In order to reach these conclusions, i advise you to open kernel32.dll in IDA Pro and rebase it to the same address as in the debugger. From then on you must leverage IDA (with kernel32.dll + the code above open) as well as OllyDBG to reach my conclusions. Simply put, the malware loops through the names exported by kernel32.dll, computes a checksum for each name (the green rectangle depicts the algorithm) and compares it with the hardcoded checksum it was given. Once it finds a match, it gets the ordinal for that function and uses it to index the exported functions table to get the address of that function. The mov before popa places esi in the stack slot that pusha saved for eax, so the value ends up in eax after popa. Then, the location containing the hardcoded checksum is overwritten with the function address (stored in eax). The malware resolves the addresses for the following functions:

  • LoadLibraryA
  • GetModuleHandleA
  • GetProcAddress
  • VirtualProtect
  • VirtualAlloc
  • VirtualFree
  • CloseHandle
  • CreateToolhelp32Snapshot
  • GetModuleFileName
  • CreateFileA
  • SetFilePointer
  • ReadFile
  • GetCurrentProcessId
  • Module32First
  • Module32Next
  • GetProcessHeap
  • WaitForSingleObject
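
The lookup described above (names -> checksum -> ordinal -> address) can be sketched in a few lines. Bear in mind that the checksum below is a generic rotate-and-add routine i picked for illustration; it is NOT the algorithm used by this sample (that one is in the green rectangle of the picture), and the export data and addresses are made up.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def name_checksum(name):
    """Illustrative rotate-right-13 + add checksum (an assumption,
    not the sample's real algorithm)."""
    acc = 0
    for byte in name.encode("ascii"):
        acc = (ror32(acc, 13) + byte) & 0xFFFFFFFF
    return acc

def resolve_by_checksum(export_names, ordinals, functions, wanted):
    """Walk the name table, and on a checksum match use the ordinal
    at the same index to pick the address from the function table."""
    for index, name in enumerate(export_names):
        if name_checksum(name) == wanted:
            return functions[ordinals[index]]
    return None

# Toy export directory: three parallel tables with made-up addresses.
names = ["CloseHandle", "LoadLibraryA", "VirtualAlloc"]
ordinals = [2, 0, 1]
functions = [0x7C801D7B, 0x7C809AF1, 0x7C809BE7]

wanted = name_checksum("VirtualAlloc")  # what the shellcode would hardcode
print(hex(resolve_by_checksum(names, ordinals, functions, wanted)))
```

The nice thing about this scheme, from the malware author's point of view, is that no function name ever appears in the shellcode, only 32-bit constants.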

Once the resolution is performed, the malware checks the first byte of the ReadFile function against 8Bh (the first byte of mov edi, edi, the usual first instruction of many Windows API functions; a different value could indicate a breakpoint or a hook). Then, it proceeds to resolve some more functions (the names for those are hardcoded). We have an ending similar to the first stage: allocation of RWX memory and a copy of shellcode to that address. The execution then jumps to that code:

IDA view: copying shellcode to allocated memory
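
Going back to the ReadFile check for a moment: the sample only compares the first byte with 8Bh, but the idea behind this kind of check can be sketched as below. The function name and the categories are my own; this is an illustration of what an analyst (or a packer) can read from the first bytes of an API function, not a copy of the sample's logic.

```python
def classify_prologue(first_bytes):
    """Guess what sits at the start of an API function."""
    if first_bytes[:2] == b"\x8b\xff":
        return "mov edi, edi (untouched prologue)"
    if first_bytes[:1] == b"\xcc":
        return "int3 (software breakpoint)"
    if first_bytes[:1] in (b"\xe9", b"\xeb"):
        return "jmp (possible inline hook)"
    return "unknown"

# mov edi, edi / push ebp / mov ebp, esp
print(classify_prologue(b"\x8b\xff\x55\x8b\xec"))
# A debugger that set a breakpoint on the function replaces the first byte.
print(classify_prologue(b"\xcc\xff\x55\x8b\xec"))
```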

We are done here. Time for a VM snapshot.

Third stage

The third stage is a bit long. The new shellcode leverages some of the resolved functions to replace a big chunk of the malware binary loaded at first with the malware to be executed on the last stage. The picture below on the left shows the first chunk of the shellcode obtained from the second stage. As you can see, the shellcode obtains a pointer to a memory location inside the .text section of the loaded binary. Some of the data in that location is copied and decoded. The right picture shows the decoded contents.

Yes, the malware contained a binary inside itself. You can dump that binary using the process i described previously and open it with IDA. It is a well formed binary and IDA will not complain. IDA recognises the start function but nothing else. That start function is part of the last stage of the unpacking process.
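
If you dump a decoded region and want to confirm that it holds an embedded binary, you can look for the classic PE markers: "MZ" at the start, and "PE\0\0" at the offset stored in the e_lfanew field (offset 0x3C of the DOS header). Below is a minimal sketch of that check; the function name and the fake dump are made up for the example.

```python
import struct

def find_embedded_pe(blob):
    """Return every offset in blob that looks like the start of a PE file."""
    offsets = []
    start = 0
    while True:
        mz = blob.find(b"MZ", start)
        if mz == -1:
            break
        start = mz + 1
        if mz + 0x40 > len(blob):
            continue  # not enough room for a DOS header
        # e_lfanew: offset of the PE header, relative to the MZ header.
        (e_lfanew,) = struct.unpack_from("<I", blob, mz + 0x3C)
        pe = mz + e_lfanew
        if blob[pe:pe + 4] == b"PE\0\0":
            offsets.append(mz)
    return offsets

# Fake dump: 16 bytes of padding followed by a minimal MZ/PE skeleton.
header = bytearray(0x80)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\0\0"
dump = b"\x90" * 16 + bytes(header)

print(find_embedded_pe(dump))
```

This only tells you where a candidate starts; carving it out properly still requires reading the section table to know how many bytes to take.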

The shellcode then proceeds to overwrite the necessary sections with the contents of the sections in the embedded binary (only .text is overwritten) and header adjustments are performed to comply with the embedded binary specifications. I will not delve into a thorough explanation of what is happening. The pictures below should suffice. The third stage is mostly overwriting shellcode with shellcode and making sure nothing blows up.

The last jmp is used to jump to the entrypoint of the embedded executable, now part of the old malware .text section. If you have OllyDump, you will be able to dump the process easily once the EIP is at the beginning of the newly-decoded shellcode. IDA will accept the binary without complaining. Whoever wrote all this code deserves the slow clap.

Mind that, even though this seems like a standalone executable, you should not try to run it by itself since it has memory references that were adjusted by the unpacking algorithm. In my case, if the new executable is loaded at an address that is different from 0x00400000, the process will crash. The high degree of dependencies across shellcode sections makes the analysis of this sample pretty cumbersome.

Fourth stage

We have reached the final stage of unpacking. OllyDbg won’t show you meaningful instructions when you do the Right-click->Analysis->Analyse code trick. As such, you will have to use IDA and rebase the code to the same address as the malware in memory. In my case, the malware image base is at 0x00400000. The first chunk of code is just a routine that xors an encoded segment of code with 0x313B6535. Then, the malware calls that decoded segment. See the pictures below:

You could either decode the shellcode with IDA scripting or let the malware do it itself and then dump the process again. The shellcode is now recognisable by OllyDBG if you use the code analysis trick. The shellcode that is executed next is quite interesting. First, it leverages the BeingDebugged flag and NtGlobalFlag (both stored in the PEB) to detect debugger activity. If you have anti-anti-debugging plugins, you should be ok to jump directly to the end. Otherwise, i recommend either patching the checks or setting breakpoints on the lines with calls and setting the eax register to zero (no debugger).
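
For those who prefer decoding offline, the xor step from the first routine of this stage can be reproduced on a dumped buffer with a few lines of Python. This sketch assumes the key is applied dword by dword in little-endian order, which is how a 32-bit xor on x86 memory behaves; the sample bytes are made up.

```python
import struct

KEY = 0x313B6535  # key seen in the decoding routine

def xor_dwords(data, key=KEY):
    """XOR every little-endian dword of data with key."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(data) // 4
    dwords = struct.unpack(f"<{count}I", data)
    return struct.pack(f"<{count}I", *(d ^ key for d in dwords))

# XOR is its own inverse, so the same routine encodes and decodes.
plain = b"\x55\x8b\xec\x83\xec\x10\x53\x56"  # made-up bytes
encoded = xor_dwords(plain)
print(encoded.hex())
print(xor_dwords(encoded) == plain)
```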

IDA view: anti-debugging techniques
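
To show what these two checks look at, here is a small sketch that runs them against a fake PEB buffer. The offsets are the ones of the 32-bit PEB (BeingDebugged at 0x02, NtGlobalFlag at 0x68) and 0x70 is the combination of heap-checking flags that is typically set when a process is created by a debugger. The function name and the buffers are made up; the sample does this in assembly, of course.

```python
import struct

PEB_BEING_DEBUGGED = 0x02  # BYTE BeingDebugged
PEB_NT_GLOBAL_FLAG = 0x68  # DWORD NtGlobalFlag (32-bit PEB)
DEBUG_HEAP_FLAGS = 0x70    # tail check | free check | validate parameters

def debugger_indicators(peb):
    """Return (BeingDebugged set, debug heap flags set) for a PEB dump."""
    being_debugged = peb[PEB_BEING_DEBUGGED] != 0
    (nt_global_flag,) = struct.unpack_from("<I", peb, PEB_NT_GLOBAL_FLAG)
    heap_flags = (nt_global_flag & DEBUG_HEAP_FLAGS) != 0
    return being_debugged, heap_flags

clean_peb = bytes(0x70)
debugged_peb = bytearray(0x70)
debugged_peb[PEB_BEING_DEBUGGED] = 1
struct.pack_into("<I", debugged_peb, PEB_NT_GLOBAL_FLAG, 0x70)

print(debugger_indicators(clean_peb))
print(debugger_indicators(bytes(debugged_peb)))
```

This is also why zeroing eax after each check works: the malware only cares about whether the value it read is zero or not.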

Once you pass this phase you may ask: are we there yet? Well…no. Another chunk of code separates us from the real deal. This malware has a tricky MO. It contains tons of shellcode that is unpacked at runtime and jumps from memory section to memory section. During this last part, the malware starts by unpacking a chunk of shellcode to an allocated section of memory (wow!). Part of that shellcode contains the unpacked payload that we want. It also contains parameters to set up the latter properly. Let us see some IDA, shall we?

IDA view: beginning of shellcode

This is the first part of the unpacked shellcode. ebx still holds fs:[30h] (the pointer to the PEB). As such, the last line appears to mean: fs:[30h]->PEB_LDR_DATA->InLoadOrder->SizeOfImage + fs:[30h]. Drop a comment below if you know what is happening here. Then we have the called function:

From top to bottom, left to right:

1. The running shellcode performs a first decoding of another chunk of shellcode using an xor operation with key 0x313B6535. The decoding is done in place.

2. A second decoding is performed and the contents of the shellcode are copied to an allocated memory section. Another memory allocation is performed which will hold the unpacked payload that we want. The unpacked payload is copied from the previously referred shellcode section.

3 and 4. The unpacked payload memory references are adjusted relative to the beginning of the binary in memory. Something tells me that the binary will be referenced later. Then, the malware jumps to the payload. Before that, it deletes all the code previously described as can be seen below.

OllyDBG view: memory state before jumping to malicious payload

Before jumping to the final shellcode, the shellcode pushes three values onto the stack: a pointer to a string “22222” inside the loaded binary memory space, a pointer to the beginning of the memory section containing part of the old shellcode, and zero. Before jumping or at the beginning of the new shellcode, i recommend you take another snapshot. The real analysis starts now but it will not be part of this article since most of you may be lost or asleep by now.

What’s next?

At this point the malware is unpacked and will execute the main payload. However, it will not make it easy on you. There is no point in using OllyDump to dump the binary because the code is in another section. You will have to dump the code using the procedure i described before. However, you should not dump at the beginning of the code. Take a look at the pictures below:

As you can see, the malware performs the address resolutions for some functions. The resolution appears to be similar to what we have seen before: computing a checksum on the names of the functions exported by the dlls in memory and comparing that checksum with a hardcoded one. Once you run this function, you can dump the memory section containing the shellcode.

If you use just a standalone debugger or if you use IDA with one of its debuggers, you can skip this paragraph. If, like me, you switch between IDA and OllyDBG, or use another combination of disassembler and debugger, you will have to rename the function offset names to something meaningful. The way i see it, you can do this in one of two ways:

  • Go through the offsets table and for each offset go to (in OllyDBG) View->Executable modules. Find out which dll in the list contains that offset, right-click the dll->View names. Search for the offset and you have the name. Change the name given by IDA to that offset. Now rinse and repeat.
  • Find the names of the functions using the procedure from the first option. Create an IDA script that goes through the table and renames the offsets to reflect your findings.
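
As a rough idea of what the second option could look like, the sketch below does not run inside IDA. Instead, it takes the resolved pointer table and an address-to-name mapping (both collected by hand from OllyDBG) and prints IDC statements that you can paste into the body of an IDC script. MakeName is the IDC renaming function in the IDA 6.x series (newer versions call it set_name); the table address, the pointers and the p_ prefix are made up for the example.

```python
def build_idc_renames(table_base, pointers, names_by_address, entry_size=4):
    """Emit one IDC MakeName statement per table slot whose pointer
    has a known function name. Unknown pointers are skipped."""
    lines = []
    for index, target in enumerate(pointers):
        name = names_by_address.get(target)
        if name is None:
            continue
        slot = table_base + index * entry_size
        lines.append(f'MakeName(0x{slot:08X}, "p_{name}");')
    return "\n".join(lines)

# Hypothetical values: where the table lives and what it contains.
table_base = 0x003A1000
pointers = [0x7C801D7B, 0x7C809AF1, 0xDEADBEEF]
names_by_address = {
    0x7C801D7B: "LoadLibraryA",
    0x7C809AF1: "VirtualAlloc",
}

print(build_idc_renames(table_base, pointers, names_by_address))
```

Keeping the name collection separate from the renaming means the same mapping can be reused if you later have to dump the section again at a different address.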

I intend to write another post about the analysis of the unpacked malware but, for now, we end here.

Final thoughts

If you are familiar with WinDBG, you know that i could have suffered less if i had used it. This happens because WinDBG is able to show you OS internal structures and offsets which would have helped a lot in this case. I have not used it because i am more used to ImmunityDBG or OllyDBG but i hope to change that soon.

Some of the unpacking steps could have been skipped and the article would have been shorter. However, it is important to be comfortable with all these layers of unpacking. The analysis of this malware is far from trivial since it has a bunch of shellcode and i may have left you pretty confused. Also, the IDA views i posted with comments are meant to help you understand what is going on, assuming you understand my comments. Feel free to leave some feedback about the overall structure and contents.

Stay safe 😉

Back on track and some word of advice

Hello paranoids

You: What the hell happened to you?

You see, i typically say i am lazy, even though i am not (too much…don’t judge me). I keep doing my stuff, working out and learning as much as i can about security (stopping depresses me, not stopping drains me, decisions decisions). However, i have two major problems: lack of time management capabilities and an everlasting need to try/learn new stuff.

I like to write and teach, which explains why i started this blog. Yet, i like to know what i am talking about before i teach anything and i am never satisfied with the depth of my knowledge (call it low self-esteem). Since i have had nothing meaningful to write about, this blog has been quite empty. Also, a lot has happened since my last post:

  • I finished my thesis (it was about protecting PaaS services against malicious administrators). I can finally call myself an engineer (bow before me minions!).
  • I am working for FireEye as an Information Security Analyst (at Dublin’s SOC)

  • I moved from Portugal to Dublin

The first point is cool and stuff but, unfortunately (for me), not really valued by good IT companies. I may post about it later but, for now, let us focus on the FireEye thing.

If a fellow Portuguese is reading this post, he/she will probably relate when i say that Portugal (at the time of writing) is ruled (in terms of IT employment) by consulting companies and security-related jobs are pretty bad. So, i would basically be working on boring projects, being exploited by consulting companies and complaining about all of this every single day. I searched a lot and sent my CV to many companies: FireEye, Facebook, Google, Amazon, PaloAlto, Fortinet, RSA, BT. In my country, i spammed every single telecommunications provider, bank and supermarket chain. Truth is: hardly any of these entities hire directly (they typically hire consulting companies which, in turn, hire people).

FireEye got me first and i was super excited when i got my first email from them. The whole recruiting process was smooth and handled by extremely nice and professional people (expectations met). At that time, i was slightly sad because i wanted to go to the US (cliché, i know). Calisthenics gives me the freedom to work out outside but i knew Dublin’s weather was bad (confirmed!), which would probably mess up my mood and my willingness to work out. Still, i knew that Portugal was not the way to go for someone passionate about information security. So i did what i never thought i could do: accept the job offer and move to Dublin.

If you live in a country where the economy is plain bad, you are encouraged (pretty early) to leave it and go abroad, to look for companies and people that actually care about you, and can provide you with new and meaningful challenges. However, as an IT expat i must warn you:

“Living alone abroad is no easy task.”

This is my first experience (6 months on 15th August) abroad, which may explain some misconceptions i have and some mistakes i am making but, depending on the type of life you had back at home and your objectives, leaving your country may be worthwhile or a plain waste of time. If you want to leave your homeland just for the experience, and you want comfort and fun, then do not leave it with the objective of saving money: you will be disappointed. If you are on a tight budget and you left your country for money and CV purposes, then be prepared for restrictions: small house, cooking a lot and, if you get a studio (as i did), be ready for some dish-washing madness. Leaving your home country requires a lot of will power and sacrifice. Hope for the best but be prepared for the worst.

But, advice and complaints aside:

What am i up to now?

I have been through lots of phases in terms of learning: pentesting, networks, programming, forensics, etc. Without going into further detail, i can tell you my job requires heavy forensics. Finding evil baby!!!

I like lots of areas and i get bored easily. I used to jump from subject to subject and never got anything done. I have been reading Practical Malware Analysis for a few months and i have to tell you, i am quite happy with what i am learning so far. I have been encouraged by a fellow Portuguese friend to dive into reverse engineering, assembly and malware analysis. I have had experience with assembly in the past. However, i was very afraid that i needed to know lots of low level stuff. Bear in mind that i am just a grasshopper and i may be simplifying stuff. However, i find the book quite easy to follow (both theory and labs) and i have managed to stay focused on this subject so far: no more drifting away, and i am not even bored.

With this i conclude my post. I intend to address malware analysis and reverse engineering on future posts. Until then

Stay safe 😉