With the constant rise of cyber-attacks, understanding the type of threat your organization faces is a vital step towards stopping an attack. When it comes to malicious software threats, classifying malware can help answer the following questions:

  • Was this a targeted attack?
    • If so, who is behind this attack?
  • How can we detect and stop this type of attack?
  • What is the extent of the damage?

Some sandboxes can classify a malware sample, but what happens when the sandbox results come back empty? Analysts can search the web for the sample’s cryptographic hash and hope someone has already stumbled upon this specific malware. But is there a way to measure how similar a binary is to previously analyzed samples? Yes, there is. The following techniques help compare a binary against previously classified samples:

  • SSDeep
  • Imphash
  • Section hash
  • Yara rules

Combining these techniques gives the best results. To demonstrate them, we retrieved the sample with the following SHA-256 hash from VirusTotal:

d18d211cf75fbc048d785af92b76a1aa7a01e381313b1a5e66e9cf564cbe78d4

Figure 1: VirusTotal results for the listed hash

For future reference, this sample will be called Sample 1. We will first look at how these tools and techniques work, and then compare Sample 1 against previously analyzed samples to classify it. The DETECTION section in VirusTotal does not tell us much about the sample’s malware family. However, the DETAILS section lists the imphash, fuzzy hash, and section hashes, which is a great start.

Figure 2: VirusTotal Details section for the listed hash

SSDEEP

SSDeep is a tool that generates context triggered piecewise hashes (CTPH), also called fuzzy hashes. CTPH can match inputs that have homologies, that is, inputs containing sequences of identical bytes in the same order, even though the bytes between those sequences may differ in both content and length. The ssdeep tool can be installed locally on Windows and Linux. For installation instructions, see the ssdeep project site listed in the References. If you have a directory full of previously analyzed samples, you can use the following commands:

$ ssdeep * > fuzzy_hashes.txt

$ ssdeep -m fuzzy_hashes.txt malware.exe

The first command writes the fuzzy hashes of all files in the working directory to a file named fuzzy_hashes.txt. The second command uses matching mode (-m) to compare the fuzzy hash of malware.exe against that list of previously analyzed samples. Running ssdeep -h lists all available options.
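To make the idea of a context triggered piecewise hash more concrete, here is a toy sketch in Python. It is not ssdeep’s actual algorithm and its output is not compatible with ssdeep: the trigger (the sum of the last 7 bytes), the piece hash (one character derived from MD5), the scoring, and the names toy_ctph and similarity are all simplifications assumed for illustration.

```python
import hashlib
from difflib import SequenceMatcher

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7  # bytes of "context" examined at each position (illustrative choice)


def toy_ctph(data: bytes, block_size: int = 64) -> str:
    """Toy context triggered piecewise hash (NOT ssdeep-compatible)."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        window = data[max(0, i - WINDOW + 1): i + 1]
        # The trigger depends only on the last few bytes, so piece
        # boundaries can re-align after an insertion or deletion.
        if sum(window) % block_size == block_size - 1:
            piece = data[piece_start: i + 1]
            signature.append(B64[hashlib.md5(piece).digest()[0] % 64])
            piece_start = i + 1
    if piece_start < len(data):  # hash whatever is left after the last trigger
        tail = data[piece_start:]
        signature.append(B64[hashlib.md5(tail).digest()[0] % 64])
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> int:
    """Score two signatures from 0 to 100 (ssdeep scores them differently)."""
    return round(100 * SequenceMatcher(None, sig_a, sig_b).ratio())
```

Because each signature character covers only one piece of the input, a change confined to one region alters only the characters for the affected pieces. For example, changing just the last byte of a buffer leaves every signature character except the final one unchanged, which is why two slightly different builds of the same malware can still score as highly similar.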

IMPHASH

The imphash (import hash) is calculated from the library and API names an executable imports and the order in which they appear. A quick way to obtain the imphash is to load the binary into PeStudio, as seen below.

Figure 3: PEStudio showing the imphash of a suspect binary
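The calculation itself can be sketched in a few lines of Python. This is a minimal sketch that follows the commonly described imphash scheme (lowercase names, library extension dropped, entries joined with commas, then MD5). It assumes the imports have already been extracted from the PE file in import-table order, and it skips details such as resolving imports by ordinal. The function name imphash_from_imports and the sample import lists are hypothetical.

```python
import hashlib


def imphash_from_imports(imports):
    """Compute an imphash-style MD5 from (library, function) pairs.

    `imports` must already be in the order the entries appear in the
    PE import table; parsing the PE file itself is out of scope here.
    """
    entries = []
    for library, function in imports:
        lib = library.lower()
        # Drop common library extensions so "KERNEL32.dll" -> "kernel32"
        for ext in (".dll", ".ocx", ".sys"):
            if lib.endswith(ext):
                lib = lib[: -len(ext)]
                break
        entries.append(f"{lib}.{function.lower()}")
    return hashlib.md5(",".join(entries).encode()).hexdigest()


# Hypothetical import lists: same functions, different order
first = [("KERNEL32.dll", "CreateFileA"), ("USER32.dll", "MessageBoxA")]
second = list(reversed(first))
print(imphash_from_imports(first) == imphash_from_imports(second))  # False
```

Since the order of the imports feeds into the hash, two binaries share an imphash only when they import the same functions in the same order, which is what makes it useful for grouping samples produced by the same toolchain.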

Section Hash

Scrolling down the DETAILS section of the VirusTotal page, we can see a hash listed for each section. Like the imphash, section hashes can help identify similar samples.

Figure 4: VirusTotal showing the Sections hash of Sample 1
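As a rough illustration of how section hashes are compared, the Python sketch below hashes each section separately and reports which sections two samples have in common. It assumes an MD5 over each section’s raw bytes and that the section contents have already been extracted from the binaries. The function names and the section bytes are made up for the example.

```python
import hashlib


def section_hashes(sections):
    """Map each section name to the MD5 of its raw bytes."""
    return {name: hashlib.md5(raw).hexdigest() for name, raw in sections.items()}


def shared_sections(hashes_a, hashes_b):
    """Return the names of sections in A whose hash also appears in B."""
    known = set(hashes_b.values())
    return sorted(name for name, digest in hashes_a.items() if digest in known)


# Hypothetical section contents for two samples
sample_a = section_hashes({".text": b"\x55\x8b\xec", ".data": b"config-A"})
sample_b = section_hashes({".text": b"\x55\x8b\xec", ".data": b"config-B"})
print(shared_sections(sample_a, sample_b))  # ['.text']
```

Hashing per section means that a sample with, say, a changed configuration block can still be linked to a known sample through its unchanged code section.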

Comparing and Classifying Sample 1

Searching VirusTotal for the fuzzy hash returns 20+ samples. The similarity column shows that the first result has 100% similarity, which makes sense because it is our initial sample. The second result has a similarity of 97.59%. The higher the number, the more similar the files.

Figure 5: VirusTotal showing the fuzzy hash search results

Comparing this second sample (Sample 2) with Sample 1 shows how similar they are. Both samples have the same imphash, 1656aa7aa811a8db1ecbc8983c084712. This means they import the same functions in the same order, which suggests they were generated by the same builder kit.

Figure 6: VirusTotal showing the Imphash of Sample 1

Figure 7: VirusTotal showing the Imphash of Sample 2

Four of the five section hashes of Sample 2 also match those of Sample 1.

Figure 8: VirusTotal listing the Section hashes of Sample 2

Figure 9: VirusTotal listing the Section hashes of Sample 1

At the time of this writing, a web search for Sample 1 came back empty. However, a Google search for Sample 2 (73849ce478a894f10589cc31aece7dcb8a39c1c43a4a5c401b2dae86b53bb9c7, listed in VirusTotal during our ssdeep search with a 97.59% similarity) led to the following tweet from Vitali Kremez, a well-known security researcher.

Figure 10: Vitali Kremez retweet

The screenshot shows that this sample was classified as Dridex. When comparing and classifying malware samples, it’s important for the analyst to correlate findings across various sources and tools. This brings us to our last topic: Yara rules.

Yara rules

Yara is used by security researchers as a powerful malware classification and identification tool. Once yara is installed on the analyst’s system, rules can be created based on text or binary information contained in the malware sample. A rule consists of three sections:

  • Rule name – Name to identify the rule
  • Strings – This section contains text or binary information. There are several types of strings the analyst can look for:
    • Text strings, with modifiers: nocase, fullword, wide, and ascii.
    • Hexadecimal, in combination with wild-cards, jumps, and alternatives.
    • Regular expressions, with the same modifiers as text strings.
  • Condition – This section contains a Boolean expression, which will specify the condition for the rule to match.

There are many more advanced conditions you can use, but they are outside the scope of this post. More information on the use of YARA can be found in the YARA documentation.

rule Example
{
    strings:
        $a = "text1"
        $b = "text2"
    condition:
        ($a or $b)
}

To apply what we have learned, let’s create a Yara rule and scan Sample 1 with it. From Vitali’s tweet, we know the 2nd sample has the following indicators of compromise (IOCs):

  • Dridex payload URL: hXXp://yumicha.xyz/lvkahex.exe
  • Botnet ID: 40400
  • Dridex C2s:
    • 213.136.94.177:443
    • 217.20.166.178:4664
    • 37.205.9.252:8443
    • 70.39.251.94:3889

To create this rule, the known Dridex C2s will be used.

rule dridex
{
    strings:
        $a = "213.136.94.177"
        $b = "217.20.166.178"
        $c = "37.205.9.252"
        $d = "70.39.251.94"
    condition:
        ($a or $b) and ($c or $d)
}

Running this rule in the terminal against a memory dump from Sample 1 results in a match. Running yara with the -s option prints the matched strings. As the screenshot shows, the newly created rule matched all of the Dridex C2s from the 2nd sample.

Figure 11: Terminal output of Dridex yara rule scan against Sample 1
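To see what the rule’s condition actually requires, the logic can be mirrored in plain Python. This is only a sketch of the Boolean expression, not a replacement for Yara: it uses a simple byte search to approximate plain ASCII text strings, and the function name dridex_condition and the test buffers are hypothetical.

```python
C2_ADDRESSES = {
    "a": b"213.136.94.177",
    "b": b"217.20.166.178",
    "c": b"37.205.9.252",
    "d": b"70.39.251.94",
}


def dridex_condition(buffer: bytes) -> bool:
    """Mirror the rule's condition: ($a or $b) and ($c or $d)."""
    hit = {name: value in buffer for name, value in C2_ADDRESSES.items()}
    return (hit["a"] or hit["b"]) and (hit["c"] or hit["d"])


# Hypothetical buffers standing in for memory dumps
print(dridex_condition(b"... 213.136.94.177 ... 70.39.251.94 ..."))  # True
print(dridex_condition(b"... 213.136.94.177 ... 217.20.166.178 ..."))  # False
```

The second buffer does not match because the condition needs at least one address from each pair. Requiring hits from both pairs makes the rule less likely to fire on a dump that happens to contain a single one of these addresses, at the cost of missing a variant that only carries the first two.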

As malware authors often make small changes to their samples to thwart analysis, the techniques and tools discussed in this blog will help in classifying and comparing new malware samples. It is a good practice to generate the fuzzy hash, section hash, and imphash of all malware samples you analyze and store these hashes in a repository. Storing these hashes will allow the analyst to compare a new sample and determine similarity.
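A repository like this does not need to be elaborate. The Python sketch below keeps one record per analyzed sample and reports which stored samples share an imphash or section hashes with a new one. The record layout, the find_related function, and the section hash placeholders are assumptions made for illustration. Fuzzy hash comparison is left out because it requires the ssdeep tool itself.

```python
from dataclasses import dataclass, field


@dataclass
class SampleRecord:
    sha256: str
    family: str
    imphash: str
    section_hashes: set = field(default_factory=set)


def find_related(new, repository):
    """Return (record, reasons) pairs for stored samples sharing indicators."""
    related = []
    for record in repository:
        reasons = []
        if record.imphash == new.imphash:
            reasons.append("imphash")
        shared = len(record.section_hashes & new.section_hashes)
        if shared:
            reasons.append(f"{shared} section hash(es)")
        if reasons:
            related.append((record, reasons))
    return related


# Section hash values below are placeholders, not real hashes
known = SampleRecord(
    "73849ce478a894f10589cc31aece7dcb8a39c1c43a4a5c401b2dae86b53bb9c7",
    "Dridex", "1656aa7aa811a8db1ecbc8983c084712",
    {"s1", "s2", "s3", "s4", "s5"},
)
unknown = SampleRecord(
    "d18d211cf75fbc048d785af92b76a1aa7a01e381313b1a5e66e9cf564cbe78d4",
    "unknown", "1656aa7aa811a8db1ecbc8983c084712",
    {"s1", "s2", "s3", "s4", "other"},
)
for record, reasons in find_related(unknown, [known]):
    print(record.family, reasons)  # Dridex ['imphash', '4 section hash(es)']
```

A match on several independent indicators is a lead for the analyst to verify, not a classification on its own.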

References

https://ssdeep-project.github.io/ssdeep/index.html