Lab 02.B — Hash Verification & Threat Intelligence

Date: 30 April 2026
Author: Emilio Mardones (Ofendor)
Status: ✅ Completed
Module: Basic Static Analysis — Sikorski Ch. 1, Module 102
Related: Lab 02a — Sample Acquisition | Lab 01 Setup Notes

1. Overview

This section showcases my personal malware analysis workflow for hash verification and threat intelligence research, applied to two samples from Practical Malware Analysis: Lab01-01.exe and Lab01-01.dll. Both files were confirmed malicious and tagged as packed with Armadillo, a known file packer. The executable is classified as a Trojan that persists on the infected system by combining DLL hijacking with a typosquatted file name. It also uses several anti-analysis evasion methods that I’ll explain during this lab.

Note: This first stage of the analysis is entirely passive and only queries data from external databases. Deployment will be covered in the next repo.

2. First Analytical step: Hashing

Before I ran a single analysis tool, the first thing I did was generate cryptographic hashes for both files. This might seem like a superficial step, but it is foundational to everything that follows according to my research and readings.

A cryptographic hash function takes a file as input and produces a fixed-length string, also known as a fingerprint, that is for practical purposes unique to that exact sequence of bytes. This is important to remember, because these fingerprints are what analysts and platforms use today to identify a sample and match it against known malware.

As Sikorski and Honig (2012) explain in Chapter 1, hashing serves two essential purposes in malware analysis: sample identification and integrity verification. If even one byte of the file changes, the hash changes completely. This means that once I record the hash, I can prove that I analysed the exact same file and not a corrupted copy or modified variant with the same name.
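The "one byte changes, the hash changes completely" property is easy to see for yourself. The following is a minimal sketch, assuming Python 3 is available on the analysis VM; the 64-byte buffer is a hypothetical stand-in and not one of the lab samples.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex digests differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical 64-byte stand-in for a file (not a real sample).
original = b"MZ" + bytes(62)

# Copy the buffer and flip a single bit in one byte.
modified = bytearray(original)
modified[10] ^= 0x01

h1 = sha256_hex(original)
h2 = sha256_hex(bytes(modified))

print(h1)
print(h2)
print("differing hex characters:", differing_hex_chars(h1, h2), "of", len(h1))
```

Running it prints two digests that share almost nothing, which is why a recorded hash is enough to prove that the file on disk is byte-for-byte the one that was analysed.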

The literature treats this as part of the chain of custody, which documents who handled evidence, when, and in what state. Kleymenov and Thabet (2019) describe this as non-negotiable in any SOC analysis workflow whose findings may be used in a formal context, such as further research or threat intelligence databases that support active response.

There is also a practical threat intelligence dimension. VirusTotal, MalwareBazaar, and similar platforms index samples by hash. The MD5 or SHA256 value I generate locally is my search key into global malware databases. Without hashing first, I cannot carry out the steps that follow.

3. MD5 and SHA256 Hashes

To start the analysis, I moved into the directory that holds the Chapter_1L lab items on the REMnux VM (192.168.100.1).

PowerShell pre-installation verification results

Use this as a guideline. Note: the file Hashes.csv will be used later on, so take note of its location now.

Output confirmed:

total 224K
-rw-rw-r-- 1 remnux remnux 160K Dec 19  2010 Lab01-01.dll
-rw-rw-r-- 1 remnux remnux  16K Jan  8  2012 Lab01-01.exe
-rw-rw-r-- 1 remnux remnux 3.0K Jan 19  2011 Lab01-02.exe
-rw-rw-r-- 1 remnux remnux 4.7K Mar 26  2010 Lab01-03.exe
-rw-rw-r-- 1 remnux remnux  36K Jul  5  2011 Lab01-04.exe

Two file size observations are worth noting. Lab01-01.exe is only 16 KB, unusually small for a Windows executable, while Lab01-01.dll is 160 KB, ten times larger. Muralidharan et al. (2022) specifically identify small file size, a small number of Portable Executable (PE) sections, and a small number of import functions as heuristic indicators of packed executables, all three of which I expect to confirm when I run PE header analysis on the Static VM. The size disparity between the two files also makes intuitive sense once you understand the relationship between them: the executable is a dropper/launcher packed down to its minimal stub, and the DLL carries the actual payload. I will come back to this when I discuss the mechanism.

Hashing Commands

Output:

bb7425b82141a1c0f7d60e5106676bb1  Lab01-01.exe
290934c61de9176ad682ffdd65f0a669  Lab01-01.dll

58898bd42c5bd3bf9b1389f0eee5b39cd59180e8370eb9ea838a0b327bd6fe47  Lab01-01.exe
f50e42c8dfaab649bde0398867e930b86c2a599e8db83b8260393082268f2dba  Lab01-01.dll

I used both MD5 and SHA256 instead of sticking to just one. MD5 is quick and still widely supported across threat intel platforms, so it is useful for fast lookups. However, MD5 is known to be vulnerable to collisions, meaning two different files can be crafted to share the same digest. SHA256 has no known practical collisions, so it is the better option when accuracy matters.
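For anyone who prefers a script to the command-line tools, here is a minimal Python sketch that produces both digests in one pass and prints them in the same `hash  filename` layout as the output above. The function name `file_digests` and the chunk size are my own choices for illustration, and the script assumes it is run from the directory that holds the samples.

```python
import hashlib
from pathlib import Path


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha256_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


if __name__ == "__main__":
    for name in ("Lab01-01.exe", "Lab01-01.dll"):
        if Path(name).is_file():
            md5_value, sha256_value = file_digests(name)
            print(f"{md5_value}  {name}")
            print(f"{sha256_value}  {name}")
```

The files are only read, never executed, so this is as safe as running the hashing utilities directly.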

For sharing indicators or documenting findings, there are other hashing algorithms I could’ve used, but for this lab I kept it simple and focused on these two.

4. Validating Hashes.csv output

As mentioned in section 3, the lab repository includes a Hashes.csv file containing the expected MD5/SHA256 values for every lab binary:

Relevant output from Hashes.csv:

1,"BB7425B82141A1C0F7D60E5106676BB1(EXE) 290934C61DE9176AD682FFDD65F0A669 (DLL)"

Result: PASS. Both samples match the repository’s expected values exactly. The files were not corrupted during download or extraction, so I can move on to the next step.
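The comparison can also be automated. This is a rough sketch, assuming the rows in Hashes.csv look like the excerpt above (uppercase MD5 values embedded in free text); the helper names are hypothetical. It pulls every 32-character hex token out of a row and compares case-insensitively, since my local output is lowercase and the CSV is uppercase.

```python
import re

# An MD5 digest is exactly 32 hexadecimal characters.
MD5_PATTERN = re.compile(r"\b[0-9A-Fa-f]{32}\b")


def expected_md5s(row_text):
    """Extract all MD5-looking tokens from a CSV row, lowercased."""
    return {token.lower() for token in MD5_PATTERN.findall(row_text)}


def verify(computed, row_text):
    """Map each file name to True if its MD5 appears in the row."""
    expected = expected_md5s(row_text)
    return {name: digest.lower() in expected for name, digest in computed.items()}


# Row copied from the Hashes.csv excerpt above.
row = '1,"BB7425B82141A1C0F7D60E5106676BB1(EXE) 290934C61DE9176AD682FFDD65F0A669 (DLL)"'

# MD5 values generated locally in section 3.
computed = {
    "Lab01-01.exe": "bb7425b82141a1c0f7d60e5106676bb1",
    "Lab01-01.dll": "290934c61de9176ad682ffdd65f0a669",
}

for name, ok in verify(computed, row).items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```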

5. Threat Intelligence

As a SOC Analyst you need to understand the concept behind Threat Intelligence and its importance. Threat actors can be persistent, and they use a variety of tactics and techniques to compromise systems. A single piece of malware can deploy several procedures at the same time. Given the risk these threats present, NIST SP 800-150 (2016) points out that it is important for organisations to share cyber threat information and use it to improve their security posture.

Sharing communities help professionals leverage knowledge, experience and capabilities to gain a more complete understanding of the threats their organisations may face.

Keep in mind that hash lookups can return misleading results, because weaker algorithms such as MD5 are vulnerable to collisions. That is why it is essential to cross-check with a stronger algorithm such as SHA256, so you get useful and accurate information.

Note: To repeat, this is passive intelligence gathering only. I am not submitting the files, only querying by hash, so the files never leave REMnux during this step. All lookups were done from the VM by enabling Adapter 1 with a NAT connection. Keep the shared clipboard and bidirectional features disabled at all times.
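I did my lookups through the web interface, but the same passive query can be scripted. The sketch below assumes the VirusTotal API v3 file endpoint (`/api/v3/files/{hash}` with an `x-apikey` header) and a personal API key; the helper names are hypothetical, and the sample report is an illustrative fragment, not a real response. Only the hash is sent, never the file.

```python
import json
import urllib.request

VT_FILES_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_lookup_request(file_hash, api_key):
    """Build (but do not send) a GET request that looks a sample up by hash."""
    return urllib.request.Request(
        VT_FILES_URL.format(file_hash.lower()),
        headers={"x-apikey": api_key},
        method="GET",
    )


def detection_ratio(report):
    """Return (flagged, total) from a report's last_analysis_stats block.

    Assumption: 'total' counts only engines that returned a verdict, so it
    may differ slightly from the denominator shown in the web interface.
    """
    stats = report["data"]["attributes"]["last_analysis_stats"]
    verdict_keys = ("malicious", "suspicious", "undetected", "harmless")
    return stats.get("malicious", 0), sum(stats.get(k, 0) for k in verdict_keys)


def lookup(file_hash, api_key, timeout=30):
    """Send the request and return the decoded JSON report (needs network)."""
    request = build_lookup_request(file_hash, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Illustrative fragment only: the split between keys is assumed, chosen to
# match the 56/71 figure recorded for Lab01-01.exe in section 5.1.
sample_report = {
    "data": {"attributes": {"last_analysis_stats": {"malicious": 56, "undetected": 15}}}
}
print(detection_ratio(sample_report))
```

Calling `lookup()` is deliberately left out of the example so that nothing touches the network unless you decide to run it with your own key.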

5.1 VirusTotal: Lab01-01.exe Analysis

Summary:

SHA256:           58898bd42c5bd3bf9b1389f0eee5b39cd59180e8370eb9ea838a0b327bd6fe47
File name:        Lab01-01.exe
Size:             16.00 KB
Detection:        56/71 security vendors flagged as malicious
Community score:  +18 (confirmed malicious)
Popular label:    trojan.ulise/aenjaris
Threat category:  Trojan
Family labels:    ulise, aenjaris, kkbov
Last analysis:    8 hours ago

Other Indicators of Compromise (IOC) I’ve considered relevant:

Tag	Meaning
`peexe`	PE executable format, targeting Windows systems
`armadillo`	Packed with Armadillo file packer
`via-tor`	Attempts communication through Tor network
`checks-user-input`	Waits for mouse/keyboard activity before executing
`detect-debug-environment`	Actively checks for debuggers and analysis tools
`checks-disk-space`	Queries available disk space as sandbox detection
`long-sleeps`	Deliberately delays execution to outlast sandbox windows

This analysis showed more significant and sophisticated mechanisms than those I was expecting from a Chapter 1 training sample.

According to the analysis, this is a file infector and system hijacker that combines DLL hijacking with typosquatting. It copies a malicious payload (Lab01-01.dll) to %WINDIR%\System32\kerne132.dll, mimicking the legitimate kernel32.dll (note the deliberate misspelling in the name, very smart, right?). The malware then recursively scans the C: drive for executable files and modifies their PE headers, specifically patching the Import Address Table (IAT) to replace references to kernel32.dll with the malicious kerne132.dll. This ensures the malicious library is loaded whenever infected applications are executed.

Note: kernel32.dll is a core Windows DLL that manages essential system tasks such as memory management, input/output operations, and process creation, allowing applications to interact with the OS (Chen R., 2023 - Microsoft Blogs).

5.2 VirusTotal: Lab01-01.dll Analysis

Summary:

SHA256:           f50e42c8dfaab649bde0398867e930b86c2a599e8db83b8260393082268f2dba
File name:        Lab01-01.dll
Size:             160.00 KB
Detection:        38/72 security vendors flagged as malicious, CrowdStrike with a confident 100%
Community score:  -111 (strongly confirmed malicious)
Popular label:    trojan.skeeyah/genericrxfo
Threat category:  Trojan
Family labels:    skeeyah, genericrxfo, r002c0phf20
Last analysis:    10 days ago (the last time someone queried for this hash)

The AliCloud classification of Backdoor:Win/AgentILW.BOM is the most curious finding of this analysis. While other vendors classify it as a trojan, the backdoor designation suggests that this DLL provides remote access or command execution capabilities to the attackers once it has been loaded on the victim system, communicating via Tor as described in the previous .exe analysis.

I noticed a disparity between the detection rate for the DLL and the EXE (38/72 vs 56/71), which I attribute to the Armadillo packing the authors used. What does this mean? The DLL’s malicious code is compressed at rest by Armadillo, which makes signature-based detection harder. Šťastná and Tomášek (2016) demonstrated precisely this problem: packed files can defeat signature-based detection systems even when the underlying malicious code is well known, because the packer transforms the syntactic form of the binary without changing its semantic behaviour.
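Because the two samples were scanned by a different number of engines (71 vs 72), the raw counts are easier to compare as percentages. A quick worked calculation using only the figures recorded above:

```python
def detection_rate(flagged, total):
    """Return the share of engines that flagged a sample, as a percentage."""
    return 100.0 * flagged / total


exe_rate = detection_rate(56, 71)  # Lab01-01.exe
dll_rate = detection_rate(38, 72)  # Lab01-01.dll

print(f"EXE: {exe_rate:.1f}%")   # EXE: 78.9%
print(f"DLL: {dll_rate:.1f}%")   # DLL: 52.8%
print(f"gap: {exe_rate - dll_rate:.1f} percentage points")  # gap: 26.1 percentage points
```

So roughly one engine in two missed the DLL, while only about one in five missed the EXE.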

5.3 Anti-Analysis Techniques

Before this lab, I knew anti-analysis techniques existed, but seeing them as tags on a real sample made them much clearer to me. I grouped them into five layers, all of which are present in the Lab01-01.exe file. This was a stepping stone for me because now I can understand the strategies attackers might implement. Remember, this is my personal approach; professional environments have their own triage strategies to understand malware behaviour:

Layer 1 — Packing (Armadillo)
  Code is encrypted at rest
  Signature-based detection fails
  Static analyst sees the packer stub, not the payload

Layer 2 — Debugger detection
  Checks for analysis tools at runtime
  If debugger found → malware exits or behaves differently

Layer 3 — Behaviour during user interaction
  Waits for real mouse/keyboard activity
  Automated sandboxes often produce no user input
  Malware stays dormant until it detects a human moving the mouse for example

Layer 4 — Disk space check
  Real systems have hundreds of GB used
  Sandboxes often have minimal disk usage
  Low disk space → likely a sandbox → stay dormant (yes, malware authors are not stupid)

Layer 5 — Long sleeps
  Deliberately delays execution
  Most automated sandboxes observe for 2-5 minutes
  Malware waits longer than the observation window, which can lead inexperienced analysts to dismiss the alert as a false positive.

Cucci (2024) notes that modern malware rarely relies on a single evasion method. The layered approach, where each technique independently increases the probability of evading detection, is characteristic of professionally developed malware rather than script-kiddie tools. It also helps you understand your opponent and the level of competency you need to mitigate their attacks. Now you know why Threat Intelligence gathering is so important during malware analysis.
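From the analyst's side, the layers above double as a checklist for how "real" an analysis environment looks. The sketch below is my own simplified model, not the sample's actual logic: the `SandboxProfile` fields and both thresholds (100 GB of disk, a 600-second sleep) are hypothetical placeholders. It lists which runtime layers would plausibly keep a sample dormant in a given environment.

```python
from dataclasses import dataclass


@dataclass
class SandboxProfile:
    """Hypothetical description of an analysis environment."""
    disk_total_gb: float
    observation_seconds: int
    simulates_user_input: bool
    debugger_attached: bool


def dormancy_triggers(profile, min_disk_gb=100.0, sleep_seconds=600):
    """Return the evasion layers this environment would likely trip.

    Both thresholds are assumptions for illustration, not measured values.
    """
    triggers = []
    if profile.debugger_attached:
        triggers.append("debugger detection")
    if not profile.simulates_user_input:
        triggers.append("user interaction check")
    if profile.disk_total_gb < min_disk_gb:
        triggers.append("disk space check")
    if profile.observation_seconds < sleep_seconds:
        triggers.append("long sleeps")
    return triggers


# A typical automated sandbox as described above: small disk, short run, no input.
default_sandbox = SandboxProfile(
    disk_total_gb=40,
    observation_seconds=180,
    simulates_user_input=False,
    debugger_attached=False,
)
print(dormancy_triggers(default_sandbox))
```

An empty list does not guarantee the sample will detonate, but a long one is a good hint that the environment needs hardening before the dynamic phase.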

6. Other sources: JoeSandbox and TrendMicro

Beyond VirusTotal, I queried two additional platforms to cross-reference findings.

JoeSandbox Cloud

Summary:

Platform:     JoeSandbox Cloud Basic
Sample:       Lab01-01.exe
Analysis ID:  344102
MD5:          bb7425b82141a1c...
Score:        56 / 100
Verdict:      MALICIOUS
Confidence:   100%
Whitelisted:  false

→ Creates a DirectInput object (often used for capturing keystrokes)
→ Program does not show much activity if IDLE so remains dormant
   (anti-sandbox: waiting for user interaction)
→ Uses 32-bit PE files (consistent with the kerne132.dll mechanism)

The DirectInput object creation is interesting because DirectInput is a Windows API component used by games and applications to capture keystrokes and controller input. This suggests keylogging capability, which aligns with the backdoor classification seen in the DLL analysis on VirusTotal.

The IDLE signature supports the user interaction check identified in VirusTotal’s tags: the sample was observed sitting dormant during the automated analysis, waiting for input that the sandbox did not provide.

TrendMicro Threat Encyclopedia

Official name:  Trojan.MSIL.AENJARIS.A
Aliases:
  → Trojan.MSIL.Agent (detected by IKARUS)
  → Trojan:Win32/Aenjaris.ROC!MTB (MICROSOFT defender)
Platform:       Windows
Date:           January 29, 2021
Risk ratings:   Damage potential — MEDIUM
                Distribution potential — LOW-MEDIUM

The MSIL designation (Microsoft Intermediate Language) in the TrendMicro name is worth noting because it suggests that at least part of the malware was compiled from a .NET language such as C#. I assume this will help with reverse engineering, since .NET assemblies can often be decompiled back to near-source-code quality. This is something I will explore later in this lab.

7. Key Concepts

During this Lab I encountered several terms that were worth researching to understand malware capabilities.

Malware Packers

A packer is a tool that compresses, encrypts, or otherwise transforms an executable so that its original code is not visible in its stored form. When the packed executable runs, the packer’s stub code decompresses or decrypts the original code into memory and then transfers execution to it.

Šťastná and Tomášek (2016) describe packing as serving two purposes in malware: reducing file size (the original commercial use case) and obstructing reverse engineering, signature-based detection, and static analysis. This creates a detection challenge for analysts because both legitimate software and malware use packers, so the mere presence of a packer does not confirm malicious intent and a packer-based alert could turn out to be a false positive. The same study found that 81 out of 100 randomly selected harmless software samples were packed by at least one known packing tool, demonstrating that packer detection alone is an unreliable malware indicator. Looking back at the VirusTotal results, I assume that the 15 of 71 vendors that did not flag Lab01-01.exe were likely defeated by the Armadillo layer. As a SOC analyst you should never overlook these warnings; treat them as a cue for further analysis.

Muralidharan et al. (2022) analysed 2,000 randomly selected PE files from VirusTotal between 2010 and 2021 and show that, on average, 21.35% of all PE files across that period were packed. This is critical to know because the use of packing has increased in recent years as machine learning-based detection mechanisms have become more effective against non-packed malware, pushing threat actors back toward packing as a primary evasion strategy.

This has a direct connection to what I observed in this sample. The 16 KB file size of Lab01-01.exe is itself a packing indicator. Muralidharan et al. (2022) specifically identify small file size, a small number of PE sections, and a small number of import functions as indicators of packed executables, all artefacts of the Armadillo packer rather than the actual malicious code.
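Two of these three indicators can be checked without any special tooling. The sketch below reads the section count straight from the PE headers (the `e_lfanew` pointer at offset 0x3C, then the COFF `NumberOfSections` field six bytes after the `PE\0\0` signature) and applies the three heuristics. The numeric thresholds are my own placeholders, not values taken from Muralidharan et al., and the import count is assumed to come from another tool.

```python
import struct


def pe_section_count(data: bytes) -> int:
    """Read NumberOfSections from the COFF header of a PE file image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) points at the 'PE\0\0' signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    if len(data) < e_lfanew + 8:
        raise ValueError("truncated COFF header")
    # COFF header: Machine (2 bytes), then NumberOfSections (2 bytes).
    (sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    return sections


def packing_hints(size_bytes, sections, imports,
                  max_size=32 * 1024, max_sections=3, max_imports=10):
    """Return which of the three heuristics fire (thresholds are assumed)."""
    hints = []
    if size_bytes <= max_size:
        hints.append("small file size")
    if sections <= max_sections:
        hints.append("few PE sections")
    if imports <= max_imports:
        hints.append("few imports")
    return hints
```

For example, `packing_hints(16 * 1024, 3, 5)` would fire all three heuristics, while the section and import counts for the real sample still need to be confirmed on the Static VM.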

For the analyst, the immediate consequence is that static analysis of a packed binary reveals the packer stub rather than the malicious code. The actual payload only becomes visible after the sample runs and unpacks itself in memory, which is why dynamic analysis and memory forensics are necessary later in the workflow. Muralidharan et al. (2022) explain that most detection methods in the literature fail entirely when a packer implements runtime evasion routines, which, as the VirusTotal tags suggest, this Armadillo-packed sample does.

Armadillo has been widely adopted by malware authors precisely because of its effectiveness at defeating analysis. Ugarte-Pedrero et al. (2016) classify it at the highest level of packing complexity, describing it as a packer that only partially reveals its code at any one time. Unlike simple packers that unpack the entire payload into memory at once, Armadillo decrypts only the specific region of code that is about to execute, then re-encrypts it when execution moves to a different region. This means that even during dynamic analysis, a single execution run may not reveal the complete unpacked code, only the paths that were actually executed during that specific run.

This is particularly effective when combined with the anti-sandbox techniques: if the malware detects a sandbox environment and stays dormant on purpose, Armadillo never decrypts the payload sections that require user interaction to trigger, leaving those portions of the code completely hidden from the analyst. This is a form of obfuscation.

In practical terms, this means that when I reach the dynamic analysis phase, I will need to account for the Armadillo layer before I can meaningfully analyse the underlying payload, and I may need to try other techniques until I have the information needed to understand this malware.

Typosquatting in Malware?

Yes. Typosquatting refers to registering domain names that closely resemble legitimate ones in order to capture traffic from users who mistype an address, for example gooogle.com instead of google.com. Kaspersky (n.d.) defines it as a form of cybersquatting that exploits predictable human typing errors to redirect victims to malicious sites.

In malware, the same principle is applied at the file system level. Rather than mimicking a domain name, the malware mimics a critical Windows system file name. In this specific case:

Legitimate file: C:\Windows\System32\kernel32.dll
Malicious file:  C:\Windows\System32\kerne132.dll
                                           ↑
                              Letter 'l' replaced with number '1'

A casual glance will not reveal the difference; it takes careful inspection. The malware exploits this visual ambiguity to place its malicious DLL in the same system directory as the legitimate one. Truong (2023) describes the same idea in software supply chains, where typosquatted malicious packages imitate legitimate packages as their propagation method.
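Since the human eye is easy to fool here, a script is a better judge. The following is a small sketch of a look-alike check: it folds a few digit-for-letter substitutions and compares the result against a short list of legitimate names. The substitution map and the DLL list are hypothetical and far from complete.

```python
# Digit-for-letter substitutions to fold before comparing (illustrative only).
HOMOGLYPHS = str.maketrans({"1": "l", "0": "o", "5": "s"})

# Small, hypothetical allow-list of legitimate system DLL names.
KNOWN_SYSTEM_DLLS = ("advapi32.dll", "kernel32.dll", "ntdll.dll", "user32.dll")


def lookalike_of(filename):
    """Return the legitimate name a file imitates, or None if it imitates none."""
    name = filename.lower()
    if name in KNOWN_SYSTEM_DLLS:
        return None  # exact match: this is the legitimate name itself
    folded = name.translate(HOMOGLYPHS)
    for legit in KNOWN_SYSTEM_DLLS:
        if folded == legit.translate(HOMOGLYPHS):
            return legit
    return None


for candidate in ("kernel32.dll", "kerne132.dll", "notepad.exe"):
    print(candidate, "->", lookalike_of(candidate))
```

The middle entry is the only one reported as imitating `kernel32.dll`; the real name and the unrelated file both come back as `None`.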

Overall, I’ve learned that this technique is significantly more dangerous than a simple file drop because it does not require registry persistence or other traditional persistence mechanisms. It can fool anybody. By patching the IAT of every executable on the C: drive to load kerne132.dll instead of kernel32.dll, the malware ensures that its payload is loaded by every application that calls any kernel32 API, which is effectively every Windows application.

The DLL Hijacking and IAT patching

Windows resolves DLL dependencies using a specific search order. When an application imports kernel32.dll, the loader looks for the DLL in a defined sequence of locations, starting with the application's own directory and the system directories before falling back to the directories listed in the PATH environment variable.

DLL search order hijacking exploits this mechanism by placing a malicious DLL with the same name as a legitimate one in a location that is searched first. This sample takes a more direct route: it patches the Import Address Table (IAT) of victim executables so that they load the malicious kerne132.dll instead.

Every Windows executable that imports functions contains a data structure called the Import Address Table (IAT). When the program calls CreateFile, for example, the call goes through the IAT entry that the loader filled in with the actual memory address of that function in kernel32.dll.

If the malware succeeds in modifying these import entries, the victim executable loads the malicious DLL instead of the legitimate one every time it is launched. Since the threat intelligence above points to keylogging and backdoor capabilities, the malicious DLL could log keystrokes, steal credentials, establish backdoor connections via Tor, or perform any other operation before forwarding the call to the real kernel32.dll.
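One practical consequence is that an infected executable should contain the look-alike name somewhere in its import data. The sketch below is a crude triage check under that assumption: it searches a file's raw bytes for the ASCII string `kerne132.dll`, case-insensitively, and returns the offsets. It does not parse the import table, so a hit is a lead to follow up, not proof of infection.

```python
def find_name_offsets(data: bytes, needle: bytes = b"kerne132.dll"):
    """Return every offset at which the DLL name occurs (case-insensitive)."""
    haystack = data.lower()
    needle = needle.lower()
    offsets = []
    position = haystack.find(needle)
    while position != -1:
        offsets.append(position)
        position = haystack.find(needle, position + 1)
    return offsets


# Synthetic byte strings standing in for file contents (not real executables).
patched = b"\x00KERNE132.DLL\x00abc kerne132.dll"
clean = b"\x00KERNEL32.DLL\x00"

print(find_name_offsets(patched))
print(find_name_offsets(clean))
```

On a real system, the function would be fed the bytes of each executable read from a mounted disk image, never from a live infected machine.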

This explains why Muralidharan et al. (2022) highlight IAT reconstruction as one of the key technical challenges for analysts, and why they suggest the point where the unpacking routine handles the IAT as the most efficient moment to identify the specific packer used.

8. Final IOC Table and nano notes

The following table summarises all the indicators of compromise I found during this phase. These values can be used for threat hunting, detection rule creation, or threat intelligence sharing.

Field	Lab01-01.exe	Lab01-01.dll
MD5	`bb7425b82141a1c0f7d60e5106676bb1`	`290934c61de9176ad682ffdd65f0a669`
SHA256	`58898bd42c5bd3bf9b1389f0eee5b39cd59180e8370eb9ea838a0b327bd6fe47`	`f50e42c8dfaab649bde0398867e930b86c2a599e8db83b8260393082268f2dba`
File size	16 KB	160 KB
File type	PE32 EXE (targeting Windows systems)	PE32 DLL
VT detection	56/71	38/72
Threat label	trojan.ulise/aenjaris	trojan.skeeyah/genericrxfo
Classification	Trojan / File Infector	Backdoor / Trojan
Packer	Armadillo	Armadillo
Anti-analysis	Debugger detection, user interaction check, disk space check, long sleeps	Not identified during my analysis
Network	Tor communication attempted	N/A
Mechanism	IAT patching + DLL typosquatting	Backdoor payload (kerne132.dll)
TrendMicro name	Trojan.MSIL.AENJARIS.A	N/A
Microsoft name	Trojan:Win32/Aenjaris.ROC!MTB	N/A
JoeSandbox score	56/100 MALICIOUS (100% confidence)	N/A
Integrity check	✅ PASS vs Hashes.csv	✅ PASS vs Hashes.csv
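For threat hunting or sharing, the table is more useful in a machine-readable form. Here is a minimal sketch that stores a few of the fields above as JSON and builds a hash-to-file index for quick matching; the key names (`file`, `md5`, `sha256`, and so on) are my own choice, not a standard sharing schema.

```python
import json

# Values copied from the IOC table above; key names are illustrative.
IOCS = [
    {
        "file": "Lab01-01.exe",
        "md5": "bb7425b82141a1c0f7d60e5106676bb1",
        "sha256": "58898bd42c5bd3bf9b1389f0eee5b39cd59180e8370eb9ea838a0b327bd6fe47",
        "packer": "Armadillo",
        "vt_detection": "56/71",
    },
    {
        "file": "Lab01-01.dll",
        "md5": "290934c61de9176ad682ffdd65f0a669",
        "sha256": "f50e42c8dfaab649bde0398867e930b86c2a599e8db83b8260393082268f2dba",
        "packer": "Armadillo",
        "vt_detection": "38/72",
    },
]


def index_by_hash(iocs):
    """Map every MD5 and SHA256 value to the file name it identifies."""
    index = {}
    for entry in iocs:
        index[entry["md5"].lower()] = entry["file"]
        index[entry["sha256"].lower()] = entry["file"]
    return index


def to_json(iocs):
    """Serialise the IOC list as indented JSON with stable key order."""
    return json.dumps(iocs, indent=2, sort_keys=True)


print(to_json(IOCS))
```

A hash seen during hunting can then be checked with `index_by_hash(IOCS).get(observed_hash.lower())`.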

Before moving to the next stage, I saved all my findings with nano in the report folder on the REMnux VM.

9. Self-Reflection

I expected a straightforward hash check followed by a simple VirusTotal lookup. I came out of it having documented five distinct anti-analysis techniques, two complex persistence mechanisms, a commercial-grade packer, and a file infector that contaminates every executable on a system. That gap between expectation and reality is itself a lesson.

I learned that static analysis is not a simple preliminary step you run before the “real” analysis; it is a complete analytical approach to understanding a threat actor’s behaviour. Simple Threat Intelligence gathering gave me more information and taught me more than the hands-on analysis that I have not even started yet.

I discovered anti-analysis techniques that back up what Cucci (2024) calls the adversarial relationship between malware authors and analysts. The sample is not just malicious; it is actively trying to prevent me from understanding it. Understanding these techniques will shape my analytical approach later in this lab.

Finally, one thing I would do differently next time is to query the DLL on JoeSandbox separately rather than focusing only on the EXE. The backdoor classification from AliCloud for the DLL was significant, and a dedicated sandbox run would likely reveal additional patterns to note.

10. References

Cucci, K. (2024). Evasive malware: A field guide to detecting, analyzing, and defeating advanced threats. No Starch Press.
Kaspersky. (n.d.). What is typosquatting? Kaspersky Resource Center. https://www.kaspersky.com/resource-center/definitions/what-is-typosquatting
Kleymenov, A., & Thabet, A. (2019). Mastering malware analysis: The complete malware analyst’s guide to combating malicious software, APT, cybercrime, and IoT attacks. Packt Publishing.
Muralidharan, T., Cohen, A., Hajaj, A., & Zilberman, P. (2022). File packing from the malware perspective: Techniques, analysis approaches, and directions for enhancements. ACM Computing Surveys, 55(5), Article 108. https://doi.org/10.1145/3530810
Sikorski, M., & Honig, A. (2012). Practical malware analysis: The hands-on guide to dissecting malicious software. No Starch Press.
Šťastná, J., & Tomášek, M. (2016). The problem of malware packing and its occurrence in harmless software. Acta Electrotechnica et Informatica, 16(3), 41–47. https://doi.org/10.15546/aeei-2016-0022
Trend Micro. (2021, January 29). Trojan.MSIL.AENJARIS.A. Trend Micro Threat Encyclopedia. https://www.trendmicro.com/vinfo/us/threat-encyclopedia/malware/trojan.msil.aenjaris.a
Ugarte-Pedrero, X., Balzarotti, D., Santos, I., & Bringas, P. G. (2016). RAMBO: Run-time packer analysis with multiple branch observation. In Proceedings of the 13th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2016). https://s3.eurecom.fr/docs/dimva16_multiunpack.pdf
Truong, M. T. (2023). Typosquatting attacks and mitigations (Bachelor’s thesis, Hochschule Bonn-Rhein-Sieg, University of Applied Sciences). https://assets-stage.accso.de/downloads/Bachelorarbeit-Minh-Truong_202303_Typosquatting-Attacks-and-Mitigations.pdf
Chen, R. (2023). Why is kernel32.dll running in user mode and not kernel mode, like its name implies? The Old New Thing. Microsoft. https://devblogs.microsoft.com/oldnewthing/20230926-00/?p=108824
National Institute of Standards and Technology. (2016). Guide to cyber threat information sharing (NIST Special Publication 800-150). https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-150.pdf
Sun, L., Versteeg, S., Boztas, S., & Yann, T. (2010). Pattern recognition techniques for the classification of malware packers. In R. Steinfeld & P. Hawkes (Eds.), Information Security and Privacy (ACISP 2010) (Lecture Notes in Computer Science, Vol. 6168, pp. 370–390). Springer. https://www.researchgate.net/publication/220798443_Pattern_Recognition_Techniques_for_the_Classification_of_Malware_Packers

Next entry: Lab 02c — SIEM integration and NAT simulation
Previous entry: Lab 02a — Sample Acquisition