Malware Analysis
Reading time: 14 minutes
tip
Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)
Support HackTricks
- Check the subscription plans!
- Join the 💬 Discord group or the telegram group or follow us on Twitter 🐦 @hacktricks_live.
- Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.
Forensics CheatSheets
https://www.jaiminton.com/cheatsheet/DFIR/#
Online Services
Offline Antivirus and Detection Tools
Yara
Install
sudo apt-get install -y yara
Prepare rules
Use this script to download and merge all the yara malware rules from github: https://gist.github.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9
Create the rules directory and execute it. This will create a file called malware_rules.yar which contains all the yara rules for malware.
wget https://gist.githubusercontent.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9/raw/4ec711d37f1b428b63bed1f786b26a0654aa2f31/malware_yara_rules.py
mkdir rules
python malware_yara_rules.py
Scan
yara -w malware_rules.yar image #Scan 1 file
yara -w malware_rules.yar folder #Scan the whole folder
YaraGen: Check for malware and Create rules
You can use the tool YaraGen to generate yara rules from a binary. Check out these tutorials: Part 1, Part 2, Part 3
python3 yarGen.py --update
python3.exe yarGen.py --excludegood -m ../../mals/
ClamAV
Install
sudo apt-get install -y clamav
Scan
sudo freshclam #Update rules
clamscan filepath #Scan 1 file
clamscan folderpath #Scan the whole folder
Capa
Capa detects potentially malicious capabilities in executables: PE, ELF, .NET. So it will find things such as Att&ck tactics, or suspicious capabilities such as:
- check for OutputDebugString error
- run as a service
- create process
Get it int he Github repo.
IOCs
IOC means Indicator Of Compromise. An IOC is a set of conditions that identify some potentially unwanted software or confirmed malware. Blue Teams use this kind of definition to search for this kind of malicious files in their systems and networks.
To share these definitions is very useful as when malware is identified in a computer and an IOC for that malware is created, other Blue Teams can use it to identify the malware faster.
A tool to create or modify IOCs is IOC Editor.
You can use tools such as Redline to search for defined IOCs in a device.
Loki
Loki is a scanner for Simple Indicators of Compromise.
Detection is based on four detection methods:
1. File Name IOC
Regex match on full file path/name
2. Yara Rule Check
Yara signature matches on file data and process memory
3. Hash Check
Compares known malicious hashes (MD5, SHA1, SHA256) with scanned files
4. C2 Back Connect Check
Compares process connection endpoints with C2 IOCs (new since version v.10)
Linux Malware Detect
Linux Malware Detect (LMD) is a malware scanner for Linux released under the GNU GPLv2 license, that is designed around the threats faced in shared hosted environments. It uses threat data from network edge intrusion detection systems to extract malware that is actively being used in attacks and generates signatures for detection. In addition, threat data is also derived from user submissions with the LMD checkout feature and malware community resources.
rkhunter
Tools like rkhunter can be used to check the filesystem for possible rootkits and malware.
sudo ./rkhunter --check -r / -l /tmp/rkhunter.log [--report-warnings-only] [--skip-keypress]
FLOSS
FLOSS is a tool that will try to find obfuscated strings inside executables using different techniques.
PEpper
PEpper checks some basic stuff inside the executable (binary data, entropy, URLs and IPs, some yara rules).
PEstudio
PEstudio is a tool that allows to get information of Windows executables such as imports, exports, headers, but also will check virus total and find potential Att&ck techniques.
Detect It Easy(DiE)
DiE is a tool to detect if a file is encrypted and also find packers.
NeoPI
NeoPI is a Python script that uses a variety of statistical methods to detect obfuscated and encrypted content within text/script files. The intended purpose of NeoPI is to aid in the detection of hidden web shell code.
php-malware-finder
PHP-malware-finder does its very best to detect obfuscated/dodgy code as well as files using PHP functions often used in malwares/webshells.
Apple Binary Signatures
When checking some malware sample you should always check the signature of the binary as the developer that signed it may be already related with malware.
#Get signer
codesign -vv -d /bin/ls 2>&1 | grep -E "Authority|TeamIdentifier"
#Check if the app’s contents have been modified
codesign --verify --verbose /Applications/Safari.app
#Check if the signature is valid
spctl --assess --verbose /Applications/Safari.app
Detection Techniques
File Stacking
If you know that some folder containing the files of a web server was last updated on some date. Check the date all the files in the web server were created and modified and if any date is suspicious, check that file.
Baselines
If the files of a folder shouldn't have been modified, you can calculate the hash of the original files of the folder and compare them with the current ones. Anything modified will be suspicious.
Statistical Analysis
When the information is saved in logs you can check statistics like how many times each file of a web server was accessed as a web shell might be one of the most.
Android in-app native telemetry (no root)
On Android, you can instrument native code inside the target app process by preloading a tiny logger library before other JNI libs initialize. This gives early visibility into native behavior without system-wide hooks or root. A popular approach is SoTap: drop libsotap.so for the right ABI into the APK and inject a System.loadLibrary("sotap") call early (e.g., static initializer or Application.onCreate), then collect logs from internal/external paths or Logcat fallback.
See the Android native reversing page for setup details and log paths:
Android/JNI native string deobfuscation with angr + Ghidra
Some Android malware and RASP-protected apps hide JNI method names and signatures by decoding them at runtime before calling RegisterNatives. When Frida/ptrace instrumentation is killed by anti-debug, you can still recover the plaintext offline by executing the in-binary decoder with angr and then pushing results back into Ghidra as comments.
Key idea: treat the decoder inside the .so as a callable function, execute it on the obfuscated byte blobs in .rodata, and concretize the output bytes up to the first \x00 (C-string terminator). Keep angr and Ghidra using the same image base to avoid address mismatches.
Workflow overview
- Triage in Ghidra: identify the decoder and its calling convention/arguments in JNI_OnLoad and RegisterNatives setup.
- Run angr (CPython3) to execute the decoder for each target string and dump results.
- Annotate in Ghidra: auto-comment decoded strings at each call site for fast JNI reconstruction.
Ghidra triage (JNI_OnLoad pattern)
-
Apply JNI datatypes to JNI_OnLoad so Ghidra recognises JNINativeMethod structures.
-
Typical JNINativeMethod per Oracle docs:
typedef struct { char *name; // e.g., "nativeFoo" char *signature; // e.g., "()V", "()[B" void *fnPtr; // native implementation address } JNINativeMethod;
-
Look for calls to RegisterNatives. If the library constructs the name/signature with a local routine (e.g., FUN_00100e10) that references a static byte table (e.g., DAT_00100bf4) and takes parameters like (encoded_ptr, out_buf, length), that is an ideal target for offline execution.
angr setup (execute the decoder offline)
-
Load the .so with the same base used in Ghidra (example: 0x00100000) and disable auto-loading of external libs to keep the state small.
import angr, json project = angr.Project( '/path/to/libtarget.so', load_options={'main_opts': {'base_addr': 0x00100000}}, auto_load_libs=False, ) ENCODING_FUNC_ADDR = 0x00100e10 # decoder function discovered in Ghidra def decode_string(enc_addr, length): # fresh blank state per evaluation st = project.factory.blank_state() outbuf = st.heap.allocate(length) call = project.factory.callable(ENCODING_FUNC_ADDR, base_state=st) ret_ptr = call(enc_addr, outbuf, length) # returns outbuf pointer rs = call.result_state raw = rs.solver.eval(rs.memory.load(ret_ptr, length), cast_to=bytes) return raw.split(b'\x00', 1)[0].decode('utf-8', errors='ignore') # Example: decode a JNI signature at 0x100933 of length 5 → should be ()[B print(decode_string(0x00100933, 5))
-
At scale, build a static map of call sites to the decoder’s arguments (encoded_ptr, size). Wrappers may hide arguments, so you may create this mapping manually from Ghidra xrefs if API recovery is noisy.
# call_site -> (encoded_addr, size) call_site_args_map = { 0x00100f8c: (0x00100b81, 0x41), 0x00100fa8: (0x00100bca, 0x04), 0x00100fcc: (0x001007a0, 0x41), 0x00100fe8: (0x00100933, 0x05), 0x0010100c: (0x00100c62, 0x41), 0x00101028: (0x00100c15, 0x16), 0x00101050: (0x00100a49, 0x101), 0x00100cf4: (0x00100821, 0x11), 0x00101170: (0x00100940, 0x101), 0x001011cc: (0x0010084e, 0x13), 0x00101334: (0x001007e9, 0x0f), 0x00101478: (0x0010087d, 0x15), 0x001014f8: (0x00100800, 0x19), 0x001015e8: (0x001008e6, 0x27), 0x0010160c: (0x00100c33, 0x13), } decoded_map = { hex(cs): decode_string(enc, sz) for cs, (enc, sz) in call_site_args_map.items() } print(json.dumps(decoded_map, indent=2)) with open('decoded_strings.json', 'w') as f: json.dump(decoded_map, f, indent=2)
Annotate call sites in Ghidra Option A: Jython-only comment writer (use a pre-computed JSON)
-
Since angr requires CPython3, keep deobfuscation and annotation separated. First run the angr script above to produce decoded_strings.json. Then run this Jython GhidraScript to write PRE_COMMENTs at each call site (and include the caller function name for context):
#@category Android/Deobfuscation # Jython in Ghidra 10/11 import json from ghidra.program.model.listing import CodeUnit # Ask for the JSON produced by the angr script f = askFile('Select decoded_strings.json', 'Load') mapping = json.load(open(f.absolutePath, 'r')) # keys as hex strings fm = currentProgram.getFunctionManager() rm = currentProgram.getReferenceManager() # Replace with your decoder address to locate call-xrefs (optional) ENCODING_FUNC_ADDR = 0x00100e10 enc_addr = toAddr(ENCODING_FUNC_ADDR) callsite_to_fn = {} for ref in rm.getReferencesTo(enc_addr): if ref.getReferenceType().isCall(): from_addr = ref.getFromAddress() fn = fm.getFunctionContaining(from_addr) if fn: callsite_to_fn[from_addr.getOffset()] = fn.getName() # Write comments from JSON for k_hex, s in mapping.items(): cs = int(k_hex, 16) site = toAddr(cs) caller = callsite_to_fn.get(cs, None) text = s if caller is None else '%s @ %s' % (s, caller) currentProgram.getListing().setComment(site, CodeUnit.PRE_COMMENT, text) print('[+] Annotated %d call sites' % len(mapping))
Option B: Single CPython script via pyhidra/ghidra_bridge
- Alternatively, use pyhidra or ghidra_bridge to drive Ghidra’s API from the same CPython process running angr. This allows calling decode_string() and immediately setting PRE_COMMENTs without an intermediate file. The logic mirrors the Jython script: build callsite→function map via ReferenceManager, decode with angr, and set comments.
Why this works and when to use it
- Offline execution sidesteps RASP/anti-debug: no ptrace, no Frida hooks required to recover strings.
- Keeping Ghidra and angr base_addr aligned (e.g., 0x00100000) ensures that function/data addresses match across tools.
- Repeatable recipe for decoders: treat the transform as a pure function, allocate an output buffer in a fresh state, call it with (encoded_ptr, out_ptr, len), then concretize via state.solver.eval and parse C-strings up to \x00.
Notes and pitfalls
- Respect the target ABI/calling convention. angr.factory.callable picks one based on arch; if arguments look shifted, specify cc explicitly.
- If the decoder expects zeroed output buffers, initialize outbuf with zeros in the state before the call.
- For position-independent Android .so, always supply base_addr so addresses in angr match those seen in Ghidra.
- Use currentProgram.getReferenceManager() to enumerate call-xrefs even if the app wraps the decoder behind thin stubs.
For angr basics, see: angr basics
Deobfuscating Dynamic Control-Flow (JMP/CALL RAX Dispatchers)
Modern malware families heavily abuse Control-Flow Graph (CFG) obfuscation: instead of a direct jump/call they compute the destination at run-time and execute a jmp rax
or call rax
. A small dispatcher (typically nine instructions) sets the final target depending on the CPU ZF
/CF
flags, completely breaking static CFG recovery.
The technique – showcased by the SLOW#TEMPEST loader – can be defeated with a three-step workflow that only relies on IDAPython and the Unicorn CPU emulator.
1. Locate every indirect jump / call
import idautils, idc
for ea in idautils.FunctionItems(idc.here()):
mnem = idc.print_insn_mnem(ea)
if mnem in ("jmp", "call") and idc.print_operand(ea, 0) == "rax":
print(f"[+] Dispatcher found @ {ea:X}")
2. Extract the dispatcher byte-code
import idc
def get_dispatcher_start(jmp_ea, count=9):
s = jmp_ea
for _ in range(count):
s = idc.prev_head(s, 0)
return s
start = get_dispatcher_start(jmp_ea)
size = jmp_ea + idc.get_item_size(jmp_ea) - start
code = idc.get_bytes(start, size)
open(f"{start:X}.bin", "wb").write(code)
3. Emulate it twice with Unicorn
from unicorn import *
from unicorn.x86_const import *
import struct
def run(code, zf=0, cf=0):
BASE = 0x1000
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(BASE, 0x1000)
mu.mem_write(BASE, code)
mu.reg_write(UC_X86_REG_RFLAGS, (zf << 6) | cf)
mu.reg_write(UC_X86_REG_RAX, 0)
mu.emu_start(BASE, BASE+len(code))
return mu.reg_read(UC_X86_REG_RAX)
Run run(code,0,0)
and run(code,1,1)
to obtain the false and true branch targets.
4. Patch back a direct jump / call
import struct, ida_bytes
def patch_direct(ea, target, is_call=False):
op = 0xE8 if is_call else 0xE9 # CALL rel32 or JMP rel32
disp = target - (ea + 5) & 0xFFFFFFFF
ida_bytes.patch_bytes(ea, bytes([op]) + struct.pack('<I', disp))
After patching, force IDA to re-analyse the function so the full CFG and Hex-Rays output are restored:
import ida_auto, idaapi
idaapi.reanalyze_function(idc.get_func_attr(ea, idc.FUNCATTR_START))
5. Label indirect API calls
Once the real destination of every call rax
is known you can tell IDA what it is so parameter types & variable names are recovered automatically:
idc.set_callee_name(call_ea, resolved_addr, 0) # IDA 8.3+
Practical benefits
- Restores the real CFG → decompilation goes from 10 lines to thousands.
- Enables string-cross-reference & xrefs, making behaviour reconstruction trivial.
- Scripts are reusable: drop them into any loader protected by the same trick.
AdaptixC2: Configuration Extraction and TTPs
See the dedicated page:
Adaptixc2 Config Extraction And Ttps
References
- Unit42 – Evolving Tactics of SLOW#TEMPEST: A Deep Dive Into Advanced Malware Techniques
- SoTap: Lightweight in-app JNI (.so) behavior logger – github.com/RezaArbabBot/SoTap
- Strategies for Analyzing Native Code in Android Applications: Combining Ghidra and Symbolic Execution for Code Decryption and Deobfuscation – revflash.medium.com
- Ghidra – github.com/NationalSecurityAgency/ghidra
- angr – angr.io
- JNI_OnLoad and invocation API – docs.oracle.com
- RegisterNatives – docs.oracle.com
- Tracing JNI Functions – valsamaras.medium.com
- Native Enrich: Scripting Ghidra and Frida to discover hidden JNI functions – laripping.com
- Unit42 – AdaptixC2: A New Open-Source Framework Leveraged in Real-World Attacks
tip
Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)
Support HackTricks
- Check the subscription plans!
- Join the 💬 Discord group or the telegram group or follow us on Twitter 🐦 @hacktricks_live.
- Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.