Manual De-obfuscation Techniques
De-obfuscation, the process of making obscured code understandable again, is a core task in software security. This guide covers static and dynamic de-obfuscation strategies, how to recognize obfuscation patterns, LLM-assisted tooling, and further resources for those interested in more advanced topics.
Strategies for Static De-obfuscation
When dealing with obfuscated code, several strategies can be employed depending on the nature of the obfuscation:
- DEX bytecode (Java): One effective approach involves identifying the application's de-obfuscation methods, then replicating these methods in a Java file. This file is executed to reverse the obfuscation on the targeted elements.
- Java and Native Code: Another method is to translate the de-obfuscation algorithm into a scripting language like Python. This strategy highlights that the primary goal is not to fully understand the algorithm but to execute it effectively.
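As a minimal sketch of the second strategy, the snippet below ports a string decoder to Python. The scheme (Base64 followed by a repeating-key XOR) and the key `k3y` are hypothetical stand-ins for whatever the decompiled routine really does; the point is only that the routine is re-executed, not fully understood.

```python
import base64

# Hypothetical key and scheme: replace both with what the decompiled
# de-obfuscation method actually uses.
KEY = b"k3y"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def deobfuscate(encoded: str, key: bytes = KEY) -> str:
    """Port of the assumed Java routine: Base64-decode, XOR, decode as UTF-8."""
    return xor_bytes(base64.b64decode(encoded), key).decode("utf-8")


def obfuscate(plain: str, key: bytes = KEY) -> str:
    """Inverse helper, only used to build sample inputs for this sketch."""
    return base64.b64encode(xor_bytes(plain.encode("utf-8"), key)).decode("ascii")


if __name__ == "__main__":
    sample = obfuscate("http://example.com/c2")
    print(sample, "->", deobfuscate(sample))
```

In practice, the encoded literals would be collected from the decompiled sources and passed through `deobfuscate` in bulk.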
Identifying Obfuscation
Recognizing obfuscated code is the first step in the de-obfuscation process. Key indicators include:
- The absence or scrambling of strings in Java and Android, which may suggest string obfuscation.
- The presence of binary files in the assets directory or calls to DexClassLoader, hinting at code unpacking and dynamic loading.
- The use of native libraries alongside unidentifiable JNI functions, indicating potential obfuscation of native methods.
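The indicators above can be approximated with a few regular expressions run over decompiled sources. This is a rough triage sketch: the pattern set and names such as `INDICATORS` are assumptions made for illustration, and hits only suggest where to look, they do not prove obfuscation.

```python
import re
from pathlib import Path

# Heuristic patterns for the indicators listed above (illustrative assumptions).
INDICATORS = {
    # Dynamic code loading (DexClassLoader is named in the text above).
    "dynamic_loading": re.compile(r"\b(?:DexClassLoader|InMemoryDexClassLoader)\b"),
    # Loading of native libraries.
    "native_library": re.compile(r"\bSystem\.load(?:Library)?\s*\("),
    # Declarations of JNI methods.
    "native_method": re.compile(r"\bnative\s+[\w<>\[\].]+\s+\w+\s*\("),
    # One- or two-letter class names, typical of identifier renaming.
    "short_class_name": re.compile(r"\bclass\s+[a-zA-Z]{1,2}\b"),
}


def scan_source(java_source: str) -> dict:
    """Return the number of hits per indicator for one decompiled file."""
    return {
        name: sum(1 for _ in rx.finditer(java_source))
        for name, rx in INDICATORS.items()
    }


def scan_tree(root: str) -> dict:
    """Aggregate indicator hits over every .java file below root."""
    totals = {name: 0 for name in INDICATORS}
    for path in Path(root).rglob("*.java"):
        hits = scan_source(path.read_text(encoding="utf-8", errors="ignore"))
        for name, count in hits.items():
            totals[name] += count
    return totals
```

Calling `scan_tree("input_dir/")` on a decompiled project gives a quick overview of which indicators deserve a closer manual look.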
Dynamic Analysis in De-obfuscation
By executing the code in a controlled environment, dynamic analysis allows for the observation of how the obfuscated code behaves in real time. This method is particularly effective in uncovering the inner workings of complex obfuscation patterns that are designed to hide the true intent of the code.
Applications of Dynamic Analysis
- Runtime Decryption: Many obfuscation techniques involve encrypting strings or code segments that only get decrypted at runtime. Through dynamic analysis, these encrypted elements can be captured at the moment of decryption, revealing their true form.
- Identifying Obfuscation Techniques: By monitoring the application's behavior, dynamic analysis can help identify specific obfuscation techniques being used, such as code virtualization, packers, or dynamic code generation.
- Uncovering Hidden Functionality: Obfuscated code may contain hidden functionalities that are not apparent through static analysis alone. Dynamic analysis allows for the observation of all code paths, including those conditionally executed, to uncover such hidden functionalities.
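Runtime decryption can be observed with an instrumentation framework such as Frida. The sketch below uses the Frida Python bindings to hook every overload of a decryption method and print its return value. The package, class and method names are hypothetical, and it assumes a device or emulator running frida-server plus a Frida runtime that exposes the `Java` bridge to scripts (recent major versions may require bundling the bridge separately).

```python
import json
import sys


def build_hook_script(class_name: str, method_name: str) -> str:
    """Return Frida JavaScript that reports the result of every overload."""
    return """
Java.perform(function () {
  var target = Java.use(%(cls)s);
  target[%(method)s].overloads.forEach(function (overload) {
    overload.implementation = function () {
      var result = overload.apply(this, arguments);
      send({ method: %(method)s, result: String(result) });
      return result;
    };
  });
});
""" % {"cls": json.dumps(class_name), "method": json.dumps(method_name)}


def on_message(message, data):
    """Print payloads sent by the hook; show anything else (e.g. errors) raw."""
    if message.get("type") == "send":
        print("[hook]", message["payload"])
    else:
        print("[frida]", message)


def trace(package: str, class_name: str, method_name: str) -> None:
    """Spawn the app, install the hook, and stream decrypted values."""
    import frida  # third-party: pip install frida

    device = frida.get_usb_device()
    pid = device.spawn([package])
    session = device.attach(pid)
    script = session.create_script(build_hook_script(class_name, method_name))
    script.on("message", on_message)
    script.load()
    device.resume(pid)
    sys.stdin.read()  # keep the process alive until stdin is closed


if __name__ == "__main__":
    # Hypothetical targets: replace with names found during static analysis.
    trace("com.example.app", "com.example.app.a", "b")
```

Hooking all overloads avoids having to know the exact method signature up front, which is convenient when the decompiler output is unreliable.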
Automated De-obfuscation with LLMs (Androidmeda)
While the previous sections focus on fully manual strategies, in 2025 a new class of Large-Language-Model (LLM) powered tooling emerged that can automate most of the tedious renaming and control-flow recovery work.
One representative project is Androidmeda, a Python utility that takes decompiled Java sources (e.g. produced by jadx) and returns a greatly cleaned-up, commented and security-annotated version of the code.
Key capabilities
- Renames meaningless identifiers generated by ProGuard / DexGuard / DashO / Allatori / … to semantic names.
- Detects and restructures control-flow flattening, replacing opaque switch-case state machines with normal loops / if-else constructs.
- Decrypts common string encryption patterns when possible.
- Injects inline comments that explain the purpose of complex blocks.
- Performs a lightweight static security scan and writes the findings to vuln_report.json with severity levels (informational → critical).
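To make the control-flow flattening item concrete, the sketch below shows the same routine twice: once as an opaque state machine, and once as the plain loop a de-obfuscator aims to recover. It illustrates the pattern only; it is not Androidmeda output, and the checksum logic is a made-up example.

```python
def checksum_flattened(data: bytes) -> int:
    """Flattened form: a dispatcher loop driven by an opaque state variable."""
    state, i, acc = 0, 0, 0
    while state != 3:
        if state == 0:      # loop condition
            state = 1 if i < len(data) else 3
        elif state == 1:    # loop body
            acc = (acc * 31 + data[i]) & 0xFFFFFFFF
            state = 2
        elif state == 2:    # increment
            i += 1
            state = 0
    return acc


def checksum_recovered(data: bytes) -> int:
    """Recovered form: the same computation as an ordinary loop."""
    acc = 0
    for byte in data:
        acc = (acc * 31 + byte) & 0xFFFFFFFF
    return acc


if __name__ == "__main__":
    sample = b"flattened"
    print(checksum_flattened(sample) == checksum_recovered(sample))
```

Comparing both versions on a few inputs is also a cheap way to sanity-check any automated or manual restructuring.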
Installation
git clone https://github.com/In3tinct/Androidmeda
cd Androidmeda
pip3 install -r requirements.txt
Preparing the inputs
- Decompile the target APK with jadx (or any other decompiler) and keep only the source directory that contains the .java files:
jadx -d input_dir/ target.apk
- (Optional) Trim input_dir/ so that it only contains the application packages you want to analyse; this massively speeds up processing and reduces LLM costs.
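The optional trimming step can be scripted. The helper below copies only selected package subtrees into a fresh directory; `trim_sources` and the package name are hypothetical, and it assumes the usual jadx layout where Java files end up under a `sources/` subdirectory, falling back to the root otherwise.

```python
import shutil
from pathlib import Path


def trim_sources(input_dir: str, output_dir: str, packages: list[str]) -> list[Path]:
    """Copy only the selected package subtrees into output_dir."""
    src_root = Path(input_dir)
    copied = []
    for package in packages:
        rel = Path(*package.split("."))
        # Try <input_dir>/sources/<pkg> first, then <input_dir>/<pkg>.
        for base in (src_root / "sources", src_root):
            candidate = base / rel
            if candidate.is_dir():
                dest = Path(output_dir) / rel
                shutil.copytree(candidate, dest, dirs_exist_ok=True)
                copied.append(dest)
                break
    return copied


if __name__ == "__main__":
    # Hypothetical package name: use the application's own package(s).
    print(trim_sources("input_dir", "trimmed_dir", ["com.example.app"]))
```

The trimmed directory can then be passed as the source directory instead of the full decompile.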
Usage examples
Remote provider (Gemini-1.5-flash):
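# Set the API key for the chosen provider; the variable name may differ for non-OpenAI providers (check the project README).
export OPENAI_API_KEY=<your_key>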
python3 androidmeda.py \
--llm_provider google \
--llm_model gemini-1.5-flash \
--source_dir input_dir/ \
--output_dir out/ \
--save_code true
Offline (local ollama backend with llama3.2):
python3 androidmeda.py \
--llm_provider ollama \
--llm_model llama3.2 \
--source_dir input_dir/ \
--output_dir out/ \
--save_code true
Output
- out/vuln_report.json: JSON array with file, line, issue, severity.
- A mirrored package tree with de-obfuscated .java files (only if --save_code true).
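The report can be triaged with a few lines of Python. This sketch relies only on the four fields listed above; the intermediate severity names in `SEVERITY_ORDER` are an assumption, since only the two ends of the scale (informational and critical) are stated.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed ordering: only "informational" and "critical" are confirmed levels.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "informational"]


def load_findings(report_path: str) -> list[dict]:
    """Read the JSON array of findings from the report file."""
    return json.loads(Path(report_path).read_text(encoding="utf-8"))


def severity_rank(finding: dict) -> int:
    """Lower rank means more severe; unknown levels sort last."""
    severity = str(finding.get("severity", "")).lower()
    if severity in SEVERITY_ORDER:
        return SEVERITY_ORDER.index(severity)
    return len(SEVERITY_ORDER)


def summarize(findings: list[dict]):
    """Return per-severity counts and the findings sorted by severity."""
    counts = Counter(str(f.get("severity", "unknown")).lower() for f in findings)
    return counts, sorted(findings, key=severity_rank)


if __name__ == "__main__":
    counts, ordered = summarize(load_findings("out/vuln_report.json"))
    print(dict(counts))
    for f in ordered:
        print(f.get("severity"), f.get("file"), f.get("line"), f.get("issue"))
```

Sorting by severity first makes the manual review of the report quicker, since the most serious claims are checked against the code before anything else.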
Tips & troubleshooting
- Skipped class ⇒ usually caused by an unparsable method; isolate the package or update the parser regex.
- Slow run-time / high token usage ⇒ point --source_dir to specific app packages instead of the entire decompile.
- Always manually review the vulnerability report: LLM hallucinations can lead to false positives / negatives.
Practical value – Crocodilus malware case study
Feeding a heavily obfuscated sample from the 2025 Crocodilus banking trojan through Androidmeda reduced analysis time from hours to minutes: the tool recovered call-graph semantics, revealed calls to accessibility APIs and hard-coded C2 URLs, and produced a concise report that could be imported into analysts’ dashboards.
References and Further Reading
- BlackHat USA 2018: “Unpacking the Packed Unpacker: Reverse Engineering an Android Anti-Analysis Library” [video]
  - This talk goes over reverse engineering one of the most complex anti-analysis native libraries I’ve seen used by an Android application. It covers mostly obfuscation techniques in native code.
- REcon 2019: “The Path to the Payload: Android Edition” [video]
  - This talk discusses a series of obfuscation techniques, solely in Java code, that an Android botnet was using to hide its behavior.
- Deobfuscating Android Apps with Androidmeda (blog post), mobile-hacker.com
- Androidmeda source code: https://github.com/In3tinct/Androidmeda