Structural File‑Format Exploit Detection (0‑Click Chains)


tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

This page summarizes practical techniques to detect 0‑click mobile exploit files by validating structural invariants of their formats instead of relying on byte signatures. The approach generalizes across samples, polymorphic variants, and future exploits that abuse the same parser logic.

Key idea: encode structural impossibilities and cross‑field inconsistencies that only appear when a vulnerable decoder/parser state is reached.

See also:

PDF File analysis

Why structure, not signatures

When weaponized samples are unavailable and payload bytes mutate, traditional IOC/YARA patterns tend to fail. Structural detection instead compares the container’s declared layout against what the format specification and its implementations can mathematically or semantically produce.

Typical checks:

  • Validate table sizes and bounds derived from the spec and safe implementations
  • Flag illegal/undocumented opcodes or state transitions in embedded bytecode
  • Cross‑check metadata against the components actually present in the encoded stream
  • Detect contradictory fields that indicate parser confusion or integer‑overflow setups

Below are concrete, field‑tested patterns for multiple high‑impact chains.


PDF/JBIG2 – FORCEDENTRY (CVE‑2021‑30860)

Target: JBIG2 symbol dictionaries embedded inside PDFs (FORCEDENTRY delivered them through iMessage attachment parsing).

Structural signals:

  • Contradictory dictionary state that cannot occur in benign content but is required to trigger the overflow in arithmetic decoding.
  • Suspicious use of global segments combined with abnormal symbol counts during refinement coding.

Pseudo‑logic:

pseudo
# Detecting impossible dictionary state used by FORCEDENTRY
if input_symbols_count == 0 and (ex_syms > 0 and ex_syms < 4):
    mark_malicious("JBIG2 impossible symbol dictionary state")

Practical triage:

  • Identify and extract JBIG2 streams from the PDF
    • pdfid/pdf-parser/peepdf to locate and dump streams
  • Verify arithmetic coding flags and symbol dictionary parameters against the JBIG2 spec
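
The first triage step can be approximated without a full PDF parser. The sketch below is a minimal, assumption-heavy illustration: it scans raw bytes for indirect objects whose dictionary names the JBIG2Decode filter. The function name `find_jbig2_objects` is hypothetical, and the scan will miss objects stored inside compressed object streams or filter names obfuscated with `#xx` escapes, so it complements rather than replaces pdf-parser or peepdf.

```python
import re

# Matches an indirect object header such as "12 0 obj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")


def find_jbig2_objects(pdf_bytes):
    """Return [(object_number, file_offset)] for objects that name /JBIG2Decode.

    Raw-byte heuristic only: no xref handling, no object-stream decompression.
    """
    hits = []
    for match in OBJ_RE.finditer(pdf_bytes):
        end = pdf_bytes.find(b"endobj", match.end())
        body = pdf_bytes[match.end(): end if end != -1 else len(pdf_bytes)]
        # Inspect only the dictionary part, i.e. everything before the stream data.
        dictionary = body.split(b"stream", 1)[0]
        if b"/JBIG2Decode" in dictionary:
            hits.append((int(match.group(1)), match.start()))
    return hits
```

Each reported offset marks the start of an object whose stream can then be dumped and checked against the symbol dictionary conditions described above.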

Notes:

  • Works without embedded payload signatures
  • Low false‑positive rate in practice because the flagged state is mathematically inconsistent

WebP/VP8L – BLASTPASS (CVE‑2023‑4863)

Target: WebP lossless (VP8L) Huffman prefix‑code tables.

Structural signals:

  • Total size of constructed Huffman tables exceeds the safe upper bound expected by the reference/patched implementations, implying the overflow precondition.

Pseudo‑logic:

pseudo
# Detect malformed Huffman table construction triggering overflow
total_size = sum(table_sizes)
if total_size > 2954:   # example bound: 2300 + 654 (libwebp FIXED_TABLE_SIZE plus the no-color-cache increment)
    mark_malicious("VP8L oversized Huffman tables")

Practical triage:

  • Check WebP container chunks: VP8X + VP8L
  • Parse VP8L prefix codes and compute actual allocated table sizes
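
Before any prefix-code analysis, the VP8L payload has to be located inside the RIFF container. The following is a minimal sketch of that step, assuming the standard RIFF layout (four-byte tag, little-endian 32-bit size, payload padded to an even length) and the 5-byte VP8L header. The function names are illustrative, and the sketch deliberately stops before Huffman table reconstruction, which needs a real bit reader.

```python
import struct


def iter_webp_chunks(data):
    """Yield (fourcc, payload) for each top-level chunk of a RIFF/WebP file."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WebP container")
    pos = 12
    while pos + 8 <= len(data):
        fourcc = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        payload = data[pos + 8:pos + 8 + size]
        if len(payload) < size:
            # A declared size larger than the file is itself a structural anomaly.
            raise ValueError("chunk %r overruns the file" % fourcc)
        yield fourcc, payload
        pos += 8 + size + (size & 1)  # odd-sized payloads carry one pad byte


def vp8l_header(payload):
    """Decode the VP8L header: signature 0x2F, then 14+14+1+3 bits, LSB first."""
    if len(payload) < 5 or payload[0] != 0x2F:
        raise ValueError("bad VP8L signature")
    (bits,) = struct.unpack_from("<I", payload, 1)
    return {
        "width": (bits & 0x3FFF) + 1,
        "height": ((bits >> 14) & 0x3FFF) + 1,
        "alpha": bool((bits >> 28) & 1),
        "version": bits >> 29,
    }
```

A scanner would feed the payload following these five bytes into its prefix-code parser and apply the table-size bound there.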

Notes:

  • Robust against byte‑level polymorphism of the payload
  • Bound is derived from upstream limits/patch analysis

TrueType – TRIANGULATION (CVE‑2023‑41990)

Target: TrueType bytecode inside fpgm/prep/glyf programs.

Structural signals:

  • Presence of undocumented/forbidden opcodes that Apple’s TrueType interpreter accepted and the exploit chain relied on.

Pseudo‑logic:

pseudo
# Flag undocumented TrueType opcodes leveraged by TRIANGULATION
switch opcode:
  case 0x8F, 0x90:
    mark_malicious("Undocumented TrueType bytecode")
  default:
    continue

Practical triage:

  • Dump font tables (e.g., using fontTools/ttx) and scan fpgm/prep/glyf programs
  • No need to fully emulate the interpreter to get value from presence checks
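
A presence check still has to walk the instruction stream, because the operand bytes of push instructions can legitimately contain 0x8F or 0x90. The sketch below is one possible way to do that: it skips the inline data of NPUSHB (0x40), NPUSHW (0x41), PUSHB[n] (0xB0 to 0xB7) and PUSHW[n] (0xB8 to 0xBF) and reports only bytes that sit in opcode position. The function name and return shape are assumptions made for illustration, and the opcode set is copied from the pseudo‑logic above.

```python
SUSPICIOUS_OPCODES = {0x8F, 0x90}  # values taken from the pseudo-logic above


def scan_tt_program(program):
    """Return [(offset, opcode)] for suspicious opcodes in a TrueType program.

    Operand bytes of push instructions are skipped so data is not misread as code.
    """
    hits = []
    pc = 0
    end = len(program)
    while pc < end:
        op = program[pc]
        pc += 1
        if op == 0x40:                    # NPUSHB: count byte, then `count` bytes
            if pc >= end:
                break
            pc += 1 + program[pc]
        elif op == 0x41:                  # NPUSHW: count byte, then `count` words
            if pc >= end:
                break
            pc += 1 + 2 * program[pc]
        elif 0xB0 <= op <= 0xB7:          # PUSHB[n]: n + 1 operand bytes
            pc += op - 0xB0 + 1
        elif 0xB8 <= op <= 0xBF:          # PUSHW[n]: n + 1 operand words
            pc += 2 * (op - 0xB8 + 1)
        elif op in SUSPICIOUS_OPCODES:
            hits.append((pc - 1, op))
    return hits
```

With fontTools, the raw bytes of a table program should be obtainable through something like `TTFont(path)["fpgm"].program.getBytecode()`; treat that call as an assumption to verify against the installed version.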

Notes:

  • May produce rare false positives if nonstandard fonts include unknown opcodes; validate with secondary tooling

DNG/TIFF – CVE‑2025‑43300

Target: DNG/TIFF image metadata vs. the actual component count in the encoded stream (e.g., JPEG‑Lossless SOF3).

Structural signals:

  • Inconsistency between EXIF/IFD fields (SamplesPerPixel, PhotometricInterpretation) and the component count parsed from the image stream header used by the pipeline.

Pseudo‑logic:

pseudo
# Metadata claims 2 samples per pixel but stream header exposes only 1 component
if samples_per_pixel == 2 and sof3_components == 1:
    mark_malicious("DNG/TIFF metadata vs. stream mismatch")

Practical triage:

  • Parse primary IFD and EXIF tags
  • Locate and parse the embedded JPEG‑Lossless header (SOF3) and compare component counts
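
The two triage steps can be sketched with nothing but `struct`. The code below is a simplified illustration under several assumptions: it handles classic TIFF only (magic 42, not BigTIFF), reads SamplesPerPixel (tag 277) from IFD0 and its direct SubIFDs (tag 330), and finds the SOF3 header with a naive byte search for `FF C3`, which can match inside compressed data. All function names are hypothetical, and a production check should pair each IFD with its own strip or tile data instead of searching the whole file.

```python
import struct

TAG_SAMPLES_PER_PIXEL = 277
TAG_SUB_IFDS = 330


def _entries(data, offset, bo):
    """Yield (tag, type, count, raw_value_field) for the IFD at `offset`."""
    (n,) = struct.unpack_from(bo + "H", data, offset)
    for i in range(n):
        yield struct.unpack_from(bo + "HHI4s", data, offset + 2 + 12 * i)


def samples_per_pixel_values(data):
    """Collect SamplesPerPixel values from IFD0 and its direct SubIFDs."""
    bo = {b"II": "<", b"MM": ">"}.get(data[:2])
    if bo is None or struct.unpack_from(bo + "H", data, 2)[0] != 42:
        raise ValueError("not a classic TIFF header")
    (ifd0,) = struct.unpack_from(bo + "I", data, 4)
    pending, found = [(ifd0, True)], []
    while pending:
        offset, follow_subifds = pending.pop()
        for tag, _type, count, raw in _entries(data, offset, bo):
            if tag == TAG_SAMPLES_PER_PIXEL:
                found.append(struct.unpack_from(bo + "H", raw)[0])
            elif tag == TAG_SUB_IFDS and follow_subifds and 0 < count <= 64:
                if count == 1:
                    subs = struct.unpack_from(bo + "I", raw)  # offset stored inline
                else:
                    (ptr,) = struct.unpack_from(bo + "I", raw)
                    subs = struct.unpack_from(bo + "%dI" % count, data, ptr)
                pending.extend((sub, False) for sub in subs)
    return found


def sof3_component_count(data):
    """Return Nf from the first SOF3 (0xFFC3) frame header, or None if absent."""
    idx = data.find(b"\xff\xc3")
    if idx == -1 or idx + 10 > len(data):
        return None
    # Layout: marker(2) length(2) precision(1) height(2) width(2) Nf(1)
    return data[idx + 9]


def metadata_stream_mismatch(data):
    """Mirror the pseudo-logic: metadata claims 2 samples, stream has 1 component."""
    return sof3_component_count(data) == 1 and 2 in samples_per_pixel_values(data)
```

Truncated files raise `struct.error` here; a scanner would likely treat that as a separate "malformed" verdict rather than a detection.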

Notes:

  • Reported as exploited in the wild, which makes it a strong candidate for structural consistency checks

Implementation patterns and performance

A practical scanner should:

  • Auto‑detect file type and dispatch only relevant analyzers (PDF/JBIG2, WebP/VP8L, TTF, DNG/TIFF)
  • Stream/partial‑parse to minimize allocations and enable early termination
  • Run analyses in parallel (thread‑pool) for bulk triage
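
The dispatch and parallelism points can be illustrated as follows. This is a rough sketch, not how ElegantBouncer is implemented: it identifies files by their leading bytes because extensions cannot be trusted, routes each file to a placeholder analyzer, and fans out over a thread pool. The names `sniff_type`, `ANALYZERS`, `scan_file` and `scan_tree` are hypothetical, and the analyzers are stubs that return no findings.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def sniff_type(header):
    """Map leading bytes to a coarse file type, or None if unrecognized."""
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    if header[:4] in (b"\x00\x01\x00\x00", b"true"):
        return "ttf"
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"  # DNG shares the TIFF header
    return None


# Placeholder analyzers: each takes a path and returns a list of findings.
ANALYZERS = {
    "pdf": lambda path: [],
    "webp": lambda path: [],
    "ttf": lambda path: [],
    "tiff": lambda path: [],
}


def scan_file(path):
    """Read only the header, then run the single analyzer that applies."""
    with open(path, "rb") as fh:
        header = fh.read(16)
    kind = sniff_type(header)
    if kind is None:
        return path, None, []
    return path, kind, ANALYZERS[kind](path)


def scan_tree(root, workers=8):
    """Scan every regular file under `root` using a thread pool."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_file, files))
```

In CPython, threads mainly help with I/O wait; CPU-heavy parsers written in pure Python would probably benefit more from a process pool.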

Example workflow with ElegantBouncer (open‑source Rust implementation of these checks):

bash
# Scan a path recursively with structural detectors
$ elegant-bouncer --scan /path/to/directory

# Optional TUI for parallel scanning and real‑time alerts
$ elegant-bouncer --tui --scan /path/to/samples

DFIR tips and edge cases

  • Embedded objects: PDFs may embed images (JBIG2) and fonts (TrueType); extract and recursively scan
  • Decompression safety: use libraries that hard‑limit tables/buffers before allocation
  • False positives: keep rules conservative and favor contradictions that are impossible under the spec
  • Version drift: re‑baseline bounds (e.g., VP8L table sizes) when upstream parsers change limits

Related tools

  • ElegantBouncer – structural scanner for the detections above
  • pdfid/pdf-parser/peepdf – PDF object extraction and static analysis
  • pdfcpu – PDF linter/sanitizer
  • fontTools/ttx – dump TrueType tables and bytecode
  • exiftool – read TIFF/DNG/EXIF metadata
  • dwebp/webpmux – parse WebP metadata and chunks
