Homograph / Homoglyph Attacks in Phishing

Reading time: 5 minutes

tip

Jifunze na fanya mazoezi ya AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Jifunze na fanya mazoezi ya GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE) Jifunze na fanya mazoezi ya Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Overview

Shambulio la homograph (pia inajulikana kama homoglyph) linatumia ukweli kwamba mifumo mingi ya Unicode kutoka kwa maandiko yasiyo ya Kilatini ni sawa kabisa au yanafanana sana na wahusika wa ASCII. Kwa kubadilisha wahusika mmoja au zaidi wa Kilatini na wenzao wanaofanana, mshambuliaji anaweza kuunda:

  • Majina ya kuonyesha, mada au maudhui ya ujumbe yanayoonekana kuwa halali kwa jicho la binadamu lakini yanapita kwenye ugunduzi wa msingi wa maneno muhimu.
  • Majina ya kikoa, sub-kikoa au njia za URL ambazo zinawadanganya waathirika kuamini wanatembelea tovuti ya kuaminika.

Kwa sababu kila glyph inatambulika ndani kwa nambari ya Unicode, wahusika mmoja tu aliyebadilishwa inatosha kushinda kulinganisha nyuzi zisizo na busara (mfano, "Παypal.com" dhidi ya "Paypal.com").

Typical Phishing Workflow

  1. Craft message content – Badilisha herufi maalum za Kilatini katika chapa / neno muhimu linalojulikana na wahusika wasioonekana kutoka kwa maandiko mengine (Kigiriki, Kirusi, Kiarumeni, Cherokee, nk.).
  2. Register supporting infrastructure – Kurekebisha kikoa cha homoglyph na kupata cheti cha TLS (zaidi ya CAs hazifanyi ukaguzi wa kufanana kwa kuona).
  3. Send email / SMS – Ujumbe unajumuisha homoglyphs katika moja au zaidi ya maeneo yafuatayo:
  • Jina la kuonyesha la mtumaji (mfano, Ηеlрdеѕk)
  • Mstari wa mada (Urgеnt Аctіon Rеquіrеd)
  • Maandishi ya kiungo au jina kamili la kikoa
  1. Redirect chain – Mwathirika anarudishwa kupitia tovuti zinazonekana kuwa salama au wafupishaji wa URL kabla ya kutua kwenye mwenyeji mbaya anayekusanya akidi / kupeleka malware.

Unicode Ranges Commonly Abused

ScriptRangeExample glyphLooks like
GreekU+0370-03FFΗ (U+0397)Latin H
GreekU+0370-03FFρ (U+03C1)Latin p
CyrillicU+0400-04FFа (U+0430)Latin a
CyrillicU+0400-04FFе (U+0435)Latin e
ArmenianU+0530-058Fօ (U+0585)Latin o
CherokeeU+13A0-13FF (U+13A2)Latin T

Tip: Full Unicode charts are available at unicode.org.

Detection Techniques

1. Mixed-Script Inspection

Barua pepe za phishing zinazolenga shirika linalozungumza Kiingereza zinapaswa nadra kuchanganya wahusika kutoka kwa mifumo mingi. Heuristics rahisi lakini yenye ufanisi ni:

  1. Pitia kila wahusika wa nyuzi inayokaguliwa.
  2. Ramani ya nambari ya nambari kwa kizuizi chake cha Unicode.
  3. Pandisha arifa ikiwa mifumo zaidi ya mmoja ipo au ikiwa mifumo isiyo ya Kilatini inaonekana mahali ambapo haitarajiwi (jina la kuonyesha, kikoa, mada, URL, nk.).

Python proof-of-concept:

python
import unicodedata as ud
from collections import defaultdict

SUSPECT_FIELDS = {
"display_name": "Ηоmоgraph Illusion",     # example data
"subject": "Finаnꮯiаl Տtatеmеnt",
"url": "https://xn--messageconnecton-2kb.blob.core.windows.net"  # punycode
}

for field, value in SUSPECT_FIELDS.items():
blocks = defaultdict(int)
for ch in value:
if ch.isascii():
blocks['Latin'] += 1
else:
name = ud.name(ch, 'UNKNOWN')
block = name.split(' ')[0]     # e.g., 'CYRILLIC'
blocks[block] += 1
if len(blocks) > 1:
print(f"[!] Mixed scripts in {field}: {dict(blocks)} -> {value}")

2. Punycode Normalisation (Domains)

Majina ya Kikoa ya Kimataifa (IDNs) yanakodishwa kwa punycode (xn--). Kubadilisha kila jina la mwenyeji kuwa punycode na kisha kurudi kwenye Unicode kunaruhusu kulinganisha dhidi ya orodha ya kibali au kufanya ukaguzi wa kufanana (kwa mfano, umbali wa Levenshtein) baada ya mfuatano kuwa wa kawaida.

python
import idna
hostname = "Ρаypal.com"   # Greek Rho + Cyrillic a
puny = idna.encode(hostname).decode()
print(puny)  # xn--yl8hpyal.com

3. Kamusi za Homoglyph / Algorithms

Tools such as dnstwist (--homoglyph) or urlcrazy can enumerate visually-similar domain permutations and are useful for proactive takedown / monitoring.

Kuzuia & Kupunguza

  • Tekeleza sera kali za DMARC/DKIM/SPF – zuia uongo kutoka kwa maeneo yasiyoidhinishwa.
  • Tekeleza mantiki ya kugundua hapo juu katika Secure Email Gateways na SIEM/XSOAR playbooks.
  • Flag au karantini ujumbe ambapo jina la kuonyesha domain ≠ domain ya mtumaji.
  • Elimisha watumiaji: nakala-paste maandiko ya shaka kwenye mkaguzi wa Unicode, piga juu ya viungo, kamwe usiamini URL shorteners.

Mifano ya Uhalisia

  • Jina la kuonyesha: Сonfidеntiаl Ꭲiꮯkеt (Cyrillic С, е, а; Cherokee ; Latin small capital ).
  • Mnyororo wa domain: bestseoservices.com ➜ municipal /templates directory ➜ kig.skyvaulyt.ru ➜ fake Microsoft login at mlcorsftpsswddprotcct.approaches.it.com protected by custom OTP CAPTCHA.
  • Spotify impersonation: Sρօtifւ sender with link hidden behind redirects.ca.

These samples originate from Unit 42 research (July 2025) and illustrate how homograph abuse is combined with URL redirection and CAPTCHA evasion to bypass automated analysis.

Marejeo

tip

Jifunze na fanya mazoezi ya AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Jifunze na fanya mazoezi ya GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE) Jifunze na fanya mazoezi ya Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks