Bins & Memory Allocations

Reading time: 27 minutes

tip

学习和实践 AWS 黑客技术：HackTricks Training AWS Red Team Expert (ARTE)
学习和实践 GCP 黑客技术：HackTricks Training GCP Red Team Expert (GRTE) 学习和实践 Azure 黑客技术：HackTricks Training Azure Red Team Expert (AzRTE)

支持 HackTricks

查看 订阅计划!
加入 💬 Discord 群组 或 Telegram 群组 或在 Twitter 🐦 上关注我们 @hacktricks_live.
通过向 HackTricks 和 HackTricks Cloud GitHub 仓库提交 PR 来分享黑客技巧。

基本信息

为了提高块存储的效率，每个块不仅在一个链表中，而是有几种类型。这些是 bins，有 5 种类型的 bins：62 小 bins，63 大 bins，1 个未排序 bin，10 个快速 bins 和每个线程 64 个 tcache bins。

每个未排序、小型和大型 bins 的初始地址在同一个数组内。索引 0 未使用，1 是未排序 bin，bins 2-64 是小 bins，bins 65-127 是大 bins。

Tcache（每线程缓存）Bins

尽管线程尝试拥有自己的堆（参见 Arenas 和 Subheaps），但有可能一个有很多线程的进程（如 web 服务器）会与其他线程共享堆。在这种情况下，主要解决方案是使用锁，这可能会 显著减慢线程的速度。

因此，tcache 类似于每个线程的快速 bin，因为它是一个 单链表，不合并块。每个线程有 64 个单链表 tcache bins。每个 bin 最多可以有 7 个相同大小的块，大小范围为 24 到 1032B 在 64 位系统上和 12 到 516B 在 32 位系统上。

当一个线程释放一个块时，如果它不太大以至于无法在 tcache 中分配，并且相应的 tcache bin 没有满（已经有 7 个块），它将被分配到那里。如果无法进入 tcache，它将需要等待堆锁才能在全局范围内执行释放操作。

当 分配一个块时，如果在 Tcache 中有一个所需大小的空闲块，它将使用它，如果没有，它将需要等待堆锁才能在全局 bins 中找到一个或创建一个新的。
还有一个优化，在这种情况下，当拥有堆锁时，线程 将用请求大小的堆块填充他的 Tcache（7 个），以便在需要更多时，可以在 Tcache 中找到它们。

添加一个 tcache 块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunk;
chunk = malloc(24);
printf("Address of the chunk: %p\n", (void *)chunk);
gets(chunk);
free(chunk);
return 0;
}

将其编译并在主函数的 ret 操作码处设置断点进行调试。然后使用 gef，您可以看到正在使用的 tcache bin：

bash

gef➤  heap bins
──────────────────────────────────────────────────────────────────────────────── Tcachebins for thread 1 ────────────────────────────────────────────────────────────────────────────────
Tcachebins[idx=0, size=0x20, count=1] ←  Chunk(addr=0xaaaaaaac12a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

Tcache 结构和函数

在以下代码中，可以看到 max bins 和 chunks per index，为了避免双重释放而创建的 tcache_entry 结构，以及每个线程用来存储每个 bin 索引地址的 tcache_perthread_struct 结构。

tcache_entry 和 tcache_perthread_struct

// From https://github.com/bminor/glibc/blob/f942a732d37a96217ef828116ebe64a644db18d7/malloc/malloc.c

/* We want 64 entries.  This is an arbitrary limit, which tunables can reduce.  */
# define TCACHE_MAX_BINS		64
# define MAX_TCACHE_SIZE	tidx2usize (TCACHE_MAX_BINS-1)

/* Only used to pre-fill the tunables.  */
# define tidx2usize(idx)	(((size_t) idx) * MALLOC_ALIGNMENT + MINSIZE - SIZE_SZ)

/* When "x" is from chunksize().  */
# define csize2tidx(x) (((x) - MINSIZE + MALLOC_ALIGNMENT - 1) / MALLOC_ALIGNMENT)
/* When "x" is a user-provided size.  */
# define usize2tidx(x) csize2tidx (request2size (x))

/* With rounding and alignment, the bins are...
idx 0   bytes 0..24 (64-bit) or 0..12 (32-bit)
idx 1   bytes 25..40 or 13..20
idx 2   bytes 41..56 or 21..28
etc.  */

/* This is another arbitrary limit, which tunables can change.  Each
tcache bin will hold at most this number of chunks.  */
# define TCACHE_FILL_COUNT 7

/* Maximum chunks in tcache bins for tunables.  This value must fit the range
of tcache->counts[] entries, else they may overflow.  */
# define MAX_TCACHE_COUNT UINT16_MAX

[...]

typedef struct tcache_entry
{
struct tcache_entry *next;
/* This field exists to detect double frees.  */
uintptr_t key;
} tcache_entry;

/* There is one of these for each thread, which contains the
per-thread cache (hence "tcache_perthread_struct").  Keeping
overall size low is mildly important.  Note that COUNTS and ENTRIES
are redundant (we could have just counted the linked list each
time), this is for performance reasons.  */
typedef struct tcache_perthread_struct
{
uint16_t counts[TCACHE_MAX_BINS];
tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;

函数 __tcache_init 是创建和分配 tcache_perthread_struct 对象空间的函数。

tcache_init 代码

// From https://github.com/bminor/glibc/blob/f942a732d37a96217ef828116ebe64a644db18d7/malloc/malloc.c#L3241C1-L3274C2

static void
tcache_init(void)
{
mstate ar_ptr;
void *victim = 0;
const size_t bytes = sizeof (tcache_perthread_struct);

if (tcache_shutting_down)
return;

arena_get (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
if (!victim && ar_ptr != NULL)
{
ar_ptr = arena_get_retry (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
}


if (ar_ptr != NULL)
__libc_lock_unlock (ar_ptr->mutex);

/* In a low memory situation, we may not be able to allocate memory
- in which case, we just keep trying later.  However, we
typically do this very early, so either there is sufficient
memory, or there isn't enough memory to do non-trivial
allocations anyway.  */
if (victim)
{
tcache = (tcache_perthread_struct *) victim;
memset (tcache, 0, sizeof (tcache_perthread_struct));
}

}

Tcache 索引

Tcache 有几个 bins，具体取决于大小和指向 每个索引第一个块的初始指针以及每个索引的块数量位于一个块内部。这意味着通过定位包含这些信息的块（通常是第一个），可以找到所有 tcache 初始点和 Tcache 块的数量。

快速 bins

快速 bins 旨在 加速小块的内存分配，通过将最近释放的块保存在快速访问结构中。这些 bins 使用后进先出（LIFO）方法，这意味着 最近释放的块是第一个 在有新的分配请求时被重用。这种行为在速度上是有利的，因为从栈顶（LIFO）插入和移除比从队列（FIFO）更快。

此外，快速 bins 使用单链表，而不是双链表，这进一步提高了速度。由于快速 bins 中的块不会与邻居合并，因此不需要复杂的结构来允许从中间移除。单链表在这些操作中更简单、更快。

基本上，这里发生的事情是，头部（指向第一个块的指针）始终指向该大小的最新释放块。因此：

当分配一个该大小的新块时，头部指向一个可用的空闲块。由于这个空闲块指向下一个可用块，这个地址被存储在头部，以便下一个分配知道在哪里获取可用块
当一个块被释放时，空闲块将保存当前可用块的地址，而这个新释放块的地址将放入头部

链表的最大大小为 0x80，它们的组织方式是，大小为 0x20 的块将位于索引 0，大小为 0x30 的块将位于索引 1...

caution

快速 bins 中的块未设置为可用，因此它们会在一段时间内保持为快速 bin 块，而不是能够与周围的其他空闲块合并。

// From https://github.com/bminor/glibc/blob/a07e000e82cb71238259e674529c37c12dc7d423/malloc/malloc.c#L1711

/*
Fastbins

An array of lists holding recently freed small chunks.  Fastbins
are not doubly linked.  It is faster to single-link them, and
since chunks are never removed from the middles of these lists,
double linking is not necessary. Also, unlike regular bins, they
are not even processed in FIFO order (they use faster LIFO) since
ordering doesn't much matter in the transient contexts in which
fastbins are normally used.

Chunks in fastbins keep their inuse bit set, so they cannot
be consolidated with other free chunks. malloc_consolidate
releases all chunks in fastbins and consolidates them with
other free chunks.
*/

typedef struct malloc_chunk *mfastbinptr;
#define fastbin(ar_ptr, idx) ((ar_ptr)->fastbinsY[idx])

/* offset 2 to use otherwise unindexable first 2 bins */
#define fastbin_index(sz) \
((((unsigned int) (sz)) >> (SIZE_SZ == 8 ? 4 : 3)) - 2)


/* The maximum fastbin request size we support */
#define MAX_FAST_SIZE     (80 * SIZE_SZ / 4)

#define NFASTBINS  (fastbin_index (request2size (MAX_FAST_SIZE)) + 1)

添加一个 fastbin 块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunks[8];
int i;

// Loop to allocate memory 8 times
for (i = 0; i < 8; i++) {
chunks[i] = malloc(24);
if (chunks[i] == NULL) { // Check if malloc failed
fprintf(stderr, "Memory allocation failed at iteration %d\n", i);
return 1;
}
printf("Address of chunk %d: %p\n", i, (void *)chunks[i]);
}

// Loop to free the allocated memory
for (i = 0; i < 8; i++) {
free(chunks[i]);
}

return 0;
}

注意我们如何分配和释放8个相同大小的块，以便它们填满tcache，第八个块存储在快速块中。

编译它并在main函数的ret操作码处设置断点进行调试。然后使用gef，你可以看到tcache bin已满，一个块在快速bin中：

bash

gef➤  heap bins
──────────────────────────────────────────────────────────────────────────────── Tcachebins for thread 1 ────────────────────────────────────────────────────────────────────────────────
Tcachebins[idx=0, size=0x20, count=7] ←  Chunk(addr=0xaaaaaaac1770, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1750, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1730, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1710, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac16f0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac16d0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac12a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
───────────────────────────────────────────────────────────────────────── Fastbins for arena at 0xfffff7f90b00 ─────────────────────────────────────────────────────────────────────────
Fastbins[idx=0, size=0x20]  ←  Chunk(addr=0xaaaaaaac1790, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
Fastbins[idx=1, size=0x30] 0x00

未排序的堆

未排序的堆是一个缓存，由堆管理器用于加快内存分配。它的工作原理如下：当程序释放一个块时，如果这个块不能在tcache或快速堆中分配，并且没有与顶部块发生冲突，堆管理器不会立即将其放入特定的小或大堆中。相反，它首先尝试与任何相邻的空闲块合并，以创建一个更大的空闲内存块。然后，它将这个新块放入一个称为“未排序堆”的通用堆中。

当程序请求内存时，堆管理器检查未排序堆以查看是否有足够大小的块。如果找到一个，它会立即使用。如果在未排序堆中找不到合适的块，它会将此列表中的所有块移动到其对应的堆中，依据它们的大小分为小堆或大堆。

请注意，如果一个较大的块被分成两半，并且其余部分大于MINSIZE，它将被放回未排序堆中。

因此，未排序堆是一种通过快速重用最近释放的内存来加速内存分配的方法，从而减少耗时的搜索和合并的需要。

caution

请注意，即使块属于不同类别，如果一个可用块与另一个可用块发生冲突（即使它们最初属于不同的堆），它们也会被合并。

添加一个未排序块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunks[9];
int i;

// Loop to allocate memory 8 times
for (i = 0; i < 9; i++) {
chunks[i] = malloc(0x100);
if (chunks[i] == NULL) { // Check if malloc failed
fprintf(stderr, "Memory allocation failed at iteration %d\n", i);
return 1;
}
printf("Address of chunk %d: %p\n", i, (void *)chunks[i]);
}

// Loop to free the allocated memory
for (i = 0; i < 8; i++) {
free(chunks[i]);
}

return 0;
}

注意我们如何分配和释放9个相同大小的块，以便它们填充tcache，第八个块存储在未排序的bin中，因为它对于fastbin来说太大，而第九个块没有被释放，因此第九个和第八个不会与顶部块合并。

编译并在main函数的ret操作码处设置断点进行调试。然后使用gef，你可以看到tcache bin已满，一个块在未排序的bin中：

bash

gef➤  heap bins
──────────────────────────────────────────────────────────────────────────────── Tcachebins for thread 1 ────────────────────────────────────────────────────────────────────────────────
Tcachebins[idx=15, size=0x110, count=7] ←  Chunk(addr=0xaaaaaaac1d10, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1c00, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1af0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac19e0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac18d0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac17c0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac12a0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
───────────────────────────────────────────────────────────────────────── Fastbins for arena at 0xfffff7f90b00 ─────────────────────────────────────────────────────────────────────────
Fastbins[idx=0, size=0x20] 0x00
Fastbins[idx=1, size=0x30] 0x00
Fastbins[idx=2, size=0x40] 0x00
Fastbins[idx=3, size=0x50] 0x00
Fastbins[idx=4, size=0x60] 0x00
Fastbins[idx=5, size=0x70] 0x00
Fastbins[idx=6, size=0x80] 0x00
─────────────────────────────────────────────────────────────────────── Unsorted Bin for arena at 0xfffff7f90b00 ───────────────────────────────────────────────────────────────────────
[+] unsorted_bins[0]: fw=0xaaaaaaac1e10, bk=0xaaaaaaac1e10
→   Chunk(addr=0xaaaaaaac1e20, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[+] Found 1 chunks in unsorted bin.

小型桶

小型桶比大型桶快，但比快速桶慢。

62个桶中的每个桶将具有相同大小的块：16、24，...(在32位中最大为504字节，在64位中最大为1024字节)。这有助于加快查找应分配空间的桶以及在这些列表中插入和删除条目的速度。

小型桶的大小是根据桶的索引计算的：

最小大小：2*4*index（例如，索引5 -> 40）
最大大小：2*8*index（例如，索引5 -> 80）

// From https://github.com/bminor/glibc/blob/a07e000e82cb71238259e674529c37c12dc7d423/malloc/malloc.c#L1711
#define NSMALLBINS         64
#define SMALLBIN_WIDTH    MALLOC_ALIGNMENT
#define SMALLBIN_CORRECTION (MALLOC_ALIGNMENT > CHUNK_HDR_SZ)
#define MIN_LARGE_SIZE    ((NSMALLBINS - SMALLBIN_CORRECTION) * SMALLBIN_WIDTH)

#define in_smallbin_range(sz)  \
((unsigned long) (sz) < (unsigned long) MIN_LARGE_SIZE)

#define smallbin_index(sz) \
((SMALLBIN_WIDTH == 16 ? (((unsigned) (sz)) >> 4) : (((unsigned) (sz)) >> 3))\
+ SMALLBIN_CORRECTION)

选择小型和大型内存池的函数：

#define bin_index(sz) \
((in_smallbin_range (sz)) ? smallbin_index (sz) : largebin_index (sz))

添加一个小块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunks[10];
int i;

// Loop to allocate memory 8 times
for (i = 0; i < 9; i++) {
chunks[i] = malloc(0x100);
if (chunks[i] == NULL) { // Check if malloc failed
fprintf(stderr, "Memory allocation failed at iteration %d\n", i);
return 1;
}
printf("Address of chunk %d: %p\n", i, (void *)chunks[i]);
}

// Loop to free the allocated memory
for (i = 0; i < 8; i++) {
free(chunks[i]);
}

chunks[9] = malloc(0x110);

return 0;
}

注意我们如何分配和释放9个相同大小的块，以便它们填满tcache，而第八个块存储在未排序的bin中，因为它对于fastbin来说太大，而第九个块没有被释放，因此第九个和第八个块不会与顶部块合并。然后我们分配一个更大的块0x110，这使得未排序bin中的块进入小bin。

编译并在main函数的ret操作码处设置断点进行调试。然后使用gef，你可以看到tcache bin已满，且一个块在小bin中：

bash

gef➤  heap bins
──────────────────────────────────────────────────────────────────────────────── Tcachebins for thread 1 ────────────────────────────────────────────────────────────────────────────────
Tcachebins[idx=15, size=0x110, count=7] ←  Chunk(addr=0xaaaaaaac1d10, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1c00, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac1af0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac19e0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac18d0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac17c0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0xaaaaaaac12a0, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
───────────────────────────────────────────────────────────────────────── Fastbins for arena at 0xfffff7f90b00 ─────────────────────────────────────────────────────────────────────────
Fastbins[idx=0, size=0x20] 0x00
Fastbins[idx=1, size=0x30] 0x00
Fastbins[idx=2, size=0x40] 0x00
Fastbins[idx=3, size=0x50] 0x00
Fastbins[idx=4, size=0x60] 0x00
Fastbins[idx=5, size=0x70] 0x00
Fastbins[idx=6, size=0x80] 0x00
─────────────────────────────────────────────────────────────────────── Unsorted Bin for arena at 0xfffff7f90b00 ───────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in unsorted bin.
──────────────────────────────────────────────────────────────────────── Small Bins for arena at 0xfffff7f90b00 ────────────────────────────────────────────────────────────────────────
[+] small_bins[16]: fw=0xaaaaaaac1e10, bk=0xaaaaaaac1e10
→   Chunk(addr=0xaaaaaaac1e20, size=0x110, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[+] Found 1 chunks in 1 small non-empty bins.

大型内存块

与管理固定大小块的小型内存块不同，每个大型内存块处理一系列块大小。这更灵活，允许系统容纳各种大小，而无需为每种大小单独设置一个内存块。

在内存分配器中，大型内存块从小型内存块结束的地方开始。大型内存块的范围逐渐增大，这意味着第一个内存块可能覆盖从512到576字节的块，而下一个则覆盖576到640字节。这个模式持续下去，最大的内存块包含所有超过1MB的块。

与小型内存块相比，大型内存块的操作速度较慢，因为它们必须对不同块大小的列表进行排序和搜索，以找到最佳适配进行分配。当一个块被插入到大型内存块时，它必须被排序，而当内存被分配时，系统必须找到合适的块。这额外的工作使它们速度较慢，但由于大型分配比小型分配少，因此这是一个可以接受的权衡。

有：

32个64B范围的内存块（与小型内存块冲突）
16个512B范围的内存块（与小型内存块冲突）
8个4096B范围的内存块（部分与小型内存块冲突）
4个32768B范围的内存块
2个262144B范围的内存块
1个用于剩余大小的内存块

大型内存块大小代码

// From https://github.com/bminor/glibc/blob/a07e000e82cb71238259e674529c37c12dc7d423/malloc/malloc.c#L1711

#define largebin_index_32(sz)                                                \
(((((unsigned long) (sz)) >> 6) <= 38) ?  56 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ?  91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

#define largebin_index_32_big(sz)                                            \
(((((unsigned long) (sz)) >> 6) <= 45) ?  49 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ?  91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

// XXX It remains to be seen whether it is good to keep the widths of
// XXX the buckets the same or whether it should be scaled by a factor
// XXX of two as well.
#define largebin_index_64(sz)                                                \
(((((unsigned long) (sz)) >> 6) <= 48) ?  48 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ?  91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

#define largebin_index(sz) \
(SIZE_SZ == 8 ? largebin_index_64 (sz)                                     \
: MALLOC_ALIGNMENT == 16 ? largebin_index_32_big (sz)                     \
: largebin_index_32 (sz))

添加一个大块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunks[2];

chunks[0] = malloc(0x1500);
chunks[1] = malloc(0x1500);
free(chunks[0]);
chunks[0] = malloc(0x2000);

return 0;
}

2 个大分配被执行，然后一个被释放（将其放入未排序的桶中），并且进行更大的分配（将释放的从未排序的桶移动到大桶中）。

编译并在 main 函数的 ret 操作码处设置断点进行调试。然后使用 gef，你可以看到 tcache 桶已满，并且一个块在大桶中：

bash

gef➤  heap bin
──────────────────────────────────────────────────────────────────────────────── Tcachebins for thread 1 ────────────────────────────────────────────────────────────────────────────────
All tcachebins are empty
───────────────────────────────────────────────────────────────────────── Fastbins for arena at 0xfffff7f90b00 ─────────────────────────────────────────────────────────────────────────
Fastbins[idx=0, size=0x20] 0x00
Fastbins[idx=1, size=0x30] 0x00
Fastbins[idx=2, size=0x40] 0x00
Fastbins[idx=3, size=0x50] 0x00
Fastbins[idx=4, size=0x60] 0x00
Fastbins[idx=5, size=0x70] 0x00
Fastbins[idx=6, size=0x80] 0x00
─────────────────────────────────────────────────────────────────────── Unsorted Bin for arena at 0xfffff7f90b00 ───────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in unsorted bin.
──────────────────────────────────────────────────────────────────────── Small Bins for arena at 0xfffff7f90b00 ────────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in 0 small non-empty bins.
──────────────────────────────────────────────────────────────────────── Large Bins for arena at 0xfffff7f90b00 ────────────────────────────────────────────────────────────────────────
[+] large_bins[100]: fw=0xaaaaaaac1290, bk=0xaaaaaaac1290
→   Chunk(addr=0xaaaaaaac12a0, size=0x1510, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[+] Found 1 chunks in 1 large non-empty bins.

顶部块

// From https://github.com/bminor/glibc/blob/a07e000e82cb71238259e674529c37c12dc7d423/malloc/malloc.c#L1711

/*
Top

The top-most available chunk (i.e., the one bordering the end of
available memory) is treated specially. It is never included in
any bin, is used only if no other chunk is available, and is
released back to the system if it is very large (see
M_TRIM_THRESHOLD).  Because top initially
points to its own bin with initial zero size, thus forcing
extension on the first malloc request, we avoid having any special
code in malloc to check whether it even exists yet. But we still
need to do so when getting memory from system, so we make
initial_top treat the bin as a legal but unusable chunk during the
interval between initialization and the first call to
sysmalloc. (This is somewhat delicate, since it relies on
the 2 preceding words to be zero during this interval as well.)
*/

/* Conveniently, the unsorted bin can be used as dummy top on first call */
#define initial_top(M)              (unsorted_chunks (M))

基本上，这是一个包含所有当前可用堆的块。当执行 malloc 时，如果没有可用的空闲块可用，这个顶块将会减少其大小以提供必要的空间。
指向顶块的指针存储在 malloc_state 结构中。

此外，在开始时，可以将未排序块用作顶块。

观察顶块示例

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
char *chunk;
chunk = malloc(24);
printf("Address of the chunk: %p\n", (void *)chunk);
gets(chunk);
return 0;
}

在编译并在 main 的 ret 操作码处调试时，我看到 malloc 返回了地址 0xaaaaaaac12a0，这些是块：

bash

gef➤  heap chunks
Chunk(addr=0xaaaaaaac1010, size=0x290, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[0x0000aaaaaaac1010     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................]
Chunk(addr=0xaaaaaaac12a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[0x0000aaaaaaac12a0     41 41 41 41 41 41 41 00 00 00 00 00 00 00 00 00    AAAAAAA.........]
Chunk(addr=0xaaaaaaac12c0, size=0x410, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[0x0000aaaaaaac12c0     41 64 64 72 65 73 73 20 6f 66 20 74 68 65 20 63    Address of the c]
Chunk(addr=0xaaaaaaac16d0, size=0x410, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
[0x0000aaaaaaac16d0     41 41 41 41 41 41 41 0a 00 00 00 00 00 00 00 00    AAAAAAA.........]
Chunk(addr=0xaaaaaaac1ae0, size=0x20530, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  top chunk

可以看到顶部块位于地址 0xaaaaaaac1ae0。这并不奇怪，因为最后分配的块在 0xaaaaaaac12a0，大小为 0x410，并且 0xaaaaaaac12a0 + 0x410 = 0xaaaaaaac1ae0。
还可以在其块头中看到顶部块的长度：

bash

gef➤  x/8wx 0xaaaaaaac1ae0 - 16
0xaaaaaaac1ad0:	0x00000000	0x00000000	0x00020531	0x00000000
0xaaaaaaac1ae0:	0x00000000	0x00000000	0x00000000	0x00000000

查看 订阅计划!
加入 💬 Discord 群组 或 Telegram 群组 或在 Twitter 🐦 上关注我们 @hacktricks_live.
通过向 HackTricks 和 HackTricks Cloud GitHub 仓库提交 PR 来分享黑客技巧。

HackTricks

Bins & Memory Allocations

基本信息

Tcache（每线程缓存）Bins

Tcache 结构和函数

Tcache 索引

快速 bins

未排序的堆

小型桶

大型内存块

顶部块

最后剩余

分配流程

释放流程

堆函数安全检查

参考文献