Adreno A7xx SDS->RB privilege bypass (GPU SMMU takeover to Kernel R/W)

This page abstracts an in-the-wild Adreno A7xx microcode logic bug (CVE-2025-21479) into reproducible exploitation techniques: abusing IB-level masking in Set Draw State (SDS) to execute privileged GPU packets from an unprivileged app, pivoting to a GPU SMMU takeover, and then to fast, stable kernel R/W via a dirty-pagetable trick.

Affected: Qualcomm Adreno A7xx GPU firmware prior to a microcode fix that changed masking of register $12 from 0x3 to 0x7.
Primitive: Execute privileged CP packets (e.g., CP_SMMU_TABLE_UPDATE) from SDS, which is user-controlled.
Outcome: Arbitrary physical/virtual kernel memory R/W, SELinux disable, root.
Prereq: Ability to create a KGSL GPU context and submit command buffers that enter SDS (normal app capability).

Background: IB levels, SDS and the $12 mask

The kernel maintains a ringbuffer (RB = IB0). Userspace submits IB1 via CP_INDIRECT_BUFFER, which can chain to IB2/IB3.
SDS is a special command stream entered via CP_SET_DRAW_STATE:
A6xx: SDS is treated as IB3
A7xx: SDS moved to IB4
The microcode tracks the current IB level in register $12 and gates privileged packets so that they are accepted only when the effective level corresponds to IB0 (the kernel RB).
Bug: A7xx microcode kept masking $12 with 0x3 (2 bits) instead of 0x7 (3 bits). Since IB4 & 0x3 == 0, SDS was misidentified as IB0, allowing privileged packets from user-controlled SDS.

Why it matters:

A6XX                | A7XX
RB  & 3       == 0  |  RB  & 3       == 0
IB1 & 3       == 1  |  IB1 & 3       == 1
IB2 & 3       == 2  |  IB2 & 3       == 2
IB3 (SDS) & 3 == 3  |  IB3 & 3       == 3
                    |  IB4 (SDS) & 3 == 0   <-- misread as IB0 if mask is 0x3
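
The table above can be reproduced with a few lines of C. This is only a model of the masking arithmetic, not the microcode itself: the enum names and the looks_like_rb helper are illustrative assumptions, and the only facts taken from the text are the level numbers (RB = 0, SDS = 4 on A7xx) and the two mask values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; level numbers follow the table (RB = 0, A7xx SDS = 4). */
enum ib_level { IB_RB = 0, IB_1 = 1, IB_2 = 2, IB_3 = 3, IB_4_SDS = 4 };

/*
 * Models the privilege gate: a packet is treated as coming from the kernel
 * ringbuffer only when the masked level is 0.
 * mask = 0x3 models the vulnerable check, mask = 0x7 the patched one.
 */
static bool looks_like_rb(uint32_t level, uint32_t mask)
{
    return (level & mask) == 0;
}

/* Compile-time restatement of the bug: a 2-bit mask cannot represent level 4. */
static_assert((IB_4_SDS & 0x3) == 0, "IB4 aliases IB0 under a 2-bit mask");
static_assert((IB_4_SDS & 0x7) == 4, "IB4 stays distinct under a 3-bit mask");
```

With the 2-bit mask, looks_like_rb(IB_4_SDS, 0x3) is true, which is the misidentification described above; with the 3-bit mask it is false, and only IB_RB passes.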

Example microcode diff (the patch changed the mask to 0x7):

@@ CP_SMMU_TABLE_UPDATE
- and $02, $12, 0x3
+ and $02, $12, 0x7
@@ CP_FIXED_STRIDE_DRAW_TABLE
- and $02, $12, 0x3
+ and $02, $12, 0x7

Exploitation overview

Goal: From SDS (misread as IB0), issue privileged CP packets to repoint the GPU SMMU at attacker-crafted page tables, then use GPU copy/write packets for arbitrary physical R/W. Finally, pivot to a fast CPU-side R/W via a dirty pagetable.

High-level chain

Craft a fake GPU pagetable in shared memory
Enter SDS and execute:
CP_SMMU_TABLE_UPDATE -> switch to fake pagetable
CP_MEM_WRITE / CP_MEM_TO_MEM -> implement write/read primitives
CP_SET_DRAW_STATE with run-now flags (dispatch immediately)

GPU R/W primitives via fake pagetable

Write: CP_MEM_WRITE to an attacker-chosen GPU VA whose PTEs you map to a chosen PA -> arbitrary physical write
Read: CP_MEM_TO_MEM copies 4/8 bytes from target PA to a userspace-shared buffer (batch for larger reads)

Notes

Each Android process gets a KGSL context (IOCTL_KGSL_GPU_CONTEXT_CREATE). Switching contexts normally updates SMMU tables in the RB; the bug lets you do it in SDS.
Excessive GPU traffic can cause UI blackouts and reboots; reads are small (4/8B) and sync is slow by default.

Building the SDS command sequence

Spray a fake GPU pagetable into shared memory so at least one instance lands at a known physical address (e.g., via allocator grooming and repetition).
Construct an SDS buffer containing, in order:

CP_SMMU_TABLE_UPDATE to the physical address of the fake pagetable
One or more CP_MEM_WRITE and/or CP_MEM_TO_MEM packets to implement R/W using your new translations
CP_SET_DRAW_STATE with flags to run-now

The exact packet encodings vary by firmware; use freedreno’s afuc/packet docs to assemble the words, and ensure the SDS submission path is taken by the driver.

Finding the Samsung kernel physbase under physical KASLR

Samsung randomizes the kernel physical base within a known region on Snapdragon devices. Brute-force the expected range and look for the first 16 bytes of _stext.

Representative loop

while (!ctx->kernel.pbase) {
    offset += 0x8000;
    uint64_t d1 = kernel_physread_u64(ctx, base + offset);
    if (d1 != 0xd10203ffd503233f) continue;   // first 8 bytes of _stext
    uint64_t d2 = kernel_physread_u64(ctx, base + offset + 8);
    if (d2 == 0x910083fda9027bfd) {           // second 8 bytes of _stext
        ctx->kernel.pbase = base + offset - 0x10000;
        break;
    }
}

Once physbase is known, compute the kernel virtual address using the linear mapping:

_stext = 0xffffffc008000000 + (Kernel Code & ~0xa8000000)

Stabilizing to fast, reliable CPU-side kernel R/W (dirty pagetable)

GPU R/W is slow and limited to small accesses. Pivot to a fast, stable primitive by corrupting your own process's PTEs ("dirty pagetable"):

Steps

Locate the current task_struct -> mm_struct -> mm_struct->pgd using the slow GPU R/W primitives
mmap two adjacent userspace pages A and B (e.g., at 0x1000)
Walk PGD->PMD->PTE to resolve the physical addresses of the PTEs for A/B (helpers: get_pgd_offset, get_pmd_offset, get_pte_offset)
Overwrite B's PTE so that it points to the last-level pagetable that manages A/B, with RW attributes (phys_to_readwrite_pte)
Write via B's VA to mutate A's PTE and map the target PFNs; read/write kernel memory via A's VA, flushing the TLB until a sentinel changes

Example dirty-pagetable pivot snippet

uint64_t *map = mmap((void*)0x1000, PAGE_SIZE*2, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
uint64_t *page_map = (void*)((uint64_t)map + PAGE_SIZE);
page_map[0] = 0x4242424242424242;

uint64_t tsk = get_curr_task_struct(ctx);
uint64_t mm = kernel_vread_u64(ctx, tsk + OFFSETOF_TASK_STRUCT_MM);
uint64_t mm_pgd = kernel_vread_u64(ctx, mm + OFFSETOF_MM_PGD);

uint64_t pgd_off = get_pgd_offset((uint64_t)map);
uint64_t phys_pmd = kernel_vread_u64(ctx, mm_pgd + pgd_off) & ~((1<<12)-1);
uint64_t pmd_off = get_pmd_offset((uint64_t)map);
uint64_t phys_pte = kernel_pread_u64(ctx, phys_pmd + pmd_off) & ~((1<<12)-1);
uint64_t pte_off = get_pte_offset((uint64_t)map);
uint64_t pte_addr = phys_pte + pte_off;
uint64_t new_pte = phys_to_readwrite_pte(pte_addr);
kernel_write_u64(ctx, pte_addr + 8, new_pte, false);
while (page_map[0] == 0x4242424242424242) flush_tlb();
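
The snippet above calls get_pgd_offset, get_pmd_offset and get_pte_offset without defining them. A minimal sketch of what such helpers could look like follows, assuming an arm64 kernel with 4 KiB pages, a 39-bit VA space and three translation levels (matching the PGD->PMD->PTE walk in the steps), and assuming each helper returns a byte offset into its table. The shift values and the table_offset helper are assumptions for illustration; other kernel configurations use different geometry.

```c
#include <stdint.h>

/* Assumed geometry: 4 KiB tables holding 512 eight-byte descriptors. */
#define PTRS_PER_TABLE 512u
#define ENTRY_SIZE     8u

/* Byte offset of the descriptor that covers `va` in a table indexed from bit `shift`. */
static uint64_t table_offset(uint64_t va, unsigned shift)
{
    return ((va >> shift) & (PTRS_PER_TABLE - 1)) * ENTRY_SIZE;
}

/* Assumed 3-level layout: VA bits 38:30 -> PGD, 29:21 -> PMD, 20:12 -> PTE. */
static uint64_t get_pgd_offset(uint64_t va) { return table_offset(va, 30); }
static uint64_t get_pmd_offset(uint64_t va) { return table_offset(va, 21); }
static uint64_t get_pte_offset(uint64_t va) { return table_offset(va, 12); }
```

Under these assumptions, adjacent pages at 0x1000 and 0x2000 share the same PGD and PMD entries and have PTE offsets 8 and 16, which is consistent with the snippet addressing B's entry as pte_addr + 8.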

Detection and hardening

Firmware/microcode: fix every site that masks $12 to use 0x7 (A7xx) and audit the privileged-packet gates
Driver: validate the effective IB level for privileged packets and enforce per-context allowlists
Telemetry: alert if CP_SMMU_TABLE_UPDATE (or similar privileged opcodes) appears outside RB/IB0, especially in SDS; monitor for anomalous bursts of 4/8-byte CP_MEM_TO_MEM and excessive TLB flush patterns
Kernel: harden pagetable metadata and detect userspace PTE corruption patterns
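
The telemetry rule above can be sketched as a simple filter over a decoded command-stream trace. Everything here is hypothetical: the trace_rec layout, the trace_op identifiers (which are placeholders, not real packet encodings) and the assumption that a tracer records the full, unmasked IB level for each packet. It only illustrates the check "privileged opcode seen at a level other than 0".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder identifiers for a decoded trace; NOT real CP opcode values. */
enum trace_op { OP_OTHER = 0, OP_SMMU_TABLE_UPDATE, OP_MEM_WRITE, OP_MEM_TO_MEM };

/* Hypothetical trace record: full IB level (0 = RB, 4 = SDS on A7xx) plus opcode. */
struct trace_rec {
    uint32_t ib_level;
    enum trace_op op;
};

/* Only the SMMU table update is treated as privileged in this sketch. */
static bool is_privileged(enum trace_op op)
{
    return op == OP_SMMU_TABLE_UPDATE;
}

/* Counts privileged packets observed anywhere other than the kernel ringbuffer. */
static size_t count_privileged_outside_rb(const struct trace_rec *recs, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].ib_level != 0 && is_privileged(recs[i].op))
            hits++;
    }
    return hits;
}
```

The comparison uses the unmasked level on purpose: a detector that masked the level with 0x3 would repeat the original mistake and miss packets at level 4.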

Impact

A local app with GPU access can execute privileged GPU packets, hijack the GPU SMMU, obtain arbitrary physical/virtual kernel R/W, disable SELinux, and gain root on affected Snapdragon A7xx devices (e.g., Samsung S23). Severity: High (kernel compromise).
