Based on the text you provided, here is a comprehensive summary and analysis of the Copy Fail (CVE-2026-31431) exploit, how it interacts with Podman rootless containers, and the specific mitigation strategies discussed.
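- Nature: A local privilege escalation (LPE) vulnerability rather than Remote Code Execution (RCE): the attacker must already be able to run code as an unprivileged user inside the container.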
- Mechanism: It exploits a way to write to the Linux page cache and leverage the `su` command.
- The Trigger: The script allows an unprivileged user inside a container to execute `su` to obtain a root shell inside the container without entering a password prompt.
- Impact:
  - In a standard setup, `su` as a non-root user prompts for a password.
  - With this exploit, `su` succeeds immediately, granting `uid=0` (root) inside the container namespace.
  - This allows an attacker to escalate privileges within the container and potentially use the container as a pivot point for attacks on the host.
The text walks through a series of tests using a custom copyfail container image containing python3, curl, and an HTTP server. The goal is to see how different security configurations stop the exploit from escalating to full host compromise.
- Setup: Running as root inside the container.
- Result: The exploit works as expected (no password needed for `su`).
- Limitation: Since the container process is already root, gaining another root shell adds no new privileges. Capabilities and file system access remain unchanged.

- Setup: Running as `foo` (user id 1002) inside the container.
- Result: The exploit successfully escalates to `uid=0` inside the container.
- Implication: The attacker now has a root shell in the container. However, because the container is rootless, the container process on the host still runs as the unprivileged user `bar`. The attacker cannot access host files (e.g., `/test/root.txt`) without specific host privileges.
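
The rootless result above follows from user-namespace ID mapping: `uid=0` inside the container is only an alias for an unprivileged UID on the host. The sketch below illustrates this by translating a container UID through text in the format of `/proc/self/uid_map`. The sample mapping is hypothetical: it assumes the host user `bar` has UID 1000 and a subordinate range starting at 100000, neither of which is stated in the source.

```python
from typing import Optional


def to_host_uid(uid_map_text: str, container_uid: int) -> Optional[int]:
    """Translate a container UID to a host UID using uid_map-style text.

    Each line holds three integers: first UID inside the namespace,
    first UID outside the namespace, and the length of the range.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside_start, outside_start, length = (int(f) for f in fields)
        if inside_start <= container_uid < inside_start + length:
            return outside_start + (container_uid - inside_start)
    return None  # UID is not mapped in this namespace


# Hypothetical mapping: host user "bar" assumed to be UID 1000,
# with a subordinate UID range assumed to start at 100000.
SAMPLE_UID_MAP = """\
         0       1000          1
         1     100000      65536
"""

if __name__ == "__main__":
    # Container root maps back to the unprivileged host user.
    print("container uid 0    -> host uid", to_host_uid(SAMPLE_UID_MAP, 0))
    # The container user foo (1002) lands in the subordinate range.
    print("container uid 1002 -> host uid", to_host_uid(SAMPLE_UID_MAP, 1002))
```

Under these assumed values, a root shell obtained through the exploit still acts on the host with the permissions of UID 1000, which is why host files owned by real root stay out of reach.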
The text outlines four layers of defense-in-depth to limit the "blast radius" of this exploit.
- Command: `podman run ... --security-opt=no-new-privileges`
- Effect: Prevents the container process from gaining new privileges (like becoming root) via setuid binaries or the `su` command.
- Outcome: Running the exploit results in the process remaining `uid=1002 (foo)`. The attacker gains a shell, but it is still the unprivileged user. They are stuck with whatever permissions `foo` had originally.

- Command: `podman run ... --cap-drop=all`
- Effect: Strips all Linux capabilities from the container process.
- Outcome: Similar to the previous test, the exploit fails to elevate privileges. The process remains `foo` with no capabilities.
- Note: The text suggests combining this with `--security-opt=no-new-privileges` for maximum safety.

- Command: `podman run ... --read-only --read-only-tmpfs=false`
- Effect: Mounts the entire container filesystem as read-only. Mounted volumes (like `/test`) remain writable, but the system directories cannot be modified.
- Outcome: Prevents the attacker from writing logs, modifying binaries, or creating new files to stage further attacks.
- Caveat: Many standard images (like `ubuntu`) expect write access to `/tmp` or `/var/lib`. To make a container completely immutable, you must use a distroless image or ensure the application doesn't need to write to the container filesystem.

- Resource Limits: Using cgroups to limit CPU, memory, and PIDs to contain the impact if the exploit is triggered.
- Minimal Images:
  - Using slim variants (e.g., `debian-slim`) or `alpine`.
  - Using distroless or scratch images (no shell, no package manager, no `curl`).
- Outcome: Even if the exploit runs, it has fewer tools to leverage for lateral movement or persistence within the container.
- Tool: `iptables` or `nftables`.
- Effect: Restricting traffic to only established connections or specific ports needed for the application.
- Outcome: Prevents the container from acting as a pivot to attack other internal services.
- Rootless is not Immune: Simply running a container without root privileges does not prevent the Copy Fail exploit. The vulnerability allows an unprivileged user to become `root` inside the container namespace.
- Host Isolation is Key: Because rootless containers run as an unprivileged user on the host, even if the container process becomes `root`, it generally cannot read host files or execute host commands unless the user has specific host access.
- Defense in Depth: No single flag stops the exploit completely. However, combining `no-new-privileges`, `cap-drop=all`, and a read-only filesystem effectively neutralizes the threat by ensuring the attacker remains an unprivileged user with no capabilities and a read-only environment.
- Kernel Patching: The text emphasizes that these mitigations limit the damage but do not fix the underlying bug. Patching the kernel to a version that fixes the underlying page cache vulnerability is the most critical step.
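
One way to confirm from inside a running container that these flags actually took effect is to read the `Uid`, `CapEff`, and `NoNewPrivs` fields of `/proc/self/status`. The sketch below parses text in that format. The function name and the sample status text are illustrative assumptions, not output captured from the tests described above.

```python
def hardening_state(status_text: str) -> dict:
    """Extract the effective UID, capability mask, and no_new_privs flag
    from text formatted like /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        # "Uid:" lists real, effective, saved, and filesystem UIDs in that order.
        "effective_uid": int(fields["Uid"].split()[1]),
        # "CapEff:" is a hexadecimal bitmask; 0 means no effective capabilities.
        "effective_caps": int(fields["CapEff"], 16),
        # "NoNewPrivs:" is 1 when the no_new_privs bit is set for the process.
        "no_new_privs": fields.get("NoNewPrivs") == "1",
    }


# Hypothetical status excerpt for user foo (UID 1002) in a hardened container.
SAMPLE_STATUS = (
    "Name:\tsh\n"
    "Uid:\t1002\t1002\t1002\t1002\n"
    "CapEff:\t0000000000000000\n"
    "NoNewPrivs:\t1\n"
)

if __name__ == "__main__":
    # Inside a real container, the input would instead come from:
    #   open("/proc/self/status").read()
    print(hardening_state(SAMPLE_STATUS))
```

A non-zero `effective_uid`, a zero `effective_caps` mask, and `no_new_privs` set to true together correspond to the "unprivileged user with no capabilities" state described in the takeaways.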
Final Recommendation: For production environments, use a combination of:
* A patched kernel.
* --security-opt=no-new-privileges
* --cap-drop=all (or specific capability dropping)
* --read-only where possible.
* Minimal/Distroless base images.
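
As a rough sketch of how the recommended flags could be combined, the helper below assembles a `podman run` argument list. The function name, the resource-limit values, and the use of `copyfail` as the image name are assumptions for illustration. The limits are applied through Podman's `--memory`, `--cpus`, and `--pids-limit` options, which in rootless mode typically depend on cgroups v2 delegation being available.

```python
import shlex
from typing import List, Sequence


def hardened_podman_args(
    image: str,
    command: Sequence[str] = (),
    read_only: bool = True,
    memory: str = "256m",   # assumed limit, tune per workload
    cpus: str = "1",        # assumed limit, tune per workload
    pids_limit: int = 100,  # assumed limit, tune per workload
) -> List[str]:
    """Build a `podman run` argument list with the hardening flags applied."""
    args = [
        "podman", "run",
        "--security-opt=no-new-privileges",  # block privilege gain via setuid binaries
        "--cap-drop=all",                    # strip every Linux capability
    ]
    if read_only:
        args.append("--read-only")           # read-only root filesystem
    args += [
        f"--memory={memory}",
        f"--cpus={cpus}",
        f"--pids-limit={pids_limit}",
    ]
    args.append(image)
    args.extend(command)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(hardened_podman_args("copyfail", ["id"])))
```

Returning a list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting issues. Applications that need scratch space may require `read_only=False` or an explicit writable volume.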