The Linux Kernel Under Attack: The Copy Fail Vulnerability in Podman

This is a summary and analysis of the Copy Fail (CVE-2026-31431) exploit, how it interacts with Podman rootless containers, and the specific mitigation strategies discussed in the source text.

Nature: A local privilege escalation (LPE) vulnerability in the Linux kernel.
Mechanism: It abuses a flaw that allows writes to the Linux page cache and then leverages the setuid su command.
The Trigger: The exploit script lets an unprivileged user inside a container run su and obtain a root shell inside the container without being prompted for a password.
Impact:
- In a standard setup, su as a non-root user prompts for a password.
- With this exploit, su succeeds immediately, granting uid=0 (root) inside the container namespace.
- This allows an attacker to escalate privileges within the container and potentially use the container as a pivot point for attacks on the host.

The text walks through a series of tests using a custom copyfail container image containing python3, curl, and an HTTP server. The goal is to see how different security configurations stop the exploit from escalating to full host compromise.

Setup: Running as root inside the container.
Result: The exploit works as expected (no password needed for su).
Limitation: Since the container process is already root, gaining another root shell adds no new privileges. Capabilities and filesystem access remain unchanged.
Setup: Running as foo (user id 1002) inside the container.
Result: The exploit successfully escalates to uid=0 inside the container.
Implication: The attacker now has a root shell in the container. However, because the container is rootless, the container process on the host still runs as the unprivileged user bar. The attacker cannot access host files (e.g., /test/root.txt) unless bar itself has access to them.
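The reason a container root shell stays unprivileged on the host can be seen in the user-namespace ID mapping that the kernel exposes in /proc/self/uid_map. The Python sketch below parses such a map and translates container UIDs to host UIDs. The sample map is an assumption for illustration only: it supposes that the host user bar has UID 1001 and a subordinate UID range starting at 165536, neither of which comes from the source text.

```python
# Hypothetical uid_map as seen inside a rootless container.
# Columns: first ID inside the namespace, first ID outside, range length.
SAMPLE_UID_MAP = """\
         0       1001          1
         1     165536      65536
"""


def parse_uid_map(text):
    """Parse uid_map content into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(field) for field in fields))
    return ranges


def to_host_uid(container_uid, ranges):
    """Translate a UID inside the namespace to the UID the host sees."""
    for inside, outside, length in ranges:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None  # the UID is not mapped in this namespace


ranges = parse_uid_map(SAMPLE_UID_MAP)
# Container root maps to the unprivileged host user, not to host root.
print("container uid 0    -> host uid", to_host_uid(0, ranges))
# The container user foo (1002) lands in the subordinate UID range.
print("container uid 1002 -> host uid", to_host_uid(1002, ranges))
```

Inside a real container, the same functions could be fed the content of /proc/self/uid_map instead of the sample string.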

The text outlines four layers of defense-in-depth to limit the "blast radius" of this exploit.

Command: podman run ... --security-opt=no-new-privileges
Effect: Prevents the container process from gaining new privileges (like becoming root) via setuid binaries such as su.
Outcome: Running the exploit results in the process remaining uid=1002 (foo). The attacker gains a shell, but it is still the unprivileged user. They are stuck with whatever permissions foo had originally.
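Whether the flag is in effect can be checked from inside the container, because the kernel reports it in the NoNewPrivs field of /proc/<pid>/status. The following Python sketch parses a status dump and extracts the effective UID and the NoNewPrivs bit. The sample text is a hypothetical, shortened status file for the foo user and is not output captured from the tests described above.

```python
# Hypothetical, shortened /proc/self/status content for illustration.
SAMPLE_STATUS = "Name:\tsh\nUid:\t1002\t1002\t1002\t1002\nNoNewPrivs:\t1\n"


def parse_status(text):
    """Turn 'Key:<tab>value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Report the effective UID and whether no_new_privs is set."""
    # The Uid line lists real, effective, saved, and filesystem UIDs.
    uids = fields["Uid"].split()
    return {
        "effective_uid": int(uids[1]),
        "no_new_privs": fields.get("NoNewPrivs") == "1",
    }


print(summarize(parse_status(SAMPLE_STATUS)))
```

In a live container the input would be read from /proc/self/status. A result with a non-zero effective UID and no_new_privs set to True matches the outcome described above.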
Command: podman run ... --cap-drop=all
Effect: Strips all Linux capabilities from the container process.
Outcome: Similar to the previous test, the exploit fails to elevate privileges. The process remains foo with no capabilities.
Note: The text suggests combining this with --security-opt=no-new-privileges for maximum safety.
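One way to confirm that capabilities were really dropped is to decode the hexadecimal masks the kernel reports as CapEff and CapBnd in /proc/<pid>/status. The sketch below decodes such a mask in Python. The name table is deliberately partial (only a few well-known capability numbers from the Linux headers), and the sample masks are illustrative assumptions rather than values taken from the tests.

```python
# Partial table of Linux capability bit numbers; unknown bits are
# reported generically instead of being guessed.
KNOWN_CAPS = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Return the names of all capability bits set in a hex mask string."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:  # stop once no higher bits remain
        if mask & (1 << bit):
            names.append(KNOWN_CAPS.get(bit, f"bit_{bit}"))
        bit += 1
    return names


# An all-zero mask is what one would expect to see with --cap-drop=all.
print(decode_caps("0000000000000000"))
# A mask with bits 6 and 7 set would still allow changing GID and UID.
print(decode_caps("00000000000000c0"))
```

An empty list for both the effective and the bounding set indicates that even a process that reaches uid=0 would hold no capabilities.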
Command: podman run ... --read-only --read-only-tmpfs=false
Effect: Mounts the container's root filesystem read-only. Mounted volumes (like /test) remain writable, but the system directories cannot be modified.
Outcome: Prevents the attacker from writing logs, modifying binaries, or creating new files to stage further attacks.
Caveat: Many standard images (like ubuntu) expect write access to /tmp or /var/lib. To make a container completely immutable, you must use a distroless image or ensure the application doesn't need to write to the container filesystem.
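To find out which directories an application can still write to under a read-only configuration, a small probe can be run inside the container. The Python sketch below tries to create and delete a temporary file in each directory. The function name is_writable and the list of directories are illustrative choices, not part of the source text.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created in the directory."""
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError:
        # Covers read-only filesystems, missing directories,
        # and permission errors alike.
        return False
    os.close(fd)
    os.unlink(path)
    return True


if __name__ == "__main__":
    # Illustrative set of paths; adjust to what the application uses.
    for candidate in ("/tmp", "/var/lib", "/test"):
        print(candidate, "writable" if is_writable(candidate) else "not writable")
```

Running the probe once with and once without --read-only shows which paths the image depends on and therefore which ones would need a volume or tmpfs.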
Resource Limits: Using cgroups to limit CPU, Memory, and PIDs to contain the impact if the exploit is triggered.
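The source text does not say how the limits are configured. With Podman they would typically be set through run options such as --memory, --cpus, and --pids-limit, and on a cgroup v2 host the resulting values appear inside the container in files such as /sys/fs/cgroup/memory.max, pids.max, and cpu.max. The Python sketch below parses the content of those files to verify that a limit is really applied. The sample values are assumptions for illustration.

```python
def parse_limit(text):
    """Parse memory.max or pids.max content; 'max' means unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


def parse_cpu_max(text):
    """Parse cpu.max ('<quota> <period>') into a number of CPUs."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit configured
    return int(quota) / int(period)


# Illustrative file contents: 256 MiB of memory, 100 PIDs, half a CPU.
print(parse_limit("268435456\n"))
print(parse_limit("100\n"))
print(parse_cpu_max("50000 100000\n"))
# An unlimited container reports 'max' instead of a number.
print(parse_limit("max\n"))
```

A return value of None for any of the three files means that the corresponding resource is not constrained.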
Minimal Images:
- Using slim variants (e.g., debian-slim) or alpine.
- Using distroless or scratch images (no shell, no package manager, no curl).
Outcome: Even if the exploit runs, it has fewer tools to leverage for lateral movement or persistence within the container.
Tool: iptables or nftables.
Effect: Restricting traffic to only established connections or specific ports needed for the application.
Outcome: Prevents the container from acting as a pivot to attack other internal services.
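As a rough illustration of "established connections plus specific ports", the Python sketch below renders an nftables ruleset that drops outbound traffic by default, accepts replies on established connections, and allows new connections only to listed TCP ports. The table name container_filter and the port numbers are hypothetical, and where such a ruleset has to be loaded (host or container network namespace) depends on the network setup, which the source text does not specify.

```python
def render_ruleset(allowed_tcp_ports):
    """Render a default-drop nftables output chain as text."""
    ports = ", ".join(str(port) for port in sorted(allowed_tcp_ports))
    lines = [
        "table inet container_filter {",
        "    chain output {",
        "        type filter hook output priority 0; policy drop;",
        # Replies on connections that are already established.
        "        ct state established,related accept",
        # Loopback traffic stays inside the network namespace.
        '        oifname "lo" accept',
    ]
    if ports:
        # New outbound connections only to the listed TCP ports.
        lines.append(f"        tcp dport {{ {ports} }} accept")
    lines += ["    }", "}"]
    return "\n".join(lines)


print(render_ruleset([443, 53]))
```

The generated text is meant to be reviewed and adapted before it is passed to nft; it is a starting point rather than a complete firewall policy.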

Rootless is not Immune: Simply running a container without root privileges does not prevent the Copy Fail exploit. The vulnerability allows an unprivileged user to become root inside the container namespace.
Host Isolation is Key: Because rootless containers run as an unprivileged user on the host, even if the container process becomes root, it generally cannot read host files or execute host commands unless the user has specific host access.
Defense in Depth: No single flag stops the exploit completely. However, combining no-new-privileges, cap-drop=all, and a read-only filesystem effectively neutralizes the threat by ensuring the attacker remains an unprivileged user with no capabilities and a read-only environment.
Kernel Patching: The text emphasizes that these mitigations limit the damage but do not fix the underlying bug. Patching the kernel to a version that fixes the underlying page cache vulnerability is the most critical step.

Final Recommendation: For production environments, use a combination of:
- A patched kernel.
- --security-opt=no-new-privileges
- --cap-drop=all (or specific capability dropping)
- --read-only where possible.
- Minimal/Distroless base images.
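The flag-based parts of this recommendation can be assembled into a single command line. The Python sketch below builds the argument list for podman run from the three options named above. The helper name hardened_run_args and the image name localhost/copyfail are assumptions made for illustration, and the sketch only prints the command rather than executing it.

```python
import shlex


def hardened_run_args(image, command, read_only=True):
    """Build a podman run argument list with the recommended flags."""
    args = [
        "podman",
        "run",
        "--security-opt=no-new-privileges",
        "--cap-drop=all",
    ]
    if read_only:
        # Optional, because some images need a writable root filesystem.
        args.append("--read-only")
    args.append(image)
    args.extend(command)
    return args


# Hypothetical image name; prints the command for review.
print(shlex.join(hardened_run_args("localhost/copyfail", ["id"])))
```

Keeping the flags in one helper makes it harder to start a container with only some of them applied. A patched kernel and a minimal base image still have to be ensured separately.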