Copy Fail (CVE-2026-31431): How a 4-Byte Kernel Write Escapes Every Cloud Container Built Since 2017


Key finding

A nine-year-old Linux kernel optimisation exposes a four-byte write primitive that is enough to escape a Kubernetes pod admitted under Pod Security Standards Restricted, with RuntimeDefault seccomp and every Linux capability dropped. That is the headline result of CVE-2026-31431, nicknamed “Copy Fail”, which was added to the CISA Known Exploited Vulnerabilities catalogue on May 1, 2026, with a federal patch deadline of May 15. If your fleet runs any mainstream Linux distribution shipped after late 2017 (which means effectively every AMI, every Azure marketplace image, every GCE base image, and every default Kubernetes node pool), the host kernel is the real attack surface here, not the workload.

This article walks through the kernel-level mechanics of Copy Fail, why containerisation does not protect you, what a realistic blast radius looks like in a multi-tenant Kubernetes cluster, and the specific actions cloud and platform teams should be taking this week.

Technical analysis

Copy Fail lives in the kernel’s userspace cryptography API — the AF_ALG socket family — specifically inside the algif_aead module, which exposes Authenticated Encryption with Associated Data (AEAD) algorithms to userspace processes through a Unix-style socket interface.
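To make the interface concrete, here is a minimal Python sketch of a defensive probe that reports whether the current process can open and bind an AF_ALG AEAD socket at all. It performs no cryptographic operation and does not touch the vulnerable path. The function name, the three status strings, and the choice of "gcm(aes)" as the algorithm are illustrative assumptions, not part of any published tooling.

```python
import errno
import socket


def probe_af_alg_aead(algorithm: str = "gcm(aes)") -> str:
    """Report whether this process can open an AF_ALG AEAD socket.

    Returns "reachable", "blocked", or "unavailable". The probe only
    creates and binds the socket; it performs no crypto operation.
    The status strings are hypothetical labels chosen for this sketch.
    """
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        # Non-Linux platform, or a Python build without AF_ALG support.
        return "unavailable"
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as sock:
            # Binding to the "aead" type asks the kernel for its AEAD
            # userspace interface for the named algorithm.
            sock.bind(("aead", algorithm))
        return "reachable"
    except OSError as exc:
        if exc.errno in (errno.EPERM, errno.EACCES):
            # Typical result when a seccomp filter or LSM denies the call.
            return "blocked"
        # Any other error: the family, type, or algorithm is not available.
        return "unavailable"


if __name__ == "__main__":
    print(probe_af_alg_aead())
```

Run inside a container, a result of "reachable" suggests the runtime's seccomp profile is not filtering AF_ALG socket creation. It says nothing about whether the host kernel is patched.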

In 2017, kernel maintainers landed a performance optimisation that lets AEAD operations run in place by setting the source and destination buffers to the same memory region. The intent was straightforward: avoid an extra copy when you can encrypt or decrypt directly into the same scatterlist. The bug is in how that optimisation interacts with splice(2). When a userspace process splices a readable file into an AF_ALG socket, the kernel does not copy the file contents into kernel buffers. Instead, it passes references to the file’s page cache pages — the same pages the page cache uses to serve read() calls for that file across the entire system.

Because the in-place optimisation reuses the source memory as the destination, an attacker who controls the AEAD parameters can drive a controlled four-byte write into the page cache of any file the attacker can read. That includes every world-readable file in the filesystem: /etc/passwd, every binary on PATH, and — most importantly — privileged setuid binaries like /usr/bin/su and /usr/bin/sudo.

Four bytes does not sound like much. It is enough. Public exploit code chains the primitive into a setuid-binary corruption sequence: replace four bytes near the prologue of a privileged binary with an instruction that pivots execution to attacker-controlled bytes already present elsewhere in the same page, then trigger the binary to gain root. The published proof of concept is 732 bytes of Python, deterministic, and does not rely on race conditions or memory layout speculation. It works the first time, every time, on a vulnerable kernel.

The class of bug here is worth naming clearly. Copy Fail is not a memory corruption flaw in the traditional sense. The kernel itself is doing exactly what it was told to do; the in-place semantics for the AEAD path were a deliberate optimisation. The vulnerability is a semantic one — the optimisation assumed source and destination buffers belonged to the calling process, when in fact splice() makes them belong to the page cache, which is shared.

Impact assessment

The mainline fix landed on April 1, 2026. The bug is now public, and CISA has confirmed exploitation in the wild. Five impact dimensions matter for cloud security teams.

1. Every modern Linux distribution is in scope. All kernels from 4.14 onwards are affected on systems where algif_aead is available — which is the default on Amazon Linux, RHEL, Rocky, AlmaLinux, Ubuntu LTS, Debian, SUSE, and the upstream kernels shipped in EKS, AKS, and GKE node images. CloudLinux and Ubuntu have shipped patches; several other distributions are still rolling theirs out.

2. Container isolation is not a control here. This is the part that should sober every platform team. A non-root pod, with all Linux capabilities dropped, with RuntimeDefault seccomp applied, admitted under Pod Security Standards Restricted, can still reach the vulnerable kernel path. The default seccomp profile in containerd and CRI-O does not block the socket(AF_ALG, SOCK_SEQPACKET, 0) call required to start the chain. Since the page cache is host-wide, a write inside the pod’s namespace lands in pages that other namespaces — including the host PID 1 — read from. Container escape is not a follow-on exploit; it is the same primitive.

3. The blast radius scales with multi-tenancy. A single compromised pod on a shared node compromises the node, and from the node, every other pod scheduled on it. In hosted Kubernetes services where customers share node pools — the default for most cost-sensitive deployments — Copy Fail is a cross-tenant primitive. CI/CD runners that schedule untrusted PR builds on shared infrastructure are an immediate concern, as are managed PaaS environments running customer code in side-by-side pods.

4. Federal deadline is May 15. CISA’s KEV listing on May 1 set a 14-day federal patch deadline. For any Federal Civilian Executive Branch agency or federal contractor with downstream obligations, this is now a compliance event, not just a security one.

5. Hosting and managed-service providers are upstream. OVHcloud and other managed Kubernetes providers have already shipped customer-facing mitigations. AWS, Azure, and GCP have published guidance on patching customer-managed nodes. Patch latency on customer-managed worker nodes is going to be the long tail of this incident.
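As a first-pass triage for item 1, the version floor of 4.14 can be checked mechanically. The sketch below is a rough filter only: a release string cannot show whether a distribution has backported the fix, so a True result means "needs a closer look", not "vulnerable". The function names are illustrative assumptions.

```python
import platform
import re

# From the text above: kernels from 4.14 onwards are affected.
FIRST_AFFECTED = (4, 14)


def parse_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-105-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def potentially_affected(release: str) -> bool:
    """True if the release is at or above the first affected version.

    This looks at the version number only. It cannot detect a vendor
    backport of the fix, so treat True as 'check the vendor advisory'.
    """
    return parse_release(release) >= FIRST_AFFECTED


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        print(release, potentially_affected(release))
```

In a fleet inventory, the same function could be applied to the kernel release strings collected from each node, with the patched/unpatched decision then made against the relevant vendor advisory.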

CloudShieldSecure perspective

The Copy Fail incident is a near-perfect case study in why workload-layer detection and host-kernel posture management have to live in the same control plane. From the CloudShieldSecure detection side, we look for three signals that Copy Fail exploitation produces: (1) processes inside a pod opening AF_ALG sockets when the pod’s image has no legitimate reason to use userspace crypto, (2) writes to the page cache of setuid binaries that are not preceded by a package manager event, and (3) post-compromise privilege transitions on container hosts that don’t correspond to an authorised kubectl exec or SSH session.
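Signal (1) can be approximated even without a dedicated runtime sensor. The following is a minimal sketch, not a description of CloudShieldSecure's detection logic: it assumes raw (uninterpreted) Linux audit SYSCALL records on x86_64, where socket(2) is syscall 41 and arguments are logged in hexadecimal, and it flags records whose first argument is AF_ALG (address family 38). The helper names and the sample record are hypothetical.

```python
import re

AF_ALG = 38                  # Linux address family number for AF_ALG
SYSCALL_SOCKET_X86_64 = 41   # socket(2) on x86_64; other architectures differ
ARCH_X86_64 = "c000003e"     # audit arch value for x86_64

_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_fields(line: str) -> dict[str, str]:
    """Split a raw audit record into key/value pairs, stripping quotes."""
    return {key: value.strip('"') for key, value in _FIELD.findall(line)}


def is_af_alg_socket_event(line: str) -> bool:
    """True if the record is an x86_64 socket() call with family AF_ALG."""
    fields = parse_audit_fields(line)
    if fields.get("type") != "SYSCALL":
        return False
    if fields.get("arch") != ARCH_X86_64:
        return False  # this sketch handles x86_64 only
    if fields.get("syscall") != str(SYSCALL_SOCKET_X86_64):
        return False
    try:
        family = int(fields.get("a0", ""), 16)  # audit logs arguments in hex
    except ValueError:
        return False
    return family == AF_ALG


if __name__ == "__main__":
    # Made-up sample record for illustration only.
    sample = (
        'type=SYSCALL msg=audit(1777600000.123:456): arch=c000003e syscall=41 '
        'success=yes exit=3 a0=26 a1=80005 a2=0 a3=0 pid=4242 '
        'comm="python3" exe="/usr/bin/python3"'
    )
    print(is_af_alg_socket_event(sample))
```

A match is only a lead. It still needs to be correlated with the pod's image and expected behaviour, since some legitimate software does use the kernel crypto API.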

On the posture side, CloudShieldSecure’s continuous workload scoring flags nodes where the running kernel has not received a Copy Fail patch yet, and clusters where the default seccomp profile does not block AF_ALG socket creation. The fastest mitigation for most teams is not waiting on a vendor kernel patch at all — it is blacklisting the algif_aead module on every node, which our platform recommends and verifies as a zero-cost configuration change. We also surface every running pod whose image’s actual capability surface is wider than its declared capability surface, since this gap is what makes “PSS Restricted” stop being a meaningful control in incidents like this one.
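What "verifying" the module mitigation might look like on a single node can be sketched in a few lines of Python. This is an illustrative check under stated assumptions, not the platform's implementation: it assumes algif_aead is built as a loadable module (a kernel with the interface compiled in would not appear in /proc/modules and would not be affected by a modprobe override), and it only recognises the "install algif_aead /bin/false" form. All function names are hypothetical.

```python
from pathlib import Path

MODULE = "algif_aead"


def has_install_override(conf_text: str, module: str = MODULE) -> bool:
    """True if modprobe config text contains 'install <module> /bin/false'."""
    for raw in conf_text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # drop comments, then tokenise
        if len(tokens) >= 3 and tokens[:2] == ["install", module]:
            if tokens[2] == "/bin/false":
                return True
    return False


def is_loaded(proc_modules_text: str, module: str = MODULE) -> bool:
    """True if the module name is the first field of any /proc/modules line."""
    return any(
        line.split()[:1] == [module] for line in proc_modules_text.splitlines()
    )


def node_is_mitigated(
    conf_dir: str = "/etc/modprobe.d", proc_modules: str = "/proc/modules"
) -> bool:
    """True if an install override exists and the module is not loaded."""
    conf_texts = [p.read_text() for p in sorted(Path(conf_dir).glob("*.conf"))]
    overridden = any(has_install_override(text) for text in conf_texts)
    loaded = is_loaded(Path(proc_modules).read_text())
    return overridden and not loaded
```

Both conditions matter: an override without unloading leaves an already loaded module reachable, and unloading without an override lets the module be loaded again on demand.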

The deeper lesson — and it is one we keep repeating — is that container security policy is only as strong as the kernel underneath it. PSS Restricted, Linux capabilities, seccomp RuntimeDefault and AppArmor profiles are all built on the assumption that the kernel boundary they rely on is intact. When the kernel ships a four-byte primitive across that boundary, every layer above it stops mattering.

The right sequence for most teams in the next 72 hours:

  1. Inventory kernel versions across every Kubernetes node pool, every cloud VM, and every CI runner. Any kernel that does not include the April 1 mainline fix, or a vendor backport of it, is in scope.
  2. Apply the vendor kernel patch on host nodes. Ubuntu, AlmaLinux, RHEL, and CloudLinux have all shipped fixes; refresh node pool AMIs and cycle nodes.
  3. For nodes that cannot be patched immediately, blacklist the module. Add the line "install algif_aead /bin/false" to /etc/modprobe.d/disable-algif-aead.conf, then run rmmod algif_aead to unload the module if it is already loaded. This closes the attack path without a reboot, on the assumption that no legitimate workload on the host uses kernel-userspace AEAD.
  4. Tighten the default seccomp profile to deny socket(AF_ALG, ...) at the container runtime level. This is a defence-in-depth measure that prevents the exploit even if the host kernel is unpatched.
  5. Audit image manifests for unnecessary cryptographic dependencies. Most application containers do not need AF_ALG — minimal base images and reproducible builds make this auditable.
  6. Watch for AF_ALG socket creation in your runtime detection layer. This was a low-prevalence telemetry signal before May 1; it is now the cleanest single indicator of Copy Fail exploitation in the wild.
  7. For multi-tenant clusters, isolate untrusted workloads onto patched node pools first. CI runners executing third-party PR code, managed PaaS workloads, and any customer-tenancy boundary should be the initial focus.
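Step 4 above can be sketched as a seccomp rule in the JSON profile format used by OCI container runtimes. The Python below generates a deliberately minimal profile that fails socket(2) when its first argument equals AF_ALG (38) and allows everything else. That allow-by-default action is an assumption made to keep the example short and is weaker than RuntimeDefault. In practice the rule would need to be merged into the runtime's existing default profile. The function names are hypothetical.

```python
import json

AF_ALG = 38  # Linux address family number for AF_ALG


def af_alg_deny_rule() -> dict:
    """Seccomp rule: return an error from socket(2) when arg 0 is AF_ALG."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def minimal_deny_profile() -> dict:
    """Illustrative profile: allow all syscalls except socket(AF_ALG, ...).

    Assumption for this sketch only. A production profile should keep the
    runtime's default restrictions and add the deny rule to them.
    """
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [af_alg_deny_rule()],
    }


if __name__ == "__main__":
    print(json.dumps(minimal_deny_profile(), indent=2))
```

After any profile change, it is worth confirming from inside a test pod that AF_ALG socket creation is actually refused before relying on the rule.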

The shape of this incident — a deeply technical kernel bug that quietly invalidates higher-level container isolation — is one we should expect to see more of, not less. The right structural posture is to treat the Linux kernel as part of the workload’s threat model, not an opaque layer underneath it.


Published by CloudShieldSecure — the unified cloud workload protection and security posture platform from CloudKonsult Limited. Continuous kernel posture scoring, runtime detection of AF_ALG socket abuse, and seccomp policy verification are part of the platform’s default coverage. Learn more at cloudshieldsecure.cloudkonsult.cloud
