New Container Exploit: Rooting Non-Root Containers with CVE-2023-2640 and CVE-2023-32629, aka GameOver(lay)
- Two new privilege escalation CVEs, CVE-2023-2640 and CVE-2023-32629, have been discovered in the Ubuntu kernel OverlayFS module. The CVEs affect not only any Ubuntu hosts running with vulnerable kernel versions but also any containers running on those hosts.
- CrowdStrike has discovered that, under certain circumstances, CVE-2023-2640 and CVE-2023-32629 can be used to gain root in non-root containers. Once “container root” is achieved, attackers can use traditional container escape techniques depending on the attack surface available.
- It is paramount to detect the exploitation of these vulnerabilities on containers in addition to the host. The CrowdStrike Falcon® platform detects and prevents exploitation of the vulnerabilities on hosts as well as containers (Docker and Kubernetes).
- The Falcon platform helps protect organizations of all sizes from sophisticated breaches from nation-state, eCrime and hacktivist groups.
Two new local privilege escalation vulnerabilities were recently discovered in Ubuntu: CVE-2023-2640 (CVSS 7.8) and CVE-2023-32629 (CVSS 7.8). The vulnerabilities, dubbed GameOver(lay), affect the OverlayFS module in multiple Ubuntu kernels. Ubuntu’s official security bulletins for the two CVEs list the impacted versions. It’s important to note that CrowdStrike Falcon® Cloud Security protects against both vulnerabilities.
The CrowdStrike cloud threat research team analyzed these vulnerabilities and discovered a way to use them to exploit containers. Under certain conditions, a non-root container user can escalate privileges within a container to get to container root. From there, the attacker can attempt to escape the container with a traditional exploit and compromise the host.
On July 28, 2023, one day after the public disclosure, a tweet disclosed a one-line exploit that uses CVE-2023-2640 to escalate privileges on vulnerable Ubuntu kernels. The tweet highlights the ease of exploitation of this vulnerability and justifies its CVSS score.
This blog explains the details of how the CrowdStrike cloud threat research team discovered this new container exploitation method. Before we get into the details, let’s first discuss the underlying concept of OverlayFS.
What Is OverlayFS?
As the name suggests, OverlayFS is a union mount filesystem where one directory tree (usually read-write) is typically overlaid on top of another directory tree (usually read-only). In OverlayFS, all of the modifications go to the upper writable layer, and the lower layer is read-only.
Today, OverlayFS is one of the fundamental building blocks of containers and Kubernetes, where the image (base) layer is the read-only lower layer and the container layer is the writable upper layer.
Because the kernel is involved in the interaction of files between the lower layer and upper layer, it is an intriguing target for exploitation. Multiple vulnerabilities have been found in OverlayFS in the past. Figure 1 shows how, with an OverlayFS created by a mount, a file like a Python binary can be copied from the lower layer to the upper layer using a “merged” directory with a simple touch command. Here, the kernel needs to enforce namespace restrictions by limiting the capabilities of the file, including extended attributes of the Python binary, as it moves to the upper layer.
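The lookup and copy-up behavior described above can be sketched with a small in-memory model. This is only an illustration of union-mount semantics, not the kernel implementation; the class name `ToyOverlay` and its methods are hypothetical, and file metadata such as extended attributes is deliberately left out.

```python
class ToyOverlay:
    """Toy model of OverlayFS semantics: a read-only lower layer,
    a writable upper layer, and whiteouts that hide deleted lower files."""

    def __init__(self, lower):
        self.lower = dict(lower)   # path -> bytes, never modified
        self.upper = {}            # path -> bytes, receives all changes
        self.whiteouts = set()     # lower paths hidden by a delete

    def exists(self, path):
        if path in self.upper:
            return True
        return path in self.lower and path not in self.whiteouts

    def read(self, path):
        # The merged view prefers the upper layer over the lower layer.
        if path in self.upper:
            return self.upper[path]
        if path in self.lower and path not in self.whiteouts:
            return self.lower[path]
        raise FileNotFoundError(path)

    def copy_up(self, path):
        # First modification of a lower file copies it to the upper layer.
        if path not in self.upper:
            self.upper[path] = self.read(path)

    def touch(self, path):
        # A metadata update is enough to trigger copy-up.
        if self.exists(path):
            self.copy_up(path)
        else:
            self.write(path, b"")

    def write(self, path, data):
        self.whiteouts.discard(path)
        self.upper[path] = data

    def delete(self, path):
        if not self.exists(path):
            raise FileNotFoundError(path)
        self.upper.pop(path, None)
        if path in self.lower:
            self.whiteouts.add(path)  # hide the lower copy, do not remove it
```

For example, calling `touch("/usr/bin/python3")` on a file that exists only in the lower layer leaves the lower copy untouched and places a copy in `upper`, which is the step at which the real kernel also has to decide what to do with the file's extended attributes.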
How CVE-2023-2640 and CVE-2023-32629 Affect OverlayFS
At the heart of both CVE-2023-2640 and CVE-2023-32629 is an operation where files from the lower directory are copied to the upper directory with extended file attributes intact. This means if a file in the lower directory has capabilities like CAP_SYS_ADMIN or CAP_SETUID, these capabilities are carried over to the upper layer, where a non-root user can simply execute the upper-layer file to gain root privileges.
Both vulnerabilities originate in a kernel function named ovl_do_setxattr. This function calls a vulnerable wrapper, __vfs_setxattr_noperm, which does not restrict the file security capabilities to a namespace. As Figure 2 shows, there are two vulnerable code flows: when an OverlayFS is mounted in a namespace, the functions ovl_copy_xattr and ovl_copy_up_meta_inode_data each call the vulnerable function ovl_do_setxattr, which eventually triggers the vulnerability.
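To make "restricting file capabilities to a namespace" more concrete, here is a minimal sketch that decodes the raw value of a file's `security.capability` extended attribute. It assumes the on-disk layout from the Linux `linux/capability.h` header (revisions 1 to 3, little-endian); to our understanding, revision 3 additionally records the root UID of the user namespace the capability is bound to, whereas a revision 2 value is not tied to a namespace. The function name and the returned dictionary keys are hypothetical, and only the two capabilities named in this post are mapped to names.

```python
import struct

# Capability numbers from linux/capability.h (only the two discussed here).
CAP_NAMES = {7: "CAP_SETUID", 21: "CAP_SYS_ADMIN"}

# Total xattr size in bytes for each revision of the on-disk format.
XATTR_SIZES = {1: 12, 2: 20, 3: 24}


def parse_file_caps(raw: bytes) -> dict:
    """Decode a raw security.capability xattr value (illustrative sketch)."""
    if len(raw) < 4:
        raise ValueError("value too short for a capability xattr")
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    revision = (magic_etc & 0xFF000000) >> 24
    if XATTR_SIZES.get(revision) != len(raw):
        raise ValueError("unknown revision or unexpected length")

    # Revision 1 stores one 32-bit word per set, later revisions store two.
    words = 1 if revision == 1 else 2
    permitted = 0
    for i in range(words):
        perm, _inheritable = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= perm << (32 * i)

    # Only revision 3 carries the namespace root UID.
    rootid = struct.unpack_from("<I", raw, 20)[0] if revision == 3 else None
    return {
        "revision": revision,
        "effective": bool(magic_etc & 0x1),
        "permitted": sorted(
            CAP_NAMES.get(bit, f"cap_{bit}")
            for bit in range(64) if permitted & (1 << bit)
        ),
        "namespaced": revision == 3,
        "rootid": rootid,
    }
```

On a Linux host, the raw bytes could be obtained with `os.getxattr(path, "security.capability")`. A file in the upper directory whose value decodes with `namespaced` set to false and a powerful capability in `permitted` matches the situation this post describes.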
Figure 3 shows a tweaked proof of concept from the aforementioned tweet that both tests the vulnerable code flows and provides a bash shell with root privileges.
Rooting the Non-Root Privileged Containers
As the CrowdStrike cloud threat research team looked into these vulnerabilities, they were faced with a couple of questions related to containers:
- Can these vulnerabilities be used to escalate privileges inside non-root containers?
- Can these vulnerabilities be used to break out of a container to compromise a host?
If a container running as a non-root user is compromised, the attacker must first achieve root privileges inside the container to even attempt a container breakout, and that first step is normally extremely difficult.
To answer the first question: a container already uses an OverlayFS to manage its runtime operation (the container layer). By design, an OverlayFS cannot be created on top of another OverlayFS (nested OverlayFS), which restricts privilege escalation. Figure 4 shows a failed attempt. Because both vulnerabilities involve namespace creation using unshare with the -m flag, the container needs to be privileged with no seccomp profile; running without a seccomp profile is the default configuration in Kubernetes.
A second approach is to create an ephemeral tmpfs and mount an OverlayFS on top of it, essentially creating an in-memory filesystem structure. It quickly becomes clear, however, that although a file with acquired capabilities (CAP_SETUID) can be moved to the upper directory, executing it outside the created namespace is a challenge: a non-root user has no way to reach the in-memory file that lives inside the newly created namespace. As Figure 5 shows, if the file is copied out of memory onto disk to be executed by user “low,” the capabilities are sanitized and the privilege escalation fails.
Volume Mount to the Rescue
A containerized application may use volume mounts to add a separate disk or hostPath to the container. It is a common practice to provide persistent storage to containers in this way for storing logs or other essential information. Since the volume mounts are treated as separate disks, they can be used to create OverlayFS. The newly created OverlayFS resides outside the container layer, allowing attackers to avoid the problems of the first approach.
Let’s create a non-root privileged container and mount a writable hostPath volume (/tmp). Then, we will use the exploit to try to escalate privileges. The following Kubernetes YAML can be used to schedule a pod on the vulnerable Kubernetes host.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
  namespace: default
spec:
  containers:
  - name: bad
    image: manojahuje/ubuntu:jammy # required packages preinstalled
    command: ["sleep", "3600"]
    securityContext:
      privileged: true
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 2000
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /host
      name: host
  volumes:
  - name: host
    hostPath:
      path: /tmp
      type: Directory
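From a defender's point of view, the risky combination in a manifest like this one is a privileged container that runs as a non-root user and has a writable hostPath mount. The sketch below flags that combination in a pod object that has already been loaded into a Python dictionary, for example from JSON output of the Kubernetes API. The function name and the shape of the findings are hypothetical, and the check is intentionally narrow rather than a full policy engine.

```python
def find_privileged_nonroot(pod: dict) -> list:
    """Return one finding per container that is privileged but non-root."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext") or {}
    hostpath_volumes = {
        v["name"] for v in spec.get("volumes", []) if "hostPath" in v
    }

    findings = []
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for c in containers:
        sc = c.get("securityContext") or {}
        # Container-level settings override pod-level settings.
        run_as_user = sc.get("runAsUser", pod_sc.get("runAsUser"))
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        non_root = run_as_non_root is True or run_as_user not in (None, 0)
        privileged = sc.get("privileged") is True

        if privileged and non_root:
            writable = [
                m["name"]
                for m in c.get("volumeMounts", [])
                if m.get("name") in hostpath_volumes
                and not m.get("readOnly", False)
            ]
            findings.append(
                {"container": c["name"], "writable_hostpath_mounts": writable}
            )
    return findings
```

Applied to the pod above, this would report the container `bad` together with its writable `host` mount. A container with an empty `writable_hostpath_mounts` list is still worth reviewing, since other volume types can also sit outside the container layer.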
Figure 6.A below shows the flow for the exploit to achieve container root using both vulnerabilities. Figure 6.B shows privilege escalation being successful and the non-root user achieving container root privileges.
At this point, the attacker has root in a non-root container on a vulnerable Ubuntu node. Both vulnerabilities can be used to gain root access on non-root privileged containers. Attackers can now attempt to escape the container via traditional methods, such as using the cgroup exploit or mounting the host hard drive into the privileged container, depending on the attack surface available on the container.
CrowdStrike Protects Against Both Vulnerabilities on Hosts and Containers
CrowdStrike Falcon Cloud Security detects and mitigates any exploitation attempt using either vulnerability on hosts as well as containers (Kubernetes or Docker). As shown in Figure 7, Falcon Cloud Security detects the vulnerable container and shows the process exploit tree to pinpoint exploitation attempts by an attacker.
CrowdStrike Falcon Cloud Security finds vulnerability exposure in your environment by detecting any vulnerable Ubuntu assets in your organization, whether they are on-premises or in the cloud. Figure 8 shows vulnerabilities detected in multiple assets in the organization and potential remediation.
Mitigation
CrowdStrike recommends the following steps to mitigate the vulnerabilities:
- Upgrade Ubuntu nodes to a patched kernel version
- Actively monitor and detect non-root privileged containers on vulnerable nodes
- Use seccomp or AppArmor profiles to block the use of unshare
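As an illustration of the seccomp recommendation, the sketch below generates a profile in the JSON format used by Docker and by Kubernetes `Localhost` seccomp profiles, denying only the listed syscalls. The default-allow policy is an assumption chosen to keep the example short; a production profile would more likely start from the runtime's default profile. Two caveats apply: Kubernetes does not apply seccomp profiles to containers running with `privileged: true`, and namespaces can also be created through other syscalls such as `clone`, so blocking `unshare` alone is not complete coverage. The function name is hypothetical.

```python
import json


def build_deny_profile(denied_syscalls) -> dict:
    """Build a default-allow seccomp profile that rejects selected syscalls."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                # Listed syscalls fail with an error instead of executing.
                "names": sorted(set(denied_syscalls)),
                "action": "SCMP_ACT_ERRNO",
            }
        ],
    }


# Serialize the profile so it can be written to a file for the runtime.
profile_json = json.dumps(build_deny_profile(["unshare"]), indent=2)
```

The resulting JSON could be saved on the node and referenced from a pod's `securityContext.seccompProfile`; test it against your workloads first, since legitimate software sometimes relies on `unshare`.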
At the heart of these vulnerabilities is the ability of unprivileged users to create a new user namespace. This can be disabled on Ubuntu nodes via the following command. CrowdStrike recommends testing the configuration first to avoid any impact on workloads.
sudo sysctl -w kernel.unprivileged_userns_clone=0
For persistent change, you can use the following command:
echo kernel.unprivileged_userns_clone=0 | \
sudo tee /etc/sysctl.d/99-disable-userns.conf
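To verify across a fleet that the setting took effect, the current value can be read back from procfs. The following is a minimal sketch: the function name and return strings are hypothetical, and the path is a parameter so the check can be pointed at a test file. On kernels that do not carry this Ubuntu and Debian specific sysctl, the file does not exist and the function reports "unknown".

```python
from pathlib import Path

DEFAULT_SYSCTL_PATH = "/proc/sys/kernel/unprivileged_userns_clone"


def unprivileged_userns_state(proc_path: str = DEFAULT_SYSCTL_PATH) -> str:
    """Report whether unprivileged user namespace creation is allowed."""
    try:
        value = Path(proc_path).read_text().strip()
    except OSError:
        # Missing file or no permission: the state cannot be determined.
        return "unknown"
    if value == "0":
        return "disabled"
    if value == "1":
        return "enabled"
    return "unknown"
```

A node that reports "enabled" after the persistent configuration was applied most likely has not reloaded its sysctl settings or has a later file in /etc/sysctl.d overriding the value.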