Original Article Text

Click to Toggle View

Critical flaw in NVIDIA Container Toolkit allows full host takeover. A critical vulnerability in NVIDIA Container Toolkit impacts all AI applications in a cloud or on-premise environment that rely on it to access GPU resources. The security issue is tracked as CVE-2024-0132 and allows an adversary to perform container escape attacks and gain full access to the host system, where they could execute commands or exfiltrate sensitive information. The particular library comes pre-installed in many AI-focused platforms and virtual machine images and is the standard tool for GPU access when NVIDIA hardware is involved. According to Wiz Research, more than 35% of cloud environments are at risk of attacks exploiting the vulnerability. Container escape flaw The security issue CVE-2024-0132 received a critical-severity score of 9.0. It is a container escape problem that affects NVIDIA Container Toolkit 1.16.1 and earlier, and GPU Operator 24.6.1 and older. The problem is a lack of secure isolation of the containerized GPU from the host, allowing containers to mount sensitive parts of the host filesystem or access runtime resources like Unix sockets for inter-process communication. While most filesystems are mounted with “read-only” permissions, certain Unix sockets such as ‘docker.sock’ and ‘containerd.sock’ remain writable, allowing direct interactions with the host, including command execution. An attacker can take advantage of this omission via a specially crafted container image and reach the host when executed. Wiz says that such an attack could be carried out either directly, via shared GPU resources, or indirectly, when the target runs an image downloaded from a bad source. Wiz researchers discovered the vulnerability and reported it to NVIDIA on September 1st. The GPU maker acknowledged the report a couple of days later, and released a fix on September 26th. Impacted users are recommended to upgrade to NVIDIA Container Toolkit version 1.16.2 and NVIDIA GPU Operator 24.6.2. Technical details for the exploiting the security issue remain private for now, to give impacted organizations time to mitigate the issue in their environments. However, the researchers are planning to release more technical information.

Daily Brief Summary

CYBERCRIME // Critical NVIDIA Toolkit Flaw Threatens Cloud AI Applications

A severe vulnerability in NVIDIA Container Toolkit, CVE-2024-0132, permits potential container escape attacks, allowing attackers full access to the host system.

Impacts all AI platforms using the toolkit for GPU resource management in cloud or on-premise environments, affecting upwards of 35% of these environments.

Attack vectors include exploitation via shared GPU resources or malicious container images sourced from untrusted origins.

Vulnerable versions include NVIDIA Container Toolkit up to v1.16.1 and GPU Operator up to v24.6.1; all users urged to upgrade to patched versions immediately.

Researchers from Wiz Research identified and reported the flaw to NVIDIA, leading to a prompt acknowledgment and subsequent fix by the manufacturer.

Technical details regarding the exploitation of this flaw are withheld to allow time for widespread mitigation among affected services.