Article Details

Scrape Timestamp (UTC): 2024-09-26 21:44:55.009

Source: https://www.theregister.com/2024/09/26/critical_nvidia_bug_container_escape/

Original Article Text

Patch now: Critical Nvidia bug allows container escape, complete host takeover.

33% of cloud environments using the toolkit impacted, we're told.

A critical bug in Nvidia's widely used Container Toolkit could allow a rogue user or software to escape their containers and ultimately take complete control of the underlying host.

The flaw, tracked as CVE-2024-0132, earned a 9.0 out of 10 CVSS severity rating, and affects all versions of Container Toolkit up to and including v1.16.1, and Nvidia GPU Operator up to and including 24.6.1.

Nvidia issued a fix on Wednesday with the latest version of Container Toolkit (v1.16.2) and Nvidia GPU Operator (v24.6.2). The vulnerability does not impact use cases where Container Device Interface (CDI) is used.

This particular library is used across clouds and AI workloads. According to infosec house Wiz, 33 percent of cloud environments have a buggy version of Nvidia Container Toolkit installed, rendering them vulnerable.

Wiz security researchers found and disclosed the bug on September 1, and the GPU giant has confirmed it is as concerning as the cloud security shop makes it out to be.

"A successful exploit of this vulnerability may lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering," Nvidia warned in its security advisory.

Again, this is exploitable by someone or something that's been allowed to, or has managed to, run a container or run within one on a vulnerable host.

CVE-2024-0132 is a Time of Check Time of Use (TOCTOU) vulnerability, a type of race condition. This can allow the attacker to gain access to resources that they should not have access to.

Specific to Nvidia Container Toolkit: "Any environment that allows the use of third party container images or AI models – either internally or as-a-service – is at higher risk given that this vulnerability can be exploited via a malicious image," Wiz kids Shir Tamari, Ronen Shustin, and Andres Riancho said in a write-up about the bug.

To exploit CVE-2024-0132, an attacker would need to craft a specially designed image and then get the image to run on the target platform, either indirectly, by convincing or tricking the user into running the malicious image, or directly, if the attacker has access to shared GPU resources.

In a single-tenant compute environment, this could happen if a user downloads a malicious container image, say via a social engineering attack where the user believes the container image is coming from a trusted source. In this scenario, the attacker could then take over the user's workstation.

In a shared environment, such as a Kubernetes-powered one, however, a miscreant with permission to deploy a container could escape it and then access data or secrets of other applications on the same node or cluster, the researchers noted.

This second scenario "is especially relevant for AI service providers that allow customers to run their own GPU-enabled container images," they warned.

"An attacker could deploy a harmful container, break out of it, and use the host machine's secrets to target the cloud service's control systems," the researchers continued. "This could give the attacker access to sensitive information, like the source code, data, and secrets of other customers using the same service."

Wiz isn't providing too many technical details about how to exploit the vuln because the security shop wants to ensure that vulnerable organizations have time to deploy the fix, and not have their host system taken over with root privileges. But the researchers promised more to come soon, including exploit details, so it's a good idea to get ahead of the would-be attackers on this one.
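
For readers unfamiliar with the bug class, here is a minimal Python sketch of a generic TOCTOU race. It is not the Nvidia Container Toolkit flaw, whose details Wiz has not published; it only illustrates the general pattern of checking a resource and then using it in a separate step. All function names are hypothetical, the "attacker" is simulated by a callback that runs between the check and the use, and os.O_NOFOLLOW assumes a POSIX system such as Linux.

```python
import os
import tempfile


def read_if_safe(path, between_check_and_use=None):
    """Vulnerable pattern: check the path, then use it in a separate step."""
    # Time of check: refuse symlinks.
    if os.path.islink(path):
        raise PermissionError("symlinks are not allowed")
    # Race window. The optional hook stands in for a concurrent attacker.
    if between_check_and_use is not None:
        between_check_and_use()
    # Time of use: open() follows whatever the path points to now.
    with open(path) as f:
        return f.read()


def read_if_safe_fixed(path, between_check_and_use=None):
    """Safer pattern: let the open call itself enforce the check."""
    if between_check_and_use is not None:
        between_check_and_use()
    # O_NOFOLLOW makes the open fail if the final path component is a symlink,
    # so there is no gap between checking and using (POSIX only).
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd) as f:
        return f.read()


def demo():
    """Return (data leaked by the vulnerable version, whether the fix blocked it)."""
    with tempfile.TemporaryDirectory() as d:
        secret = os.path.join(d, "secret.txt")
        target = os.path.join(d, "upload.txt")
        with open(secret, "w") as f:
            f.write("host secret")
        with open(target, "w") as f:
            f.write("harmless")

        def swap():
            # Simulated attacker: replace the checked file with a symlink.
            os.remove(target)
            os.symlink(secret, target)

        leaked = read_if_safe(target, swap)

        # Reset the target to a regular file, then try the fixed version.
        os.remove(target)
        with open(target, "w") as f:
            f.write("harmless")
        try:
            read_if_safe_fixed(target, swap)
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked
```

Calling demo() shows the vulnerable reader returning the contents of the secret file even though the path passed the symlink check, while the fixed reader raises an error instead. The general lesson is that a check only protects the use if nothing can change the resource in between.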

Daily Brief Summary

MALWARE // Critical Nvidia Vulnerability Risks Complete Host System Control

Nvidia Container Toolkit bug tagged CVE-2024-0132 allows attackers to escape containers and take over host machines.

This flaw, rated 9.0 out of 10 on the CVSS severity scale, affects all versions of the Container Toolkit up to and including v1.16.1 and of the GPU Operator up to and including v24.6.1.

According to Wiz, about 33% of cloud environments have a vulnerable version of the Nvidia Container Toolkit installed.

The security issue, a Time of Check Time of Use (TOCTOU) vulnerability, can lead to unauthorized resource access and potential data breaches.

Exploits could involve crafted malicious images used in environments that handle third-party container images or AI models.

Nvidia has released updates (Container Toolkit v1.16.2 and GPU Operator v24.6.2) that fix the flaw; use cases that rely on the Container Device Interface (CDI) are not affected.

Wiz security researchers found and disclosed the bug on September 1 and are withholding exploit details for now to give vulnerable organizations time to patch.

The vulnerability poses a significant threat to both single-tenant and shared computing environments, especially those involving GPU resources.
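
The version ranges above can be turned into a quick triage check. The following is a minimal Python sketch, assuming plain dotted version strings such as "v1.16.1" or "24.6.1"; the function names and component keys are hypothetical, and the check does not account for the CDI exception or for pre-release suffixes.

```python
def parse_version(text):
    """Turn 'v1.16.1' or '24.6.1' into a tuple of ints such as (1, 16, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


# First fixed releases, as reported in the article.
# The dictionary keys are illustrative labels, not official package names.
FIRST_FIXED = {
    "container-toolkit": parse_version("v1.16.2"),
    "gpu-operator": parse_version("v24.6.2"),
}


def is_affected(component, installed_version):
    """True if the installed version is older than the first fixed release."""
    return parse_version(installed_version) < FIRST_FIXED[component]
```

For example, is_affected("container-toolkit", "v1.16.1") is true and is_affected("gpu-operator", "24.6.2") is false. Comparing tuples of integers rather than raw strings matters here, because a string comparison would wrongly rank "1.9.0" above "1.16.2".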