Article Details
Scrape Timestamp (UTC): 2024-01-18 12:36:54.903
Source: https://thehackernews.com/2024/01/tensorflow-cicd-flaw-exposed-supply.html
Original Article Text
TensorFlow CI/CD Flaw Exposed Supply Chain to Poisoning Attacks

Continuous integration and continuous delivery (CI/CD) misconfigurations discovered in the open-source TensorFlow machine learning framework could have been exploited to orchestrate supply chain attacks.

The misconfigurations could be abused by an attacker to "conduct a supply chain compromise of TensorFlow releases on GitHub and PyPi by compromising TensorFlow's build agents via a malicious pull request," Praetorian researchers Adnan Khan and John Stawinski said in a report published this week.

Successful exploitation of these issues could permit an external attacker to upload malicious releases to the GitHub repository, gain remote code execution on the self-hosted GitHub runner, and even retrieve a GitHub Personal Access Token (PAT) for the tensorflow-jenkins user.

TensorFlow uses GitHub Actions to automate the software build, test, and deployment pipeline. Runners, which refer to machines that execute jobs in a GitHub Actions workflow, can be either self-hosted or hosted by GitHub.

"We recommend that you only use self-hosted runners with private repositories," GitHub notes in its documentation. "This is because forks of your public repository can potentially run dangerous code on your self-hosted runner machine by creating a pull request that executes the code in a workflow."

Put differently, this allows any contributor to execute arbitrary code on the self-hosted runner by submitting a malicious pull request. This, however, does not pose any security concern with GitHub-hosted runners, as each runner is ephemeral and is a clean, isolated virtual machine that's destroyed at the end of the job execution.

Praetorian said it was able to identify TensorFlow workflows that were executed on self-hosted runners, subsequently finding fork pull requests from previous contributors that automatically triggered the appropriate CI/CD workflows without requiring approval.
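The risky combination described above (a workflow that a pull request can trigger, with jobs that run on a self-hosted runner) can be checked mechanically. The following is a minimal sketch, not Praetorian's tooling: it assumes the workflow file has already been parsed into a plain Python dict with string keys, and the function names `triggers_of`, `uses_self_hosted`, and `risky_jobs` are hypothetical. It only recognizes the `self-hosted` label, so runners targeted purely through custom labels or runner groups would be missed.

```python
from typing import Any

# Workflow triggers that a pull request (including one from a fork) can fire.
PR_TRIGGERS = {"pull_request", "pull_request_target"}


def triggers_of(workflow: dict[str, Any]) -> set[str]:
    """Normalize the `on` field (string, list, or mapping) to a set of names."""
    on = workflow.get("on", {})
    if isinstance(on, str):
        return {on}
    return set(on)  # a list yields its items, a mapping yields its keys


def uses_self_hosted(job: dict[str, Any]) -> bool:
    """Return True if the job's `runs-on` labels include `self-hosted`."""
    runs_on = job.get("runs-on", [])
    if isinstance(runs_on, dict):  # mapping form: {"group": ..., "labels": ...}
        runs_on = runs_on.get("labels", [])
    labels = [runs_on] if isinstance(runs_on, str) else list(runs_on)
    return "self-hosted" in labels


def risky_jobs(workflow: dict[str, Any]) -> list[str]:
    """List jobs that a pull request can start on a self-hosted runner."""
    if not (triggers_of(workflow) & PR_TRIGGERS):
        return []
    jobs = workflow.get("jobs", {})
    return [name for name, job in jobs.items() if uses_self_hosted(job)]


# Hypothetical workflow, written as the dict a YAML parser might produce.
example = {
    "on": ["push", "pull_request"],
    "jobs": {
        "build": {"runs-on": ["self-hosted", "linux"]},
        "lint": {"runs-on": "ubuntu-latest"},
    },
}
print(risky_jobs(example))  # ['build']
```

A flagged job is not automatically exploitable: whether a fork pull request actually reaches the runner also depends on the repository's approval settings for outside contributors.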
An adversary looking to trojanize a target repository could, therefore, fix a typo or make a small but legitimate code change, create a pull request for it, and then wait until the pull request is merged in order to become a contributor. Once a contributor, they could open a rogue pull request and execute code on the runner without raising any red flags.

Further examination of the workflow logs revealed that the self-hosted runner was not only non-ephemeral (thus opening the door for persistence), but also that the GITHUB_TOKEN permissions associated with the workflow came with extensive write permissions.

"Because the GITHUB_TOKEN had the Contents:write permission, it could upload releases to https://github[.]com/tensorflow/tensorflow/releases/," the researchers said. "An attacker that compromised one of these GITHUB_TOKENs could add their own files to the Release Assets."

On top of that, the Contents:write permission could be weaponized to push code directly to the TensorFlow repository by covertly injecting the malicious code into a feature branch and getting it merged into the main branch.

That's not all. A threat actor could steal the AWS_PYPI_ACCOUNT_TOKEN used in the release workflow to authenticate to the Python Package Index (PyPI) registry and upload a malicious Python .whl file, effectively poisoning the package.

"An attacker could also use the GITHUB_TOKEN's permissions to compromise the JENKINS_TOKEN repository secret, even though this secret was not used within workflows that ran on the self-hosted runners," the researchers said.

Following responsible disclosure on August 1, 2023, the shortcomings were addressed by the project maintainers as of December 20, 2023, by requiring approval for workflows submitted from all fork pull requests and by changing the GITHUB_TOKEN permissions to read-only for workflows that ran on self-hosted runners.
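To illustrate the read-only remediation, the sketch below reports which GITHUB_TOKEN scopes each job declares with write access. It is an assumption-laden illustration rather than the maintainers' actual fix: the workflow is assumed to be a pre-parsed dict, the names `write_scopes` and `audit` are hypothetical, and an undeclared `permissions` block is treated as worst case because the effective default then depends on repository or organization settings.

```python
from typing import Any


def write_scopes(permissions: Any) -> list[str]:
    """Return the scopes granted write access; ["*"] means every scope."""
    # Undeclared permissions fall back to a repository default that may be
    # permissive, so this sketch conservatively assumes the worst case.
    if permissions is None or permissions == "write-all":
        return ["*"]
    if isinstance(permissions, str):  # e.g. "read-all"
        return []
    return sorted(s for s, level in permissions.items() if level == "write")


def audit(workflow: dict[str, Any]) -> dict[str, list[str]]:
    """Map each job name to the token scopes it holds with write access."""
    report = {}
    for name, job in workflow.get("jobs", {}).items():
        # A job-level `permissions` block replaces the workflow-level one.
        declared = job.get("permissions", workflow.get("permissions"))
        report[name] = write_scopes(declared)
    return report


# Hypothetical workflow: one job inherits write access, one is read-only.
example = {
    "permissions": {"contents": "write"},
    "jobs": {
        "release": {"runs-on": "self-hosted"},
        "test": {"runs-on": "self-hosted", "permissions": {"contents": "read"}},
    },
}
print(audit(example))  # {'release': ['contents'], 'test': []}
```

In this example the `release` job would be able to upload release assets with its token, which is the capability the researchers describe, while the `test` job would not.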
"Similar CI/CD attacks are on the rise as more organizations automate their CI/CD processes," the researchers said. "AI/ML companies are particularly vulnerable as many of their workflows require significant compute power that isn't available in GitHub-hosted runners, thus the prevalence of self-hosted runners." The disclosure comes as both researchers revealed that several public GitHub repositories, including those associated with Chia Networks, Microsoft DeepSpeed, and PyTorch, are susceptible to malicious code injection via self-hosted GitHub Actions runners.
Daily Brief Summary
Critical CI/CD misconfigurations were found in the open-source machine learning framework TensorFlow, which could have allowed supply chain attacks.
Attackers could have compromised TensorFlow's GitHub and PyPI releases and gained remote code execution on the self-hosted runner via a malicious pull request.
An external attacker could also have retrieved a GitHub Personal Access Token (PAT) for the tensorflow-jenkins user and uploaded malicious code to the TensorFlow repository.
The flaw stemmed from using self-hosted GitHub runners with a public repository: fork pull requests from previous contributors triggered workflows on those runners without requiring approval.
Security researchers from Praetorian found that the self-hosted runner was non-ephemeral and that the GITHUB_TOKEN carried extensive write permissions, opening the door to persistence and privilege escalation.
Among the risks was the ability to push malicious code updates or poison the Python package registry with a tainted .whl file.
TensorFlow maintainers have fixed the vulnerabilities by introducing approval requirements for fork pull requests and setting read-only permissions for GITHUB_TOKEN in self-hosted runner workflows.
The incident highlights a growing trend of similar CI/CD-related cyber threats, with AI/ML companies at particular risk due to heavy reliance on self-hosted runners for their resource-intensive workflows.
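The approval requirement mentioned in the summary can be pictured as a small decision rule. The sketch below is a simplified model loosely based on the fork pull request approval options GitHub exposes in repository settings; it is not GitHub's implementation, and the names `ApprovalPolicy`, `PullRequestAuthor`, and `needs_approval` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ApprovalPolicy(Enum):
    """Hypothetical names for three levels of fork pull request approval."""
    NEW_GITHUB_USERS = auto()           # only brand-new accounts need approval
    FIRST_TIME_CONTRIBUTORS = auto()    # anyone without a merged contribution
    ALL_OUTSIDE_COLLABORATORS = auto()  # everyone without write access


@dataclass
class PullRequestAuthor:
    has_write_access: bool
    has_merged_contribution: bool
    is_new_to_github: bool


def needs_approval(policy: ApprovalPolicy, author: PullRequestAuthor) -> bool:
    """Decide whether a maintainer must approve the workflow run."""
    if author.has_write_access:
        return False
    if policy is ApprovalPolicy.ALL_OUTSIDE_COLLABORATORS:
        return True
    if policy is ApprovalPolicy.FIRST_TIME_CONTRIBUTORS:
        return not author.has_merged_contribution
    # NEW_GITHUB_USERS: approval only for new accounts with no merged work.
    return author.is_new_to_github and not author.has_merged_contribution


# An outsider whose typo fix was merged earlier, as in the attack scenario.
typo_fixer = PullRequestAuthor(
    has_write_access=False, has_merged_contribution=True, is_new_to_github=False
)
print(needs_approval(ApprovalPolicy.FIRST_TIME_CONTRIBUTORS, typo_fixer))    # False
print(needs_approval(ApprovalPolicy.ALL_OUTSIDE_COLLABORATORS, typo_fixer))  # True
```

Under this model, a single merged contribution is enough to bypass the first-time-contributor check, which is why requiring approval for all fork pull requests closes the gap.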