Article Details

Scrape Timestamp (UTC): 2025-11-28 17:46:38.453

Source: https://www.bleepingcomputer.com/news/security/public-gitlab-repositories-exposed-more-than-17-000-secrets/

Original Article Text

Public GitLab repositories exposed more than 17,000 secrets

After scanning all 5.6 million public repositories on GitLab Cloud, a security engineer discovered more than 17,000 exposed secrets across over 2,800 unique domains.

Luke Marshall used the open-source TruffleHog tool to check the code in the repositories for sensitive credentials such as API keys, passwords, and tokens.

The researcher previously scanned Bitbucket, where he found 6,212 secrets spread over 2.6 million repositories. He also checked the Common Crawl dataset, which is used to train AI models, and found 12,000 valid secrets in it.

GitLab is a web-based Git platform that software developers, maintainers, and DevOps teams use for code hosting, CI/CD operations, development collaboration, and repository management.

Marshall used a public GitLab API endpoint to enumerate every public GitLab Cloud repository, with a custom Python script that paginated through all results and sorted them by project ID.

This process returned 5.6 million non-duplicate repositories, whose names were sent to an AWS Simple Queue Service (SQS) queue. Next, an AWS Lambda function pulled each repository name from SQS, ran TruffleHog against it, and logged the results.

“Each Lambda invocation executed a simple TruffleHog scan command with concurrency set to 1000,” Marshall explains. “This setup allowed me to complete the scan of 5,600,000 repositories in just over 24 hours.”

The total cost of scanning all public GitLab Cloud repositories with this method was $770.

The researcher found 17,430 verified live secrets, nearly three times as many as in Bitbucket, with a 35% higher secret density (secrets per repository).

Historical data shows that most leaked secrets date from after 2018. However, Marshall also found some much older secrets, dating from 2009, that are still valid today.
The largest group of leaked secrets, over 5,200 of them, consisted of Google Cloud Platform (GCP) credentials, followed by MongoDB keys, Telegram bot tokens, and OpenAI keys. The researcher also found a little over 400 GitLab keys leaked in the scanned repositories.

In the spirit of responsible disclosure, and because the discovered secrets were associated with 2,804 unique domains, Marshall relied on automation to notify affected parties. He used Claude Sonnet 3.7 with web search capability and a Python script to generate the emails. In the process, the researcher collected multiple bug bounties totaling $9,000.

The researcher reports that many organizations revoked their secrets in response to his notifications. However, an undisclosed number of secrets continue to be exposed on GitLab.
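
The enumerate-queue-scan pipeline described in the article can be sketched roughly as follows. This is an illustration under stated assumptions, not Marshall's actual script, which the article does not reproduce. It assumes GitLab's `/api/v4/projects` endpoint with keyset pagination ordered by project ID, the SQS-to-Lambda event shape in which each record carries the queued text in `body`, and the TruffleHog flags `--results=verified` and `--json`. The names `fetch_page`, `enumerate_projects`, `build_scan_command`, `scan_repository`, and `handler` are hypothetical.

```python
import json
import subprocess
import urllib.parse
import urllib.request
from typing import Callable, Iterator

# Assumption: GitLab's REST API lists projects here and supports keyset
# pagination when results are ordered by ID.
GITLAB_PROJECTS_URL = "https://gitlab.com/api/v4/projects"
PAGE_SIZE = 100


def fetch_page(id_after: int) -> list:
    """Fetch one page of public projects whose IDs are greater than id_after."""
    query = urllib.parse.urlencode({
        "visibility": "public",
        "order_by": "id",
        "sort": "asc",
        "pagination": "keyset",
        "per_page": PAGE_SIZE,
        "id_after": id_after,
    })
    with urllib.request.urlopen(f"{GITLAB_PROJECTS_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def enumerate_projects(fetch: Callable[[int], list] = fetch_page) -> Iterator[dict]:
    """Yield every project in ascending ID order until an empty page is returned."""
    last_id = 0
    while True:
        page = fetch(last_id)
        if not page:
            return
        yield from page
        # The highest ID seen so far becomes the cursor for the next page.
        last_id = page[-1]["id"]


def build_scan_command(repo_url: str) -> list:
    """Build a TruffleHog command line (flags are an assumption, see lead-in)."""
    return ["trufflehog", "git", repo_url, "--results=verified", "--json"]


def scan_repository(repo_url: str, run=subprocess.run) -> list:
    """Run TruffleHog on one repository and parse its JSON-lines output."""
    completed = run(build_scan_command(repo_url),
                    capture_output=True, text=True, check=False)
    return [json.loads(line)
            for line in completed.stdout.splitlines() if line.strip()]


def handler(event: dict, context=None, scan=scan_repository) -> dict:
    """Lambda-style entry point: scan each repository named in the SQS batch."""
    findings = []
    for record in event["Records"]:
        findings.extend(scan(record["body"]))
    return {"findings": len(findings)}
```

The page fetcher and the scanner are passed in as parameters so that the pagination and batching logic can be exercised without network access or a TruffleHog install. A producer would iterate over `enumerate_projects()` and send each project's clone URL to the queue.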

Daily Brief Summary

DATA BREACH // Over 17,000 Secrets Exposed in Public GitLab Repositories

• A security engineer discovered over 17,000 exposed secrets across 5.6 million public GitLab repositories, affecting more than 2,800 unique domains.

• The engineer used TruffleHog, an open-source tool, to identify sensitive credentials such as API keys, passwords, and tokens within the repositories.

• Google Cloud Platform credentials were the most common type of leaked secret, followed by MongoDB keys, Telegram bot tokens, and OpenAI keys.

• The process used GitLab's public API and AWS services, completing the scan in just over 24 hours at a cost of $770.

• The researcher responsibly disclosed the findings to affected parties using automated notifications and collected multiple bug bounties totaling $9,000 in the process.

• Although many organizations revoked their exposed secrets, some secrets remain exposed, underscoring ongoing risks in secrets management.

• Historical data from the scan indicates that most leaked secrets date from after 2018, though some date back to 2009 and are still valid.

• This incident highlights the critical need for robust secrets management practices and proactive security measures in software development environments.
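
The cost and duration figures above can be turned into rough per-repository numbers. The sketch below is only back-of-the-envelope arithmetic on the figures reported in the article. It treats "just over 24 hours" as exactly 24 hours and assumes that "concurrency set to 1000" means about 1,000 scans running in parallel, which the article does not state explicitly. All function names are hypothetical.

```python
# Figures reported in the article.
REPOSITORIES = 5_600_000
VERIFIED_SECRETS = 17_430
TOTAL_COST_USD = 770
SCAN_HOURS = 24        # assumption: "just over 24 hours" rounded down
PARALLEL_SCANS = 1000  # assumption: concurrency means parallel scans


def cost_per_thousand_repos() -> float:
    """Average spend in USD for every 1,000 repositories scanned."""
    return TOTAL_COST_USD / REPOSITORIES * 1000


def repos_per_second() -> float:
    """Average overall throughput across the whole scan."""
    return REPOSITORIES / (SCAN_HOURS * 3600)


def seconds_per_scan() -> float:
    """Average wall-clock time per scan if PARALLEL_SCANS run at once."""
    return PARALLEL_SCANS / repos_per_second()


def secrets_per_thousand_repos() -> float:
    """Verified secrets found per 1,000 repositories."""
    return VERIFIED_SECRETS / REPOSITORIES * 1000


if __name__ == "__main__":
    print(f"Cost per 1,000 repositories: ${cost_per_thousand_repos():.4f}")
    print(f"Repositories per second: {repos_per_second():.1f}")
    print(f"Seconds per scan: {seconds_per_scan():.1f}")
    print(f"Secrets per 1,000 repositories: {secrets_per_thousand_repos():.2f}")
```

Under these assumptions the scan cost roughly 14 cents per 1,000 repositories and averaged about 65 repositories per second overall.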