Article Details
Scrape Timestamp (UTC): 2025-11-05 14:06:26.797
Source: https://thehackernews.com/2025/11/researchers-find-chatgpt.html
Original Article Text
Researchers Find ChatGPT Vulnerabilities That Let Attackers Trick AI Into Leaking Data

Cybersecurity researchers have disclosed a new set of vulnerabilities in OpenAI's ChatGPT artificial intelligence (AI) chatbot that could be exploited by an attacker to steal personal information from users' memories and chat histories without their knowledge. The seven vulnerabilities and attack techniques, according to Tenable, were found in OpenAI's GPT-4o and GPT-5 models. OpenAI has since addressed some of them.

These issues expose the AI system to indirect prompt injection attacks, allowing an attacker to manipulate the expected behavior of a large language model (LLM) and trick it into performing unintended or malicious actions, security researchers Moshe Bernstein and Liv Matan said in a report shared with The Hacker News. The identified shortcomings are detailed in Tenable's report.

The disclosure comes close on the heels of research demonstrating various kinds of prompt injection attacks against AI tools that are capable of bypassing safety and security guardrails. The findings show that exposing AI chatbots to external tools and systems, a key requirement for building AI agents, expands the attack surface by presenting more avenues for threat actors to conceal malicious prompts that end up being parsed by models.

"Prompt injection is a known issue with the way that LLMs work, and, unfortunately, it will probably not be fixed systematically in the near future," Tenable researchers said. "AI vendors should take care to ensure that all of their safety mechanisms (such as url_safe) are working properly to limit the potential damage caused by prompt injection."

The development comes as a group of academics from Texas A&M, the University of Texas, and Purdue University found that training AI models on "junk data" can lead to LLM "brain rot," warning that "heavily relying on Internet data leads LLM pre-training to the trap of content contamination."

Last month, a study from Anthropic, the U.K. AI Security Institute, and the Alan Turing Institute also found that it's possible to backdoor AI models of different sizes (600M, 2B, 7B, and 13B parameters) using just 250 poisoned documents, upending the previous assumption that attackers need to control a certain percentage of the training data to tamper with a model's behavior. From an attack standpoint, malicious actors could attempt to poison web content that's scraped for training LLMs, or they could create and distribute their own poisoned versions of open-source models.

"If attackers only need to inject a fixed, small number of documents rather than a percentage of training data, poisoning attacks may be more feasible than previously believed," Anthropic said. "Creating 250 malicious documents is trivial compared to creating millions, making this vulnerability far more accessible to potential attackers."

And that's not all. Separate research from Stanford University scientists found that optimizing LLMs for competitive success in sales, elections, and social media can inadvertently drive misalignment, a phenomenon referred to as Moloch's Bargain.

"In line with market incentives, this procedure produces agents achieving higher sales, larger voter shares, and greater engagement," researchers Batu El and James Zou wrote in an accompanying paper published last month.
"However, the same procedure also introduces critical safety concerns, such as deceptive product representation in sales pitches and fabricated information in social media posts, as a byproduct. Consequently, when left unchecked, market competition risks turning into a race to the bottom: the agent improves performance at the expense of safety."
Daily Brief Summary
Cybersecurity researchers have discovered seven vulnerabilities in OpenAI's ChatGPT models, GPT-4o and GPT-5, which could be exploited to extract personal data from users' memories and chat histories.
These vulnerabilities enable indirect prompt injection attacks, allowing attackers to manipulate large language models into performing unintended actions.
Some vulnerabilities have been addressed by OpenAI, but systemic fixes for prompt injection issues remain elusive, posing ongoing risks.
The research highlights the expanded attack surface when AI chatbots interact with external tools, increasing opportunities for threat actors.
Studies suggest that training AI models on "junk data" can lead to degradation, while poisoning attacks on training data are more feasible than previously assumed; a back-of-the-envelope comparison follows this summary.
The findings emphasize the need for robust safety mechanisms to prevent prompt injection and mitigate potential damage.
Concerns arise over market-driven optimization of AI models, which may compromise safety for competitive advantage, risking deceptive practices.
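As a quick illustration of why a fixed poisoning budget changes the threat model, the sketch below compares the 250 documents reported by Anthropic against a percentage-based assumption. The corpus size used is a hypothetical assumption for arithmetic only, not a figure from the study.

```python
# Back-of-the-envelope: a fixed 250-document poisoning budget versus a
# percentage-based assumption. The corpus size below is a hypothetical
# illustration, not a figure from Anthropic's study.

CORPUS_DOCS = 500_000_000      # assumed pre-training corpus size (hypothetical)
POISONED_DOCS = 250            # fixed budget reported by Anthropic

fraction = POISONED_DOCS / CORPUS_DOCS
print(f"Poisoned fraction: {fraction:.2e}")  # -> 5.00e-07, i.e. 0.00005%

# Under the old percentage assumption, controlling even 0.1% of this corpus
# would require 500,000 documents -- roughly three orders of magnitude more
# effort than the fixed budget.
required_at_0_1_percent = int(CORPUS_DOCS * 0.001)
print(f"Docs needed at 0.1%: {required_at_0_1_percent:,}")
```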