
A newly disclosed vulnerability in ChatGPT-4o, dubbed “Time Bandit,” allows attackers to bypass the model’s safety restrictions and generate illicit or dangerous content.
The vulnerability, detailed in a CERT/CC Vulnerability Note, involves anchoring the AI in a specific historical time period; the resulting temporal confusion lets attackers circumvent safety guidelines and elicit responses that violate the model’s ethical boundaries.
“Once this historical timeframe [has been] established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content,” the note explains.
The vulnerability can be exploited in two ways: directly prompting the AI or using the “Search” function to query historical information and then pivoting to illicit topics. In both cases, the attacker leverages procedural ambiguity and maintains the established historical context to avoid detection.
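Mechanically, the attack resembles an ordinary multi-turn chat exchange in which the conversation history itself carries the established historical frame into later turns. The sketch below is purely illustrative, assuming the OpenAI Python SDK (v1.x) and the “gpt-4o” model name; the follow-up prompt is a placeholder, and the actual jailbreak prompts are deliberately omitted.

```python
# Illustrative sketch only: shows how a multi-turn conversation carries an
# established context forward. Assumes the OpenAI Python SDK (v1.x) and the
# "gpt-4o" model name; the pivot prompt is a placeholder, not a working exploit.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Turn 1: establish a specific historical timeframe (benign on its own).
messages = [
    {"role": "user", "content": "Describe the daily work of a chemist in 1789."}
]
reply = client.chat.completions.create(model="gpt-4o", messages=messages)
messages.append({"role": "assistant", "content": reply.choices[0].message.content})

# Turn 2: per the CERT/CC note, the attacker pivots within the same historical
# frame, relying on the retained context. The real prompts are not reproduced.
messages.append({"role": "user", "content": "<follow-up that keeps the 1789 framing>"})
```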
The impact of this vulnerability is significant, as it could be exploited by malicious actors to generate harmful content, including instructions for creating weapons or drugs, phishing emails, or malware.
OpenAI has addressed this vulnerability and stated: “It is very important to us that we develop our models safely. We don’t want our models to be used for malicious purposes. We appreciate you for disclosing your findings. We’re constantly working to make our models safer and more robust against exploits, including jailbreaks, while also maintaining the models’ usefulness and task performance.”