Trust Boundary Violation in Claude Chrome Extension | Image: LayerX
Artificial intelligence assistants are rapidly becoming integrated into our daily workflows, but what happens when a trusted AI starts taking orders from a malicious bystander?
A new intelligence report authored by Aviad Gispan, Senior Researcher at LayerX, has exposed a critical vulnerability in Anthropic’s “Claude in Chrome” browser extension. The flaw reveals a reality about modern browser security: an attacker does not need high-level system permissions to do massive damage if they can simply trick your AI into doing the dirty work for them.
The vulnerability lies in how the Claude extension handles internal browser communications. According to LayerX researchers, the flaw “allows any extension, even one with no special permissions at all, to effectively hijack Claude’s extension by injecting it with malicious instructions, extract any information that the attacker desires, and get Claude to perform active agentic actions on their behalf”.
At the heart of the issue is a failure to establish a secure trust boundary. The report notes that “The flaw stems from an instruction in the extension’s code that allows any script running in the origin browser to communicate with Claude’s LLM, but does not verify who is running the script”. Consequently, any other extension installed on the browser can invoke a content script and secretly issue commands to the Claude extension.
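The report does not publish the vulnerable code, so the following is only a minimal sketch of the general principle it points to: a message handler should authenticate who is sending a command instead of trusting any script that runs in a given origin. The `Sender` type loosely mirrors the `id` and `origin` fields of Chrome's `runtime.MessageSender`; `OWN_EXTENSION_ID`, `AgentCommand`, `Verdict`, and `handleCommand` are hypothetical names introduced for illustration and are not taken from Claude's extension.

```typescript
// Minimal subset of the fields Chrome exposes on runtime.MessageSender.
interface Sender {
  id?: string;     // ID of the sending extension, if any
  origin?: string; // origin of the page or frame that sent the message
}

// Hypothetical shape of a command addressed to the assistant.
interface AgentCommand {
  prompt: string;
}

type Verdict =
  | { accepted: true; prompt: string }
  | { accepted: false; reason: string };

// Hypothetical placeholder; a real extension would compare against its own runtime ID.
const OWN_EXTENSION_ID = "hypothetical-own-extension-id";

function handleCommand(command: AgentCommand, sender: Sender): Verdict {
  // Default deny: a familiar page origin alone does not prove who wrote the
  // script, because other extensions can run scripts inside that origin.
  if (sender.id !== OWN_EXTENSION_ID) {
    return { accepted: false, reason: "sender is not this extension" };
  }
  return { accepted: true, prompt: command.prompt };
}

// A command from the extension's own UI versus one from a script in a page.
const fromOwnUi = handleCommand(
  { prompt: "Summarize this page" },
  { id: OWN_EXTENSION_ID }
);
const fromPageScript = handleCommand(
  { prompt: "Share the file" },
  { origin: "https://example.com" }
);

console.log(fromOwnUi.accepted, fromPageScript.accepted);
```

In this sketch the second command is rejected even though it carries an origin, since an origin identifies where a script runs but not which party injected it. A production fix would likely need more than an ID comparison, but the default-deny shape is the point.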
Once the communication channel is hijacked, the AI assistant essentially becomes a remote-controlled insider threat. LayerX researchers successfully demonstrated several devastating proof-of-concept attacks, including:
- Extracting a file from a Google Drive folder and sharing it with an outsider
- Sending an email on behalf of the remote attacker
- Stealing source code from a private repository on GitHub
- Summarizing the last five emails, sending them to an external user, and deleting the sent email
The most striking aspect of this attack is how researchers bypassed Claude’s built-in policy enforcement. Claude is programmed to refuse certain actions, such as sharing organization-owned Google Drive files externally.
However, attackers do not need to rewrite the AI’s logic; they just need to change what it sees. The report outlines that “Claude’s decision-making relies heavily on: DOM structure, Visible text, UI semantics, Screenshot interpretation”. Because these inputs are entirely attacker-controlled within the web page, malicious scripts can dynamically modify the User Interface (UI) to trick the AI.
By removing sensitive indicators like “private” or “password”, and renaming UI labels (for example, changing a “Share” button to read “Request feedback”), the attacker can instruct Claude to “Click the ‘Request feedback’ button”. From the AI’s perspective, this is a completely benign action, but it actually triggers external file sharing. “This bypasses policy enforcement by attacking perception rather than logic,” the researchers explained.
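The relabeling trick can be illustrated with a toy model. The sketch below is not Claude's actual policy logic, which the report does not describe at code level; `Control`, `BLOCKED_TERMS`, `labelPolicyAllows`, and `effectPolicyAllows` are hypothetical names. It also assumes the true effect of a click is available to the policy, which may not hold for a real agent that only sees the page.

```typescript
// Hypothetical, simplified model of a clickable control as an agent might see it.
interface Control {
  visibleLabel: string;     // text shown in the UI; a page script can rewrite it
  underlyingAction: string; // what the click really does (assumed knowable here)
}

// Illustrative list of terms a policy might refuse to act on.
const BLOCKED_TERMS = ["share", "delete"];

function containsBlockedTerm(text: string): boolean {
  const lowered = text.toLowerCase();
  return BLOCKED_TERMS.some((term) => lowered.indexOf(term) !== -1);
}

// Policy that judges a control only by what is displayed.
function labelPolicyAllows(control: Control): boolean {
  return !containsBlockedTerm(control.visibleLabel);
}

// Policy that judges a control by its effect, ignoring the label.
function effectPolicyAllows(control: Control): boolean {
  return !containsBlockedTerm(control.underlyingAction);
}

const original: Control = {
  visibleLabel: "Share",
  underlyingAction: "share-externally",
};

// Same control after a malicious script rewrites only the visible text.
const relabeled: Control = {
  visibleLabel: "Request feedback",
  underlyingAction: "share-externally",
};

console.log(labelPolicyAllows(original));   // false: the label gives it away
console.log(labelPolicyAllows(relabeled));  // true: the policy is fooled
console.log(effectPolicyAllows(relabeled)); // false: the effect is unchanged
```

The label-based check flips from refusal to approval although the underlying action never changed, which is the sense in which the attack targets perception and leaves the decision logic untouched.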
LayerX responsibly disclosed the vulnerability to Anthropic prior to publication. According to the report, “Anthropic replied that they were already aware of the issue and that it would be fixed in the next version of the extension.”
However, LayerX says the remediation fell short. The report explicitly warns that “Anthropic issued only a partial fix, which did not address the root cause of the flaw, and the vulnerability can still be exploited.”
Until a comprehensive patch is deployed that strictly authenticates the source of cross-extension messaging, security teams must treat “Claude in Chrome” as a potential vector for data exfiltration and carefully audit all extensions running within their enterprise environments.