As organizations race to integrate autonomous systems into their workflows, a new and subtle threat is emerging from the very tools designed to help them. Recent research from Palo Alto Networks’ Unit 42 has pulled back the curtain on how AI agents—specifically those deployed on Google Cloud Platform’s (GCP) Vertex AI—can be transformed into “double agents” that secretly work against their creators.
AI agents are increasingly powerful, capable of making independent decisions and interacting with various enterprise services. However, this independence makes them a high-value target for attackers. A compromised agent doesn’t just fail; it can become a persistent, trusted insider.
Unit 42’s investigation revealed a critical vulnerability in how these agents are permissioned by default. As the report warns:
“A misconfigured or compromised agent can become a ‘double agent’ that appears to serve its intended purpose, while secretly exfiltrating sensitive data, compromising infrastructure, and creating backdoors into an organization’s most critical systems.”
The core of the issue lies in Per-Project, Per-Product Service Agents (P4SA)—Google-managed accounts that allow services to access resources. Unit 42 discovered that these service agents are often granted excessive permissions by default.
By deploying a malicious agent built with Google’s Agent Development Kit (ADK), researchers were able to:
- Extract Credentials: The agent was used to pull service-agent credentials from Google’s internal metadata service.
- Exfiltrate Data: Using these stolen credentials, the researchers “effectively broke isolation” and gained unrestricted read access to all Google Cloud Storage Buckets within the consumer project.
- Access Permissions: The compromised agent could list and get both storage buckets and objects, a level of access described as a “significant security risk.”
Perhaps most alarming was the agent’s ability to reach beyond the customer’s project and into Google’s own infrastructure—the “producer” environment. The stolen credentials granted access to restricted, Google-owned Artifact Registry repositories.
Researchers successfully downloaded private container images that form the core of the Vertex AI Reasoning Engine. Unit 42 notes the gravity of this exposure:
“Gaining access to this proprietary code not only exposes Google’s intellectual property, but also provides an attacker with a blueprint to find further vulnerabilities.”
The report also identified structural weaknesses in the deployment process. Researchers found that AI agents are often packaged as Python pickle files, a serialization format that can run arbitrary code the moment a file is loaded. An attacker who manipulates these files could achieve Remote Code Execution (RCE), creating a “persistent and powerful backdoor.”
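To illustrate why pickle is risky, here is a minimal sketch using only Python's standard library. It is not taken from the Unit 42 report. The `Payload` class, `make_demo_blob`, and `restricted_loads` are hypothetical names, and the payload is deliberately benign: it calls `os.getcwd` where a real attacker would pick something destructive. The restricted loader follows the "restricting globals" pattern described in the Python `pickle` documentation.

```python
import io
import os
import pickle


class Payload:
    """Hypothetical stand-in for a tampered agent artifact."""

    def __reduce__(self):
        # pickle stores "call os.getcwd()" instead of the object's state.
        # Whoever writes the file chooses the callable; this one is harmless.
        return (os.getcwd, ())


def make_demo_blob():
    """Serialize the benign payload to bytes."""
    return pickle.dumps(Payload())


# Globals a loader is willing to resolve. Plain dicts, lists, strings and
# numbers need no globals at all, so this can stay very small.
ALLOWED_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but refuses globals outside ALLOWED_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Calling `pickle.loads(make_demo_blob())` returns the current working directory rather than a `Payload` object, which shows that loading the file executed a function selected by the file's author. `restricted_loads` rejects the same bytes with `pickle.UnpicklingError`. An allowlist like this narrows the risk but does not make untrusted pickles safe, so verifying where an artifact came from still matters.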
Furthermore, the default OAuth 2.0 scopes assigned to the Agent Engine were found to be “far too permissive,” potentially extending an attacker’s reach into an organization’s Google Workspace data, including Gmail, Drive, and Calendar.
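One way to reason about scope creep is to compare the scopes a token actually carries with the scopes the agent needs. The sketch below is an illustration only: the article does not list the exact default scopes, so the example values are assumptions, and `parse_scope_field` and `excess_scopes` are hypothetical helper names. It relies on the OAuth 2.0 convention that the `scope` field is a space-delimited string.

```python
def parse_scope_field(scope_field):
    """Split a space-delimited OAuth 2.0 scope string into a set of scopes."""
    return set(scope_field.split())


def excess_scopes(granted, required):
    """Return the granted scopes that the agent does not need, sorted."""
    return sorted(set(granted) - set(required))


# Assumed example values, not the real Agent Engine defaults.
EXAMPLE_GRANTED = (
    "https://www.googleapis.com/auth/cloud-platform "
    "https://www.googleapis.com/auth/drive "
    "https://mail.google.com/"
)
EXAMPLE_REQUIRED = {"https://www.googleapis.com/auth/cloud-platform"}
```

With these example values, `excess_scopes(parse_scope_field(EXAMPLE_GRANTED), EXAMPLE_REQUIRED)` reports the Drive and Gmail scopes as unnecessary. That is the kind of gap the researchers flagged.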
In response to these findings, Google has collaborated with Unit 42 to update its documentation so that it is more transparent about how Vertex AI uses resources. The primary recommendation for organizations is to “Bring Your Own Service Account” (BYOSA), which allows granular control over what an agent can access and makes it possible to enforce the principle of least privilege.
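A least-privilege review of a service account can start by checking which broad roles it holds. The following is a rough sketch, assuming an IAM policy already exported as a dictionary with the usual `bindings` list of `role` and `members` entries. The `broad_bindings` helper, the choice of roles treated as "broad", and the service account address are all hypothetical.

```python
# Roles treated as overly broad in this sketch; adjust to your own policy.
BROAD_ROLES = {"roles/owner", "roles/editor", "roles/storage.admin"}


def broad_bindings(policy, member, broad_roles=BROAD_ROLES):
    """Return the broad roles that `member` holds in an IAM policy dict."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
        and binding["role"] in broad_roles
    )


# Hypothetical service account and policy, for illustration only.
AGENT_SA = "serviceAccount:agent-runtime@example-project.iam.gserviceaccount.com"
EXAMPLE_POLICY = {
    "bindings": [
        {"role": "roles/storage.admin", "members": [AGENT_SA]},
        {"role": "roles/storage.objectViewer", "members": [AGENT_SA]},
        {"role": "roles/editor", "members": ["user:someone@example.com"]},
    ]
}
```

Here `broad_bindings(EXAMPLE_POLICY, AGENT_SA)` returns `["roles/storage.admin"]`, marking the one binding that a dedicated, narrowly scoped service account would not need.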
Unit 42 concludes that AI security must now be treated with the same rigor as production code. Without validating permission boundaries and reviewing source integrity, organizations may be “inviting a new generation of double agents into the very heart” of their digital infrastructure.