Data Exfiltration and RCE Risks Found in Azure Data Factory’s Airflow Integration
Unit 42 researchers have uncovered multiple vulnerabilities in Azure Data Factory’s managed Apache Airflow integration that could let attackers gain shadow administrator control, exfiltrate data, and execute code remotely.
Apache Airflow, integrated into Azure Data Factory, is widely used to orchestrate workflows defined as Directed Acyclic Graphs (DAGs), which are written as Python files. However, Unit 42 researchers identified vulnerabilities that could allow attackers to manipulate DAG files, bypass Kubernetes security controls, and exploit Azure’s internal systems.
The vulnerabilities include:
- Misconfigured Kubernetes RBAC: Improper role-based access control allows attackers to escalate privileges within the cluster.
- Misconfigured Geneva Service Secrets: Secrets used by Geneva, an internal Azure monitoring service, are handled in a way that exposes sensitive credentials.
- Weak Authentication for Geneva: Attackers can tamper with logs and access Azure resources through poorly secured Geneva API endpoints.
Unit 42 highlights the critical risk: “The vulnerabilities can provide attackers with shadow admin control over Azure infrastructure, which could lead to data exfiltration, malware deployment and unauthorized data access.”
The attack begins with gaining access to or tampering with DAG files. Researchers demonstrated how an attacker could craft a malicious DAG to execute a reverse shell. Once imported into the Airflow cluster, the DAG runs automatically, opening the door for further exploitation.
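Because a DAG is ordinary Python that Airflow executes on import, one defensive option is to statically review DAG files before they reach the cluster. The following is a minimal sketch of that idea, not a method described by Unit 42: the deny-lists and the `scan_dag_source` helper are hypothetical, and a scan like this would miss payloads hidden in shell command strings or obfuscated code.

```python
import ast

# Hypothetical deny-lists for illustration only; tune them for your environment.
SUSPICIOUS_MODULES = {"socket", "subprocess", "pty"}
SUSPICIOUS_CALLS = {"os.system", "os.popen", "os.dup2", "eval", "exec"}


def _dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def scan_dag_source(source):
    """Return a sorted list of (line, finding) tuples for one DAG file's source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"import {root}"))
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"import {root}"))
        elif isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"call {name}"))
    return sorted(findings)
```

A check like this could run in a review pipeline before DAG files are synced to storage. It is a coarse filter that complements, rather than replaces, strict write permissions on the DAG location.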
Unit 42 explains the attack flow:
- Step 1: Gain write access to DAG files (via leaked credentials or shared access tokens).
- Step 2: Upload a malicious DAG that establishes a reverse shell connection.
- Step 3: Exploit Kubernetes misconfigurations to escalate to cluster admin privileges.
With these privileges, attackers can:
- Deploy privileged pods to escape container isolation.
- Access sensitive secrets (e.g., PostgreSQL passwords, TLS certificates, and Azure storage keys).
- Gain root access to the host virtual machine (VM) running the Kubernetes cluster.
As Unit 42 notes: “The service account used by the pod had cluster admin permissions, giving us full control over the entire cluster. These permissions included creating pods, accessing secrets, and more.”
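The over-privileged service account described above is the kind of finding an RBAC audit can surface. As a rough sketch, assuming the input is the parsed JSON produced by `kubectl get clusterrolebindings -o json`, the hypothetical helper below lists every subject bound to the built-in `cluster-admin` ClusterRole. The function name is an illustration, not part of any Kubernetes tooling.

```python
def find_cluster_admin_subjects(bindings):
    """Return sorted (kind, namespace, name) tuples for subjects bound to cluster-admin.

    `bindings` is assumed to be the parsed JSON list of ClusterRoleBindings.
    """
    risky = set()
    for binding in bindings.get("items", []):
        role_ref = binding.get("roleRef", {})
        is_cluster_admin = (
            role_ref.get("kind") == "ClusterRole"
            and role_ref.get("name") == "cluster-admin"
        )
        if not is_cluster_admin:
            continue
        # A binding may carry no subjects at all, so guard against None.
        for subject in binding.get("subjects") or []:
            risky.add(
                (
                    subject.get("kind", ""),
                    subject.get("namespace", ""),
                    subject.get("name", ""),
                )
            )
    return sorted(risky)
```

Any `ServiceAccount` in the result that belongs to a workload pod, rather than to a cluster operator, is a candidate for a narrower role.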
One of the most alarming discoveries involves Azure’s Geneva service, a critical monitoring tool for Microsoft infrastructure. Researchers found that Geneva uses weak authentication mechanisms and shared certificates across Airflow deployments.
Exploiting Geneva could allow attackers to:
- Tamper with Logs: Attackers can generate and send crafted logs to Geneva, hiding malicious activity.
- Access Azure Resources: Geneva’s exposed API endpoints provide access to Event Hubs, storage accounts, and more.
- Exfiltrate Data: Attackers can manipulate Geneva to extract sensitive information or disrupt operations.
Unit 42 explains the impact: “These vulnerabilities could enable attackers to escape from their pods, gain unauthorized administrative control over clusters and access Azure’s internal services (Geneva).”
If exploited, these vulnerabilities could lead to devastating consequences:
- Shadow Workloads: Attackers can deploy hidden workloads, such as cryptominers or malware, without detection.
- Persistent Access: By creating new service accounts, attackers maintain long-term control over the cluster.
- Data Exfiltration: Attackers could steal credentials, modify DNS zones, and access databases connected to Airflow.
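One way to look for hidden workloads that rely on privileged pods, like the container escape described earlier, is to inspect pod manifests for settings that weaken isolation. The sketch below assumes each pod is a parsed manifest dictionary (for example, one item from `kubectl get pods -o json`). The `risky_pod_settings` helper and its list of checks are illustrative assumptions, not an exhaustive policy.

```python
def risky_pod_settings(pod):
    """Return a sorted list of isolation-weakening settings found in a Pod manifest dict."""
    spec = pod.get("spec", {})
    findings = set()
    # Host namespace sharing lets a container see or reach host-level resources.
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if spec.get(flag):
            findings.add(flag)
    # hostPath volumes mount directories from the node's filesystem.
    if any("hostPath" in volume for volume in spec.get("volumes") or []):
        findings.add("hostPath volume")
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        if (container.get("securityContext") or {}).get("privileged"):
            findings.add("privileged container")
    return sorted(findings)
```

An empty result does not prove a pod is benign. It only means none of these specific settings are present.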
Microsoft has acknowledged the reported vulnerabilities and categorized them as low severity, citing the isolation of the affected clusters. However, Unit 42 emphasizes the importance of proper configurations to prevent exploitation.
Key recommendations include:
- Restrict DAG File Access: Use strict permissions to prevent unauthorized modifications.
- Secure Kubernetes RBAC: Apply least-privilege principles to Kubernetes service accounts.
- Harden Secrets Management: Safeguard access to sensitive keys and certificates.
- Monitor Logs and Policies: Use auditing tools to detect unusual behavior within Airflow and Geneva services.
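As one illustration of the monitoring recommendation, unauthorized changes to DAG files can be detected by comparing file hashes against a trusted baseline. The sketch below assumes the DAG files are available in a local folder. The `hash_dag_folder` and `diff_manifests` helpers are hypothetical names, and where the baseline is stored is left to the reader.

```python
import hashlib
from pathlib import Path


def hash_dag_folder(folder):
    """Map each .py file under `folder` (by relative path) to its SHA-256 digest."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.py"))
    }


def diff_manifests(baseline, current):
    """Return the file names added, removed, or modified since the baseline."""
    shared = set(baseline) & set(current)
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(name for name in shared if baseline[name] != current[name]),
    }
```

A non-empty `added` or `modified` list outside a planned deployment is a signal to investigate before the scheduler picks up the change.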
The vulnerabilities, which Unit 42 has dubbed “Dirty DAG,” shed light on the risks of misconfigurations in managed cloud services like Azure Data Factory. Unit 42’s findings demonstrate how attackers can chain seemingly minor flaws to achieve full control over Kubernetes clusters and Azure resources.
As cloud environments grow in complexity, organizations must adopt a layered security approach to detect and mitigate such threats. Unit 42 concludes: “Adversaries have moved beyond basic tactics to more sophisticated service-specific attacks. It is essential to adopt a comprehensive protection strategy.”