CVE-2026-33626: High-Severity SSRF Exploited in the Wild to Hijack AI Inference Engines

Do Son April 23, 2026 3 minutes read

On April 21, 2026, a high-severity Server-Side Request Forgery (SSRF) vulnerability was disclosed in LMDeploy, a popular toolkit for serving vision-language and large language models (LLMs). Within a staggering 12 hours and 31 minutes of the advisory going live, the Sysdig Threat Research Team (TRT) observed the first active exploitation attempt against their honeypots.

The flaw, tracked as CVE-2026-33626, proves that in the age of generative AI, an advisory text alone is “enough to craft an exploit” from scratch, even without public proof-of-concept code.

LMDeploy is designed to serve advanced models that can “see”—like InternVL2 and Qwen2-VL. When a user sends a chat request with an image URL, the server fetches that image to process it. However, the vulnerable versions (v0.12.2 and earlier) lacked critical security checks.

“This code lacks a hostname resolution check, private-network blocklist, and protection for link-local addresses,” the report explains.

By providing a malicious URL instead of a real image link, an attacker can trick the server into reaching out to internal resources it should never touch.

The observed attack wasn’t a random probe; it was a surgical, scripted eight-minute session that used the image loader as a “generic HTTP SSRF primitive”. The exploitation unfolded in three distinct phases:

Phase 1: Cloud & Cache Probing. The attacker immediately targeted the AWS Instance Metadata Service (IMDS) to attempt to steal IAM credentials. They also checked for Redis on the standard port 6379, a common target for post-exploitation.
Phase 2: Egress & API Mapping. The attacker used an out-of-band (OOB) DNS callback to confirm the server could reach the external internet. They then requested /openapi.json to find hidden administrative endpoints.
Phase 3: Administrative Sabotage. In a sophisticated move, the attacker probed a “distributed-serving kill-switch” at /distserve/p2p_drop_connect, which could be used to disrupt AI inference across a cluster.

AI infrastructure is being weaponized faster than traditional software. As the report warns, “Attackers are no longer waiting for mass-exploitation tools. The advisory text, read carefully, is enough to craft an exploit.”

Because AI nodes typically run on high-powered GPU instances with broad cloud permissions, a single successful SSRF can lead to a “complete compromise of the cloud account”.

To stay ahead of this collapsing timeline, Sysdig recommends several high-ROI (Return on Investment) security controls:

Patch Immediately: Update LMDeploy to v0.12.3 or later.
Harden IMDS: Enforce IMDSv2 and set httpTokens to required. This prevents a simple SSRF from stealing tokens since it cannot perform the required initial PUT request.
Strict Egress Controls: Restrict outbound traffic at the VPC level so inference nodes can only reach trusted storage like S3 or GCS.
Monitor Internal Probes: Audit any outbound connections from inference processes to loopback or private (RFC 1918) addresses, as these “should be zero in normal operation”.

Support Our Threat Intelligence

If you find our CVE report and cybersecurity news helpful, consider supporting our work.

Buy Me a Coffee PayPal

Written by

@DdoS · Security Researcher

Do Son

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.

Support Our Threat Intelligence

Do Son

Leave a Reply Cancel reply