Critical Triton Flaws (CVSS 9.8) Expose AI Servers to Remote Takeover

NVIDIA has released urgent software updates to address a set of critical vulnerabilities in Triton Inference Server, its widely used open-source AI serving platform. The flaws, reported in collaboration with Wiz Research, expose systems to remote code execution (RCE), data tampering, and denial-of-service (DoS) attacks, all without requiring user interaction or authentication.

At the heart of the discovery lies a three-stage vulnerability chain, assigned CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, that allows remote attackers to seize control of Triton servers by exploiting a misconfigured shared memory system in the Python backend.

In their technical breakdown, Wiz Research explains how the exploit chain begins with a seemingly minor error-handling flaw in the Python backend. By submitting a malformed request, attackers can provoke an error message that leaks the unique name of the backend’s internal shared memory region:

“The returned error message appears as follows: {"error": "Failed to increase the shared memory pool size for key 'triton_python_backend_shm_region_4f50c226-b3d0-46e8-ac59-d4690b28b859'…"}”
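For defenders, one way to spot this kind of leak is to scan outbound error responses or logs for the internal region name. The following is a minimal sketch, not part of Triton or of the Wiz tooling: the prefix comes from the quoted error message, while the UUID-shaped suffix pattern and the function name `find_leaked_shm_keys` are assumptions based on that single example.

```python
import re
from typing import List

# Prefix taken from the quoted error message. The UUID-shaped suffix is an
# assumption inferred from that one example, not a documented format.
SHM_KEY_PATTERN = re.compile(
    r"triton_python_backend_shm_region_"
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def find_leaked_shm_keys(response_body: str) -> List[str]:
    """Return every internal shared memory region name found in the text."""
    return SHM_KEY_PATTERN.findall(response_body)


# The error text quoted in the article, with its elision left as-is.
sample = (
    '{"error": "Failed to increase the shared memory pool size for key '
    "'triton_python_backend_shm_region_4f50c226-b3d0-46e8-ac59-d4690b28b859'"
    '…"}'
)

leaked = find_leaked_shm_keys(sample)
for key in leaked:
    print("internal region name exposed:", key)
```

A check like this could run over proxy logs or response bodies; any hit means an internal identifier has reached a client.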

This leak sets the stage for a deeper compromise. Using Triton’s legitimate shared memory APIs — intended to optimize inference performance — attackers can register and manipulate the backend’s private memory space, bypassing internal isolation mechanisms.

“This provides the attacker with powerful read and write primitives into the Python backend’s private memory… performed through standard, legitimate API calls,” Wiz Research explains.
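Until a server is patched, a fronting proxy could refuse shared memory registrations that target the backend's internal region. The sketch below is an assumption-heavy illustration rather than NVIDIA's fix: it assumes Triton's HTTP shared memory extension is reachable at `/v2/systemsharedmemory/region/<name>/register` with a JSON body carrying a `key` field, and the helper name `is_suspicious_registration` is hypothetical.

```python
import json
import re

# Internal key prefix as it appears in the leaked error message.
INTERNAL_KEY_PREFIX = "triton_python_backend_shm_region_"

# Assumed path of the system shared memory registration endpoint.
REGISTER_PATH = re.compile(r"^/v2/systemsharedmemory/region/[^/]+/register$")


def is_suspicious_registration(path: str, body: bytes) -> bool:
    """Return True if a request tries to register the backend's private region.

    Hypothetical proxy-side check: only registration requests are inspected,
    and anything that cannot be parsed is rejected (fail closed).
    """
    if not REGISTER_PATH.match(path):
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return True  # malformed body on a registration endpoint
    if not isinstance(payload, dict):
        return True
    # POSIX shared memory keys are often written with a leading slash.
    key = str(payload.get("key", "")).lstrip("/")
    return key.startswith(INTERNAL_KEY_PREFIX)


blocked = is_suspicious_registration(
    "/v2/systemsharedmemory/region/attacker/register",
    b'{"key": "triton_python_backend_shm_region_abc", "offset": 0, "byte_size": 64}',
)
allowed = is_suspicious_registration(
    "/v2/systemsharedmemory/region/client0/register",
    b'{"key": "/client_input_region", "offset": 0, "byte_size": 64}',
)
print(blocked, allowed)
```

A filter like this only narrows one step of the chain. Restricting network access to the shared memory endpoints and upgrading remain the more reliable measures.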

The final stage involves exploiting this memory access to execute arbitrary code. By corrupting internal data structures, injecting malicious messages into the Inter-Process Communication (IPC) queue, or triggering out-of-bounds memory accesses, an attacker could achieve full remote control of the server.

NVIDIA confirmed multiple vulnerabilities in its bulletin, three of which score CVSS 9.8 (“Critical”), near the top of the 10-point scale:

CVE-2025-23310: Stack buffer overflow through crafted input, leading to RCE, DoS, info disclosure
CVE-2025-23311: Stack overflow via HTTP requests
CVE-2025-23317: Remote shell via crafted HTTP request

Each of these can be triggered remotely, requires no authentication, and poses a serious threat to any organization deploying AI/ML pipelines with Triton.
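To show where a 9.8 comes from, the sketch below applies the CVSS v3.1 base score formula to the vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H. That vector is an assumption: it is the usual one behind a 9.8 and matches the "remote, no authentication, no user interaction" description, but the article does not quote the vectors from the bulletin.

```python
import math

# CVSS v3.1 metric weights for the assumed vector
# AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
ATTACK_VECTOR_NETWORK = 0.85
ATTACK_COMPLEXITY_LOW = 0.77
PRIVILEGES_REQUIRED_NONE = 0.85
USER_INTERACTION_NONE = 0.85
IMPACT_HIGH = 0.56  # used for confidentiality, integrity and availability


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 definition."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(c, i, a, av, ac, pr, ui) -> float:
    """Base score for a vector whose Scope metric is Unchanged."""
    impact_sub_score = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * impact_sub_score
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    IMPACT_HIGH, IMPACT_HIGH, IMPACT_HIGH,
    ATTACK_VECTOR_NETWORK, ATTACK_COMPLEXITY_LOW,
    PRIVILEGES_REQUIRED_NONE, USER_INTERACTION_NONE,
)
print(score)
```

With these weights the impact term is about 5.87 and the exploitability term about 3.89, which rounds up to 9.8. That is the highest score available when Scope is Unchanged; a 10.0 requires Scope: Changed.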

NVIDIA explicitly states:

“A successful exploit of this vulnerability might lead to remote code execution, denial of service, information disclosure, and data tampering.”

Triton is frequently deployed in production AI environments, including in data centers, edge servers, and enterprise inference pipelines. An attacker exploiting these vulnerabilities could:

Steal proprietary AI models
Alter inference results, compromising trust in machine learning outputs
Exfiltrate sensitive training data
Use the compromised server as a foothold into internal networks

NVIDIA has responded swiftly, releasing patches across three Triton versions:

25.05: Fixes CVE-2025-23323 to CVE-2025-23327, and CVE-2025-23335
25.06: Fixes CVE-2025-23322 and CVE-2025-23331
25.07: Fixes CVE-2025-23310, CVE-2025-23311, CVE-2025-23317, and CVE-2025-23318

You can access the updates on the Triton Inference Server GitHub page.
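The release list above can be turned into a quick check of which of these CVEs a deployment is still exposed to. This is a rough sketch under two assumptions that the article does not state: that Triton versions follow a `YY.MM` scheme that sorts numerically, and that each release also carries the fixes of earlier ones. The function name `unpatched_cves` is hypothetical.

```python
from typing import Dict, List, Tuple

# CVE fixes per release, as listed in the article.
FIXED_IN: Dict[Tuple[int, int], List[str]] = {
    (25, 5): [f"CVE-2025-{n}" for n in range(23323, 23328)] + ["CVE-2025-23335"],
    (25, 6): ["CVE-2025-23322", "CVE-2025-23331"],
    (25, 7): [
        "CVE-2025-23310",
        "CVE-2025-23311",
        "CVE-2025-23317",
        "CVE-2025-23318",
    ],
}


def parse_version(version: str) -> Tuple[int, int]:
    """Parse a 'YY.MM' version string such as '25.06' into (25, 6)."""
    year, month = version.split(".")
    return int(year), int(month)


def unpatched_cves(version: str) -> List[str]:
    """List CVEs fixed only in releases newer than the given version.

    Assumes fixes are cumulative, so a release includes all earlier fixes.
    """
    current = parse_version(version)
    missing: List[str] = []
    for release in sorted(FIXED_IN):
        if release > current:
            missing.extend(FIXED_IN[release])
    return missing


for deployed in ("25.04", "25.06", "25.07"):
    print(deployed, unpatched_cves(deployed))
```

The mapping covers only the CVEs named in the release list, so an empty result does not by itself prove that a deployment is fully patched.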

Written by

@DdoS · Security Researcher

Do Son

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.
