NVIDIA has issued a security bulletin regarding its Triton Inference Server, a cornerstone tool used by MLOps teams globally to deploy AI models at scale. The company has identified two high-severity vulnerabilities that could allow attackers to crash servers, effectively halting AI inference services.
The bulletin details two distinct issues (CVE-2025-33211 and CVE-2025-33201), both affecting the Linux version of the server. While the mechanics differ, the outcome is the same: a potential shutdown of service.
- CVE-2025-33211: The first flaw involves how the server processes specific input quantities. According to the security bulletin, the server “contains a vulnerability where an attacker may cause an improper validation of specified quantity in input.” Without proper validation, malicious inputs can destabilize the system.
- CVE-2025-33201: The second issue relates to how the server handles unexpected data sizes. The bulletin notes that the software suffers from an “improper check for unusual or exceptional conditions issue by sending extra large payloads.” Essentially, an attacker can overwhelm the system by sending more data than the server is designed to safely handle in a single request.
For both vulnerabilities, the consequences are clear: “A successful exploit of this vulnerability may lead to denial of service.”
If you are running NVIDIA Triton Inference Server on Linux, check your version immediately. NVIDIA has patched both flaws in its latest release; administrators are advised to update their containers and binaries to r25.10.
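For fleets with many deployments, a quick triage script can flag instances still running a vulnerable release. The sketch below is a minimal, hypothetical helper (not an NVIDIA tool) that compares Triton release tags of the `rYY.MM` form against the patched r25.10 release:

```python
def is_patched(release: str, fixed: str = "r25.10") -> bool:
    """Return True if a Triton release tag (e.g. 'r25.10') is at or
    beyond the patched release. Assumes the rYY.MM tag convention;
    adapt if your deployments use container image tags instead."""
    def key(tag: str) -> tuple[int, int]:
        year, month = tag.lstrip("r").split(".")
        return (int(year), int(month))
    return key(release) >= key(fixed)

# Example: flag deployments that still need the update.
deployments = {"prod-a": "r25.09", "prod-b": "r25.10"}
needs_update = [name for name, rel in deployments.items()
                if not is_patched(rel)]
print(needs_update)
```

How you obtain each deployment's release tag depends on your setup (container image tags, `tritonserver --version` output, or the server metadata endpoint); the comparison logic stays the same.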