NVIDIA has released a series of critical software updates to address high-severity vulnerabilities across its core AI and machine learning frameworks. The patches cover Megatron-LM, Triton Inference Server, Model Optimizer, and the NeMo Framework, many of which were susceptible to Remote Code Execution (RCE) and significant Denial of Service (DoS) attacks.
As AI models become central to enterprise operations, these vulnerabilities highlight a growing attack surface in the tools used to build and deploy them.
NVIDIA’s large-scale language model framework, Megatron-LM, and its conversational AI toolkit, NeMo Framework, are both facing serious deserialization risks. Researchers identified multiple flaws (including CVE-2025-33247 and CVE-2026-24157) where attackers could exploit how these systems load configuration data and model checkpoints.
The security bulletins classify most of these flaws as CWE-502 (Deserialization of Untrusted Data):
- Megatron-LM: “NVIDIA Megatron LM contains a vulnerability in quantization configuration loading, which could allow remote code execution”.
- NeMo Framework: “NVIDIA NeMo Framework contains a vulnerability in checkpoint loading where an attacker could cause remote code execution”.
If successfully exploited, these vulnerabilities “might lead to code execution, escalation of privileges, information disclosure, and data tampering”.
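To illustrate why CWE-502 is so dangerous in checkpoint and configuration loaders, the sketch below shows one common defensive pattern in Python: an unpickler that only resolves names on an explicit allow-list. This is not NVIDIA's fix and not code from Megatron-LM or NeMo; the names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `safe_loads` are hypothetical, and the allow-list contents are an assumption chosen for the example.

```python
import io
import pickle

# Hypothetical allow-list: the only (module, name) pairs the loader may resolve.
# A real loader would list exactly the types its own files are expected to contain.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        # pickle calls find_class for every global a payload references,
        # including callables an attacker smuggles in via __reduce__.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize bytes, rejecting payloads that reference unexpected globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers such as dicts, lists, and numbers still load, because they do not require any global lookup, while a payload that tries to reach a callable like `eval` or `os.system` fails with `pickle.UnpicklingError` before anything runs. An allow-list narrows the attack surface but does not make pickle safe for hostile input in general; data-only formats remain the more conservative choice for untrusted files.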
The Triton Inference Server, used for deploying AI models in production, was found to be vulnerable to three distinct High-severity issues that could be triggered over the network without authentication (CVSS 7.5).
Attackers could target the server’s HTTP endpoints to disrupt services:
- State Corruption: “NVIDIA Triton Inference Server contains a vulnerability where an attacker could cause internal state corruption”.
- Resource Exhaustion: One specific flaw (CVE-2026-24158) allows a DoS attack “by providing a large compressed payload”.
- SageMaker Vulnerability: The SageMaker HTTP server variant “contains a vulnerability where an attacker could cause an exception,” leading to an immediate denial of service.
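The "large compressed payload" issue belongs to a well-known class of attack in which a tiny request body expands to an enormous size in memory. The sketch below shows a general mitigation in Python: capping the decompressed size instead of trusting the compressed length. It does not reflect how Triton handles requests internally; `MAX_DECOMPRESSED_BYTES`, `PayloadTooLarge`, and `bounded_decompress` are hypothetical names, and the 1 MiB cap is an arbitrary assumption.

```python
import zlib

# Hypothetical cap on decompressed size; tune to the largest legitimate request.
MAX_DECOMPRESSED_BYTES = 1024 * 1024  # 1 MiB


class PayloadTooLarge(ValueError):
    """Raised when a body would expand beyond the configured limit."""


def bounded_decompress(data: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Decompress a zlib- or gzip-wrapped body, refusing to exceed `limit` bytes."""
    # MAX_WBITS | 32 lets zlib auto-detect either a zlib or a gzip header.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 32)
    # Request at most limit + 1 bytes: the extra byte is enough to detect overflow
    # without ever materializing the full expanded payload.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise PayloadTooLarge(f"decompressed body exceeds {limit} bytes")
    if not decomp.eof:
        raise ValueError("compressed stream is truncated")
    return out
```

Because the limit is enforced during decompression, a few kilobytes of input that would expand to gigabytes is rejected after at most `limit + 1` bytes of output have been produced.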
The NVIDIA Model Optimizer for Windows and Linux was patched for a critical flaw in its ONNX quantization feature (CVE-2026-24141). Like its counterparts, this vulnerability stems from unsafe data handling.
According to the bulletin: “A user could cause unsafe deserialization by providing a specially crafted input file.”
NVIDIA has released official updates for all affected products. Administrators are urged to upgrade immediately to the patched versions listed in NVIDIA's security bulletins.