NVIDIA Patches Critical RCE and DoS Flaws Across ML Frameworks

Do Son · March 26, 2026

NVIDIA has released a series of critical software updates to address high-severity vulnerabilities across its core AI and machine learning frameworks. The patches cover Megatron-LM, Triton Inference Server, Model Optimizer, and the NeMo Framework, many of which were susceptible to Remote Code Execution (RCE) and significant Denial of Service (DoS) attacks.

As AI models become central to enterprise operations, these vulnerabilities highlight a growing attack surface in the tools used to build and deploy them.

NVIDIA’s large-scale language model framework, Megatron-LM, and its conversational AI toolkit, NeMo Framework, are both facing serious deserialization risks. Researchers identified multiple flaws (including CVE-2025-33247 and CVE-2026-24157) where attackers could exploit how these systems load configuration data and model checkpoints.

The security bulletins classify these flaws primarily as CWE-502 (Deserialization of Untrusted Data):

Megatron-LM: “NVIDIA Megatron LM contains a vulnerability in quantization configuration loading, which could allow remote code execution”.
NeMo Framework: “NVIDIA NeMo Framework contains a vulnerability in checkpoint loading where an attacker could cause remote code execution”.

If successfully exploited, these vulnerabilities “might lead to code execution, escalation of privileges, information disclosure, and data tampering”.
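The bulletins quoted above do not publish the vulnerable code, so the following is only a generic illustration of CWE-502 in Python, not NVIDIA's fix. It assumes a loader built on the standard `pickle` module (an assumption; the source does not name the serialization format) and sketches the allowlisting approach described in the Python documentation: override `Unpickler.find_class` so that only approved globals can be resolved. The names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, `safe_loads`, and `MaliciousPayload` are hypothetical.

```python
import io
import os
import pickle

# Hypothetical allowlist: the only (module, name) pairs a payload may reference.
ALLOWED_GLOBALS = {
    ("builtins", "dict"),
    ("builtins", "list"),
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allowlist."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize bytes, rejecting payloads that reference unapproved globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class MaliciousPayload:
    """Stand-in for an attacker-crafted object: unpickling it with plain
    pickle.loads would call os.system with the given command."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


if __name__ == "__main__":
    # Plain data round-trips normally.
    config = safe_loads(pickle.dumps({"bits": 8}))

    # The crafted payload is rejected before any code runs.
    try:
        safe_loads(pickle.dumps(MaliciousPayload()))
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

An allowlist narrows the risk but does not make pickle safe for hostile input in general. Where it is an option, a data-only format such as JSON avoids executing anything during loading.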

The Triton Inference Server, used for deploying AI models in production, was found to be vulnerable to three distinct High-severity issues that could be triggered over the network without authentication (CVSS 7.5).

Attackers could target the server’s HTTP endpoints to disrupt services:

State Corruption: “NVIDIA Triton Inference Server contains a vulnerability where an attacker could cause internal state corruption”.
Resource Exhaustion: One specific flaw (CVE-2026-24158) allows a DoS attack “by providing a large compressed payload”.
SageMaker Vulnerability: The SageMaker HTTP server variant “contains a vulnerability where an attacker could cause an exception,” leading to an immediate denial of service.
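The "large compressed payload" issue belongs to a well-known class of problem: a small request body that expands to an enormous size once decompressed. As a generic sketch of one common defense, and not a description of Triton's internals or of NVIDIA's patch, the snippet below caps decompressed output using the `max_length` argument of Python's `zlib` decompression object. The limit value and the names `MAX_DECOMPRESSED_BYTES`, `PayloadTooLarge`, and `bounded_decompress` are assumptions chosen for illustration.

```python
import zlib

# Hypothetical cap on decompressed size; tune to the service's real needs.
MAX_DECOMPRESSED_BYTES = 1024 * 1024  # 1 MiB


class PayloadTooLarge(Exception):
    """Raised when a compressed body would expand past the allowed size."""


def bounded_decompress(data: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    # wbits=47 (32 + 15) lets zlib auto-detect a zlib or gzip header.
    decompressor = zlib.decompressobj(wbits=47)

    # Ask for at most limit + 1 bytes so an oversized stream is detected
    # without ever materializing the full expansion in memory.
    out = decompressor.decompress(data, limit + 1)

    if len(out) > limit or decompressor.unconsumed_tail:
        raise PayloadTooLarge(f"decompressed size exceeds {limit} bytes")
    if not decompressor.eof:
        raise ValueError("compressed stream is truncated")
    return out


if __name__ == "__main__":
    # A normal body decompresses as usual.
    body = bounded_decompress(zlib.compress(b"hello"))

    # 10 MiB of zeros compresses to a few kilobytes and is rejected on expansion.
    bomb = zlib.compress(b"\0" * (10 * 1024 * 1024))
    try:
        bounded_decompress(bomb)
    except PayloadTooLarge as exc:
        print("rejected:", exc)
```

The key design choice is to bound the output rather than the input, since the compressed size says little about how large the expanded data will be.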

The NVIDIA Model Optimizer for Windows and Linux was patched for a critical flaw in its ONNX quantization feature (CVE-2026-24141). Like its counterparts, this vulnerability stems from unsafe data handling.

According to the bulletin: “A user could cause unsafe deserialization by providing a specially crafted input file.”

NVIDIA has released official updates for all affected products. Administrators are urged to transition to the updated versions listed below immediately to protect their systems.

Affected Product	Vulnerable Versions	Updated Version
Megatron-LM	All versions prior to 0.15.3	0.15.3
Triton Inference Server	All versions prior to 26.01	26.01
Model Optimizer	All versions prior to 0.41.0	0.41.0
NeMo Framework	All versions prior to 2.6.2	2.6.2
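For administrators auditing several hosts, the table above can be turned into a simple check. The sketch below compares an installed version string against the patched version for each product. It assumes plain dotted numeric versions (pre-release suffixes such as `rc1` are not handled), and the names `PATCHED`, `parse_version`, and `is_vulnerable` are hypothetical. How the installed version is obtained is left out, since that varies by deployment.

```python
# Patched versions, copied from the table above.
PATCHED = {
    "Megatron-LM": "0.15.3",
    "Triton Inference Server": "26.01",
    "Model Optimizer": "0.41.0",
    "NeMo Framework": "2.6.2",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version like '0.15.3' into a comparable tuple."""
    return tuple(int(part) for part in version.strip().split("."))


def is_vulnerable(product: str, installed: str) -> bool:
    """Return True if `installed` is older than the patched release."""
    return parse_version(installed) < parse_version(PATCHED[product])


if __name__ == "__main__":
    # Hypothetical inventory; replace with versions gathered from real hosts.
    inventory = {
        "Megatron-LM": "0.15.2",
        "Triton Inference Server": "26.01",
    }
    for product, installed in inventory.items():
        status = "UPDATE NEEDED" if is_vulnerable(product, installed) else "ok"
        print(f"{product} {installed}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as text, "0.9.0" sorts after "0.15.3", which would wrongly mark an old release as patched.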

Written by Do Son (@DdoS), Security Researcher

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.

Tags: AI security CVE-2025-33247 CVE-2026-24157 Denial of Service Deserialization Vulnerability machine-learning Megatron-LM Model Optimizer NeMo Framework nvidia rce Triton Inference Server
