With downloads of open-source AI models and frameworks expected to surge in 2026, NVIDIA announced a major update to its desktop AI development platform, DGX Spark, at CES 2026. The update delivers performance gains of up to 2.5× through software optimization alone and introduces a new suite of tools and workflows tailored for agentic AI development. Notably, DGX Spark can now operate in tandem with the latest RTX 5090 GPU, significantly accelerating 3D content creation pipelines.
NVIDIA emphasized that since the launch of DGX Spark, continuous tuning in collaboration with the open-source community, along with updates to the software stack, has resulted in substantial performance improvements.
According to official benchmarks, performance on the Qwen-235B model has increased by more than 2.5× compared to the initial release, thanks to the latest version of TensorRT-LLM (TRT-LLM) and the NVFP4 quantization technique. Stable Diffusion 3.5 Large and PyTorch fine-tuning workloads have also seen gains exceeding 2×. In practice, developers get markedly faster inference and training on the same hardware.
To further lower the barrier to entry, NVIDIA has released seven new Playbooks, practical development guides covering everything from inference and fine-tuning to data science:
- Inference: Expanded support for vLLM, SGLang, and TRT-LLM, along with speculative decoding.
- Fine-Tuning: A major highlight of this release. DGX Spark now supports PyTorch fine-tuning across two linked DGX Spark systems, which is particularly valuable for memory-intensive workloads such as FLUX.1 Dreambooth LoRA and LLaMA Factory fine-tuning.
- Tools: To address CUDA programming, one of developers’ most persistent challenges, NVIDIA introduced Nsight Copilot, an AI assistant that runs offline directly on DGX Spark devices. It helps generate CUDA kernel code (such as FP4 matrix multiplication) while ensuring sensitive data never leaves the device.
DGX Spark is no longer just a standalone workstation; it can now also function as a powerful external accelerator:
- MacBook Pro Acceleration: Connected over a local network, DGX Spark can boost AI video generation on MacBook Pro systems (M4 Max and above) by up to 8×. In ComfyUI, generating 4K video with the FLUX.1 and Wan 2.2 models previously took eight minutes; it now takes just one.
- RTX 5090 Collaboration: For game mod creators, NVIDIA demonstrated a collaborative workflow combining RTX 5090 and DGX Spark. Creators can handle mod editing on the RTX 5090 while offloading time-consuming texture generation tasks to DGX Spark, enabling a seamless and uninterrupted RTX Remix creative experience.
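
For readers unfamiliar with the 4-bit formats mentioned above (NVFP4 in the benchmarks, FP4 matrix multiplication in the Nsight Copilot example), the general idea of block-scaled 4-bit floating-point quantization can be sketched in a few lines of plain Python. This is a toy illustration, not NVIDIA's implementation: the names `E2M1_GRID`, `BLOCK_SIZE`, `quantize`, and `dequantize` are hypothetical, the block size of 16 is an assumption of the sketch, and the per-block scale is kept as an ordinary Python float rather than a compact hardware format.

```python
# Toy illustration of block-scaled 4-bit floating-point quantization.
# Assumption: this is a simplified sketch, not NVIDIA's NVFP4 implementation.

# Magnitudes representable by a 4-bit E2M1 float
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed number of values sharing one scale in this sketch


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from E2M1_GRID."""
    peak = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = peak / E2M1_GRID[-1] if peak > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def quantize(values):
    """Split values into blocks and quantize each block independently."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]


def dequantize(blocks):
    """Rebuild approximate values from (scale, codes) pairs."""
    out = []
    for scale, codes in blocks:
        out.extend(scale * c for c in codes)
    return out


if __name__ == "__main__":
    weights = [0.01 * i - 0.15 for i in range(32)]
    restored = dequantize(quantize(weights))
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"max abs error: {worst:.4f}")
```

Because every block carries its own scale, a few large values in one block do not wipe out the precision of small values elsewhere. That is broadly why 4-bit formats can shrink memory use and speed up inference while keeping accuracy loss manageable.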
In the realm of Physical AI, NVIDIA also announced a partnership with Hugging Face to power the open-source robot Reachy Mini using DGX Spark. Developers can leverage DGX Spark’s computational capabilities to build AI agents and directly control this compact robot, designed specifically for human–machine interaction experiments.
In addition, the NVIDIA AI Enterprise software suite will officially add support for DGX Spark by the end of January. This unlocks a wide range of edge-AI scenarios, including quality inspection in smart manufacturing, loss-prevention analytics in retail, and real-time analysis at the point of care in healthcare settings.
In my view, as AI models continue to grow larger, the rising costs and privacy concerns of cloud-based inference are becoming increasingly apparent. This DGX Spark update clearly reinforces NVIDIA’s strategic moat in on-premises AI development.
By enabling dual-system fine-tuning and on-device execution of Nsight Copilot, NVIDIA directly addresses enterprises’ core concern: keeping data in-house. Meanwhile, cross-platform collaboration with RTX 5090 GPUs and MacBooks elevates DGX Spark from a pure compute platform into an indispensable “AI power add-on” on the desks of creators and developers alike. With OEM partners such as ASUS, Dell, HP, and Lenovo joining the ecosystem to launch DGX Spark–based designs, 2026 is poised to deliver a wave of desktop-class AI solutions built on this architecture.