As AI models swell to ever-greater scales, governments and large enterprises are placing unprecedented emphasis on data sovereignty and regulatory compliance. At re:Invent 2025, AWS unveiled its new AI Factories service, an offering that deploys AWS’s full AI infrastructure, including the latest NVIDIA accelerated computing platforms and AWS’s own Trainium chips, directly into customers’ on-premises data centers. This lets them rapidly stand up high-performance, compliant, and sovereignty-aligned AI compute environments.
AWS noted that for regulated industries and the public sector, building vast AI infrastructure in-house demands enormous capital expenditure and lengthy procurement cycles. The core philosophy behind AI Factories is to move AWS’s complete AI stack—spanning high-speed networking, storage, security, and services such as Bedrock and SageMaker—straight into the customer’s facility, operated entirely by AWS.
The result functions effectively as a Private AWS Region: customers leverage their existing power and space while gaining access to AWS-managed services and model catalogs, eliminating the need to negotiate licensing with multiple vendors. This dramatically shortens deployment timelines and ensures adherence to data-localization requirements.
On the hardware side, AWS is deepening its partnership with NVIDIA. AI Factories will integrate NVIDIA’s full-stack AI software and accelerated computing platforms, including support for the latest NVIDIA Grace Blackwell architecture and the forthcoming NVIDIA Vera Rubin platform.
Furthermore, AWS confirmed that its next-generation Trainium4 chips will support NVIDIA NVLink Fusion, meaning that—following Qualcomm, MediaTek, and Intel—AWS will also join the NVLink Fusion ecosystem. This enables tighter interoperability between Trainium, Graviton processors, and NVIDIA GPUs within shared architectures, granting customers far greater flexibility in designing heterogeneous AI acceleration systems.
Ian Buck, NVIDIA’s Vice President of Hyperscale and HPC, emphasized that large-scale AI compute demands an “end-to-end approach,” and that this collaboration allows AWS to deliver immense computational power directly into customer environments, freeing organizations to focus on innovation rather than integration.
Alongside the AI Factories launch, AWS introduced Amazon EC2 P6e-GB300 UltraServers built on the NVIDIA GB300 NVL72 system. They are purpose-built for massive inference workloads and can serve trillion-parameter reasoning models in production environments. Powered by the AWS Nitro System, these instances integrate seamlessly with Amazon EKS and other AWS services.
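The announcement does not spell out how the new UltraServers surface to customers, but as a minimal sketch, assuming the standard boto3 EC2 API, the snippet below checks the GPU configuration of a Blackwell-class instance type. The type name `p6-b200.48xlarge` (from the existing P6-B200 generation) stands in here; the exact P6e-GB300 UltraServer type name is not given in the announcement and should be taken from AWS documentation.

```python
# Minimal sketch: query GPU details for a Blackwell-class EC2 instance type.
# "p6-b200.48xlarge" is the published P6-B200 size and is used only as a
# stand-in; substitute the actual P6e-GB300 UltraServer type once documented.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(InstanceTypes=["p6-b200.48xlarge"])
for itype in resp["InstanceTypes"]:
    gpus = itype.get("GpuInfo", {}).get("Gpus", [])
    print(itype["InstanceType"],
          [(g["Name"], g["Manufacturer"], g["Count"]) for g in gpus])
```

Note that `describe_instance_types` raises an error for type names that do not exist in the chosen Region, so this also serves as a quick availability check.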
The first deployment of P6e-GB300 UltraServers will serve HUMAIN, backed by Saudi Arabia’s Public Investment Fund, which plans to establish the country’s first “AI Zone.” The zone will deploy up to 150,000 AI chips, including NVIDIA GB300 GPUs, all powered by AWS’s AI Factories infrastructure to support rapidly growing regional and global AI compute demands.
Beyond its work with AWS and NVIDIA, HUMAIN has also partnered with AMD and Qualcomm for AI chip supply. At Snapdragon Summit 2025 it announced an AI PC developed with Qualcomm, intended to advance user-centric agentic AI, and it is leveraging Qualcomm’s AI acceleration platform to deploy large-scale inference infrastructure across Saudi Arabia.
Alongside the new P6e-GB300 UltraServers, AWS will continue offering P6e-GB200 UltraServers built on the GB200 NVL72 system, as well as the existing Amazon EC2 P6 instances powered by the B300 and B200 platforms.