
NVIDIA and Google Cloud Expand AI Factory Stack With Rubin Systems and Confidential Blackwell Compute

The partners outlined a deeper infrastructure roadmap for agentic AI and industrial workloads at Cloud Next 2026.

NVIDIA and Google Cloud push the next phase of AI infrastructure

NVIDIA and Google Cloud used this week’s Cloud Next announcements to show how quickly enterprise AI infrastructure is shifting from generic cloud capacity to purpose-built “AI factory” design. In their latest update, the companies described new architecture choices aimed at making both agentic AI software and physical AI systems more practical to run at production scale.

The headline announcement is Google Cloud’s upcoming A5X instance family based on NVIDIA Vera Rubin systems. According to the companies, the new instances are designed for significantly better inference economics and higher throughput per unit of power than the prior generation. That matters because AI spending is increasingly constrained not just by chip availability, but by power budgets, networking limits, and operational complexity.

The partnership update also highlighted how far the stack has moved beyond a single model-hosting layer. Google discussed larger cluster scaling targets, while NVIDIA emphasized integrated networking and software optimization. Together, that points to a market where competitive advantage comes from end-to-end systems engineering rather than isolated hardware specifications.

A second key theme is secure deployment flexibility. The companies previewed support for Gemini deployments in Google Distributed Cloud environments running on NVIDIA Blackwell and Blackwell Ultra GPUs, alongside confidential VM options for Blackwell. For regulated industries, this pairing of high-performance AI with stricter security boundaries could make advanced model usage easier to approve internally.

The companies also connected the infrastructure side to agent development workflows through the Gemini Enterprise Agent Platform, NVIDIA Nemotron open models, and the NeMo framework. In practical terms, this reduces handoff friction between teams building model-driven workflows and teams responsible for production operations.

Why it matters

This isn’t just another chip-cycle headline. It signals that cloud AI competition is being won at the full-stack level: compute, networking, security controls, and deployment tooling all at once. For enterprises, the takeaway is clear: infrastructure choices made in 2026 will directly shape both AI cost structure and speed of execution for the next several years.

Source: NVIDIA Newsroom
