Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124


NVIDIA on Monday unveiled a desktop supercomputer powerful enough to run artificial intelligence models with up to a trillion parameters (roughly the scale of GPT-4) without touching the cloud. The machine, called the DGX Station, packs 748 gigabytes of coherent memory and 20 petaflops of compute into a box that sits next to a monitor, and may be the most important personal computing product since the original Mac Pro convinced creative professionals to abandon workstations.
The announcement, made at the company’s annual GTC conference in San Jose, comes at a time when the AI industry is grappling with a fundamental tension: the world’s most powerful models require massive data center infrastructure, but developers and the companies that build on those models increasingly want to keep their data, their agents, and their intellectual property local. The DGX Station is Nvidia’s answer: a six-figure machine that bridges the gap between the frontier of AI and a single engineer’s desk.
The DGX Station is built around the new GB300 Grace Blackwell Ultra Desktop Superchip, which fuses a 72-core Grace CPU and a Blackwell Ultra GPU via Nvidia’s NVLink-C2C interconnect. That link provides 1.8 terabytes per second of coherent bandwidth between the two processors (seven times the speed of PCIe Gen 6), meaning the CPU and GPU share a single pool of memory without the bottlenecks that typically grind desktop AI to a halt.
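As a quick sanity check on the "seven times" figure, the arithmetic works out if you assume a full PCIe Gen 6 x16 link at roughly 256 GB/s of aggregate bidirectional bandwidth (an assumption on our part, not a number from Nvidia's announcement):

```python
# Sanity-checking the "seven times PCIe" claim. Assumes a PCIe Gen 6 x16
# link at ~256 GB/s aggregate (128 GB/s per direction); the NVLink-C2C
# figure is the 1.8 TB/s cited in the announcement.

nvlink_c2c_gbps = 1800    # 1.8 TB/s coherent CPU-GPU bandwidth
pcie_gen6_x16_gbps = 256  # assumed bidirectional aggregate for an x16 link

ratio = nvlink_c2c_gbps / pcie_gen6_x16_gbps
print(f"NVLink-C2C is ~{ratio:.1f}x a PCIe Gen 6 x16 link")
```

Under those assumptions the ratio lands at almost exactly 7x, consistent with the claim.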
Twenty petaflops (20 quadrillion operations per second) would have ranked this machine among the top supercomputers in the world less than a decade ago. The Summit system at Oak Ridge National Laboratory, which was ranked number one in the world in 2018, had about ten times that performance but occupied a room the size of two basketball courts. Nvidia is packing a significant fraction of that capacity into something that plugs into a wall outlet.
The 748 GB of unified memory is arguably the most important number. Trillion-parameter models are enormous neural networks that must be fully loaded into memory to function. Without enough memory, no amount of processing speed matters: the model simply doesn’t fit. The DGX Station clears that bar, and it does so with a coherent architecture that eliminates the latency penalties of shuttling data between separate CPU and GPU memory pools.
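The back-of-the-envelope arithmetic behind that claim is simple: a model's weight footprint is its parameter count times the bytes each parameter occupies at a given numerical precision. A minimal sketch (the precision choices are illustrative, not Nvidia's stated configuration):

```python
# Rough weight-memory arithmetic: parameters x bytes-per-parameter.
# Precision options shown are common inference formats, used here
# illustratively; they are not quoted from the announcement.

def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate gigabytes needed just to hold the model weights."""
    return params_billions * bytes_per_param  # 1e9 params * bytes / 1e9 bytes-per-GB

STATION_MEMORY_GB = 748  # the coherent CPU+GPU pool cited in the article

for precision, nbytes in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    need = model_memory_gb(1000, nbytes)  # a 1-trillion-parameter model
    verdict = "fits" if need <= STATION_MEMORY_GB else "does not fit"
    print(f"1T params @ {precision}: ~{need:.0f} GB of weights -> {verdict}")
```

The exercise shows why the memory number matters more than raw flops: a trillion-parameter model only fits in 748 GB once its weights are quantized to low precision, which is exactly the regime local inference targets.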
Nvidia designed the DGX Station explicitly for what it sees as the next phase of AI: autonomous agents that continually reason, plan, write code, and execute tasks, not just systems that respond to prompts. Every major announcement at GTC 2026 reinforced this "AI agent" thesis, and the DGX Station is where those agents are meant to be built and run.
The key pairing is NemoClaw, a new open-source stack that Nvidia also announced on Monday. NemoClaw bundles Nvidia’s Nemotron open models with OpenShell, a secure runtime that enforces privacy, network, and policy-based guardrails for autonomous agents. A single command installs the entire stack. Jensen Huang, Nvidia’s founder and CEO, framed the combination in no uncertain terms, calling OpenClaw, the broader agent platform NemoClaw supports, "the operating system for personal AI" and comparing it directly to Mac and Windows.
The argument is simple: cloud instances scale up and down based on demand, but always-on agents need persistent compute, memory, and state. A machine under your desk, running 24/7 with local data and local models within a security sandbox, is better suited architecturally for that workload than a rented GPU in someone else’s data center. The DGX Station can function as a personal supercomputer for a solo developer or as a shared compute node for teams, and supports isolated configurations for classified or regulated environments where data can never leave the building.
One of the smartest aspects of the DGX Station’s design is what Nvidia calls architectural continuity. Applications built on the machine migrate seamlessly to the company’s GB300 NVL72 data center systems (72-GPU racks designed for hyperscale AI factories) without redesigning a single line of code. Nvidia is selling a vertically integrated workflow: prototype on your desktop, then scale to the cloud when you’re ready.
This is important because the biggest hidden cost in AI development today is not computing, but the engineering time lost rewriting code for different hardware configurations. A model tuned on a local GPU cluster often requires substantial rework to deploy to a cloud infrastructure with different memory architectures, networking stacks, and software dependencies. The DGX Station eliminates that friction by running the same NVIDIA AI software stack that powers every level of Nvidia’s infrastructure, from the DGX Spark to the Vera Rubin NVL72.
Nvidia also expanded the DGX Spark, the Station’s younger sibling, with new clustering support. Up to four Spark units can now operate as a unified system with near-linear performance scaling: a "desktop data center" that fits on a conference table without rack infrastructure or IT tickets. For teams that need to fine-tune mid-sized models or develop smaller-scale agents, clustered Sparks offer a credible departmental AI platform at a fraction of the Station’s cost.
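"Near-linear" scaling has a concrete meaning: each added unit contributes slightly less than its standalone performance. A minimal sketch of that arithmetic, with a hypothetical 90% scaling efficiency (Nvidia has not published an exact figure):

```python
# Hypothetical illustration of near-linear cluster scaling.
# The 0.9 efficiency factor is an assumption for illustration,
# not a published DGX Spark benchmark.

def cluster_throughput(units: int, per_unit: float, efficiency: float) -> float:
    """Aggregate throughput when each unit beyond the first contributes
    at `efficiency` of its standalone performance (1.0 = ideal linear)."""
    return per_unit * (1 + (units - 1) * efficiency)

for n in (1, 2, 4):
    speedup = cluster_throughput(n, per_unit=1.0, efficiency=0.9)
    print(f"{n} Spark(s): ~{speedup:.1f}x a single unit")
```

At 90% efficiency, four units deliver roughly 3.7x a single Spark rather than a full 4x, which is the kind of tradeoff "near-linear" implies.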
The DGX Station’s initial customer list maps the industries where AI is moving fastest from experiment to everyday operational tool. Snowflake is using the system to test its open-source Arctic training framework locally. EPRI, the Electric Power Research Institute, is advancing AI-based weather forecasting to strengthen the reliability of the electrical grid. Medivis is integrating vision-language models into surgical workflows. Microsoft Research and Cornell have deployed the systems for hands-on AI training at scale.
The systems are available to order now and will ship in the coming months, starting with ASUS, Dell Technologies, GIGABYTE, MSI, and Supermicro, with HP joining later in the year. Nvidia hasn’t revealed pricing, but the company’s GB300 components and historical DGX pricing suggest a six-figure investment: expensive by workstation standards, but remarkably cheap compared to the cloud GPU costs of running trillion-parameter inference at scale.
The list of supported models underlines how open the AI ecosystem has become: developers can run and tune OpenAI’s gpt-oss-120b, Google’s Gemma 3, Qwen3, Mistral Large 3, DeepSeek V3.2, and Nvidia’s Nemotron models, among others. The DGX Station is model-agnostic by design: the Switzerland of hardware in an industry where model allegiances change quarterly.
The DGX Station didn’t arrive in a vacuum. It was one piece of a larger set of GTC 2026 announcements that collectively map Nvidia’s ambition to deliver AI computing at literally every physical scale.
At the top of the stack, Nvidia presented the Vera Rubin platform (seven new chips in full production) anchored by the Vera Rubin NVL72 chassis, which integrates 72 next-generation Rubin GPUs and claims up to 10x higher inference performance per watt than the current Blackwell generation. The Vera CPU, with 88 custom Olympus cores, targets the orchestration layer that agent workloads increasingly demand. On the furthest frontier, Nvidia announced the Vera Rubin space module for orbital data centers, which offers 25 times the AI compute of the H100 for space-based inference.
Between orbit and the office, Nvidia revealed partnerships spanning Adobe for creative AI, automakers like BYD and Nissan for Level 4 autonomous vehicles, a coalition with Mistral AI and seven other labs to build open frontier models, and Dynamo 1.0, an open source inference operating system already adopted by AWS, Azure, Google Cloud, and a list of native AI companies including Cursor and Perplexity.
The pattern is unmistakable: Nvidia wants to be the computing platform (hardware, software, and models) for every AI workload, everywhere. The DGX Station is the piece that fills the gap between the cloud and the individual.
For the past few years, the default assumption in AI has been that serious work requires GPU instances in the cloud: rent Nvidia hardware from AWS, Azure, or Google Cloud. That model works, but it comes with real costs: data egress fees, latency, security exposure from sending proprietary data to third-party infrastructure, and the fundamental loss of control inherent in renting someone else’s computer.
The DGX Station doesn’t kill the cloud: Nvidia’s data center business dwarfs its desktop revenue and is accelerating. But it creates a credible on-premise alternative for an important and growing category of workloads. Training a frontier model from scratch still requires thousands of GPUs in a warehouse. Fine-tuning an open model with billions of parameters on proprietary data? Running inference for an internal agent processing sensitive documents? Prototyping before committing to cloud spending? For those, a machine under your desk is starting to look like the rational choice.
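The economics of that choice come down to a break-even calculation: how many months of continuous cloud rental equal a one-time purchase? A rough sketch, with purely illustrative dollar figures (neither the machine's price nor cloud rates have been confirmed):

```python
# Break-even sketch: renting cloud GPUs vs. buying a machine outright.
# All dollar amounts are illustrative assumptions, not quoted prices.

def breakeven_months(purchase_price: float, cloud_cost_per_hour: float,
                     hours_per_month: float = 730) -> float:
    """Months of 24/7 cloud rental whose cost equals the purchase price."""
    return purchase_price / (cloud_cost_per_hour * hours_per_month)

# Assume a six-figure workstation vs. a multi-GPU cloud instance
# running an always-on agent workload around the clock.
months = breakeven_months(purchase_price=100_000, cloud_cost_per_hour=30)
print(f"Break-even after ~{months:.1f} months of continuous rental")
```

Under those assumptions the machine pays for itself in under half a year of always-on use, which is exactly the usage pattern persistent agents imply.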
This is the strategic elegance of the product: it expands Nvidia’s addressable market into personal AI infrastructure while bolstering the cloud business, because everything built locally is designed to scale to Nvidia’s data center platforms. It’s not cloud versus desktop; it’s cloud and desktop, and Nvidia supplies both.
The motto that defined the PC revolution was "a computer on every desk and in every home." Four decades later, Nvidia is updating the premise with an uncomfortable escalation. The DGX Station places genuine supercomputing power, the kind once reserved for national laboratories, next to a keyboard, and NemoClaw layers on top an autonomous AI agent that runs 24 hours a day, writing code, calling tools, and completing tasks while its owner sleeps.
Whether that future is exhilarating or disturbing depends on your point of view. But one thing is no longer debatable: the infrastructure needed to build, run, and own frontier AI has just moved from the server room to the desk drawer. And the company that sells almost every major AI chip on the planet made sure to sell the desk drawer, too.