NVIDIA’s RTX Spark looks like a PC chip, but it’s built like a smartphone


NVIDIA RTX Spark SoC

The chip will debut in a wave of premium Windows laptops later this year, with the first designs announced by Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI. RTX Spark systems will range from thin and light 14-inch creator laptops to larger 16-inch workstations and mini desktop PCs, all built around the same unified memory architecture and Blackwell GPU technology.

I have used a Snapdragon-powered Windows PC for some time now. Everyday performance and battery life have been exceptional, but the promises of revolutionary on-device AI have not materialized. Running any advanced model is essentially impossible with only 16GB of RAM and no viable accelerator.

The RTX Spark aims to be quite different, with a colossal 128GB of unified system memory coupled with a Blackwell GPU and an Arm-based Grace CPU designed specifically for AI workloads. The price will no doubt be exorbitant in today’s RAM-constrained market, but if your interest is piqued, here’s a lower-level look at what NVIDIA has included in the RTX Spark.

Mobile-class CPU, only better

SoC chipset processor on your finger

Robert Triggs / Android Authority

Taking a look inside the CPU department reveals a lot about the origin of the superchip, making it a good place to start. The RTX Spark is powered by NVIDIA’s N1X, also known as the GB10 Grace Blackwell superchip. The GB10 already drives the $4,700 DGX Spark, which runs NVIDIA’s DGX Linux operating system instead of Windows.

The GB10 uses a modern Armv9 CPU design, the same architecture found in high-end phone chipsets, which should offer solid day-to-day performance. The chip is built with 10 Arm Cortex-X925 cores and 10 A725 cores, for a total of 20 CPU cores. The X925 was launched in 2024 and was found in last year’s MediaTek Dimensity 9400 for smartphones, albeit in a single large-core configuration. Interestingly, MediaTek helped NVIDIA design the CPU inside the RTX Spark, which helps explain some of the similarities.

At its core, RTX Spark is powered by the same Arm CPU technology as flagship smartphones.

Not only does the RTX Spark have ten powerful cores and ten performance cores (much more than your phone), but the GB10 also has a cache configuration similar to the Dimensity’s: up to 2 MB of L2 for the X925 and 512 KB of L2 for the A725, combined with 16 MB of L3 and 16 MB of system cache.

It may not match the high-end implementations of Apple Silicon or Qualcomm Oryon in lightly threaded workloads, but its 20-core configuration should still provide substantial CPU performance.

Unified RAM for local AI

Samsung Galaxy S24 Ultra on device AI toggle 1

Lanh Nguyen / Android Authority

Perhaps the most important server-class technology that NVIDIA includes in RTX Spark is the NVLink-C2C interconnect. The memory link provides up to 600 GB/s of bidirectional bandwidth between the CPU and GPU, allowing both to share a unified address space with virtually no overhead.

We again see this shared memory approach in smartphones. Modern smartphone SoCs increasingly rely on large shared caches to efficiently feed CPU, GPU, and AI workloads with data, along with a single LPDDR5X pool shared by apps, games, and on-device AI models like Google’s Gemini Nano.

The CPU and GPU sharing 128GB of memory are key to fast on-device AI.

NVIDIA notes that its interconnect is about 5 times faster than the bidirectional bandwidth of PCIe Gen5, which can be a notable bottleneck if large AI models must be split between system and GPU RAM. However, NVIDIA’s choice of LPDDR5X RAM has an effective memory bandwidth of 273 GB/s, much slower than the 768 GB/s you’ll find on graphics cards with dedicated GDDR6/7 memory. Therefore, I don’t expect the RTX Spark to offer gaming performance on par with a high-end PC GPU.
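To put those bandwidth figures side by side, here is a rough back-of-the-envelope sketch. The 600, 273 and 768 GB/s values are the ones quoted above. The PCIe Gen5 number is my own assumption (an x16 link at roughly 64 GB/s in each direction), and the 80GB payload is simply the approximate size of a 120-billion-parameter model mentioned later in this article. The helper names are illustrative, and the timings ignore every real-world overhead.

```python
# Bandwidth figures quoted in the article, in GB/s.
NVLINK_C2C_GBPS = 600.0   # CPU <-> GPU interconnect, bidirectional
LPDDR5X_GBPS = 273.0      # RTX Spark unified memory
GDDR_GBPS = 768.0         # dedicated graphics memory figure cited above

# Assumption (not from the article): PCIe Gen5 x16 at ~64 GB/s per direction.
PCIE5_X16_BIDIR_GBPS = 2 * 64.0


def speedup(fast_gbps: float, slow_gbps: float) -> float:
    """How many times faster one link is than another."""
    return fast_gbps / slow_gbps


def seconds_to_stream(size_gb: float, bandwidth_gbps: float) -> float:
    """Ideal time to move size_gb gigabytes once over a link, ignoring overhead."""
    return size_gb / bandwidth_gbps


if __name__ == "__main__":
    ratio = speedup(NVLINK_C2C_GBPS, PCIE5_X16_BIDIR_GBPS)
    print(f"NVLink-C2C vs assumed PCIe Gen5 x16: {ratio:.1f}x")

    model_gb = 80.0  # approximate size of a 120B-parameter model, per the article
    for name, bw in [
        ("Assumed PCIe Gen5 x16", PCIE5_X16_BIDIR_GBPS),
        ("LPDDR5X (RTX Spark)", LPDDR5X_GBPS),
        ("GDDR6/7 graphics card", GDDR_GBPS),
    ]:
        print(f"{name}: {seconds_to_stream(model_gb, bw):.2f} s to read {model_gb:.0f} GB once")
```

Under that PCIe assumption, the ratio lands close to NVIDIA’s "about 5 times" claim. The same arithmetic also shows why LPDDR5X, not the interconnect, is the likelier limit once a model already sits in unified memory.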

Still, NVLink-C2C allows the CPU and GPU to share the large 128GB package-level LPDDR5X memory pool for applications, graphics, and AI workloads that demand extreme memory performance. NVIDIA notes that its 128GB unified memory is enough to accommodate an AI model with 120 billion parameters. GPT-OSS 120B is around 80GB, while NVIDIA Nemotron 3 Super is 83GB. In comparison, Google’s mobile device AI models fit in less than 4 GB of RAM, which shows how much more memory is needed to go from a pocket AI to a server class one.
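A quick way to sanity-check the "120 billion parameters in 128GB" claim is to estimate the size of the weights alone at different precisions. The sketch below is only that: the function names and the 16GB reserve for the operating system and other applications are hypothetical choices of mine, not NVIDIA figures, and it ignores context caches and runtime overhead.

```python
UNIFIED_MEMORY_GB = 128.0  # RTX Spark unified memory, per the article


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only size in GB (10^9 bytes) for a model of the given size."""
    return params_billion * bits_per_weight / 8.0


def fits_in_memory(
    params_billion: float,
    bits_per_weight: float,
    memory_gb: float = UNIFIED_MEMORY_GB,
    reserve_gb: float = 16.0,  # hypothetical headroom for the OS and apps
) -> bool:
    """True if the weights plus the reserved headroom fit in memory_gb."""
    return weight_footprint_gb(params_billion, bits_per_weight) + reserve_gb <= memory_gb


if __name__ == "__main__":
    for bits in (4, 8, 16):
        size = weight_footprint_gb(120, bits)
        verdict = "fits" if fits_in_memory(120, bits) else "does not fit"
        print(f"120B parameters at {bits}-bit: {size:.0f} GB of weights, {verdict}")
```

By this estimate, a 120-billion-parameter model only fits comfortably when quantized to around 4 bits per weight. Published sizes, such as the roughly 80GB quoted above, can be higher than a weights-only figure, for example when some tensors are stored at higher precision.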

A new way of working on laptops

Microsoft Surface 7th generation display

Robert Triggs / Android Authority

Of course, to process those AI workloads, you need a processing unit built specifically for this purpose. This is where the RTX Spark really aims to differentiate itself: it has an integrated Blackwell GPU, the same architecture that powers NVIDIA’s 5000 series gaming GPUs.

The GPU inside the RTX Spark has 6,144 CUDA cores, which matches the GeForce RTX 5070 on paper. However, significantly lower memory bandwidth and a much tighter power envelope mean that gaming performance will likely fall well short of a desktop RTX 5070. Still, it supports DLSS 4.5, Reflex, and hardware ray tracing, providing many of the same feature capabilities found in NVIDIA’s desktop gaming GPUs.

While gaming will be possible, this GPU is designed to bring the CUDA and TensorRT AI ecosystem into the hands of everyday users. NVIDIA claims up to 1 petaflop of AI performance in FP4, with the goal of running large quantized models directly from the 128GB unified memory on those CUDA cores. For very large models that push the memory limits of conventional GPUs, the RTX Spark’s 128GB unified memory will be more practical than relying on a faster GPU with just 16GB or 32GB of VRAM.

NVIDIA follows the same path as Apple Silicon: large unified memory, Arm CPU, and a tightly integrated GPU.

In many ways, RTX Spark represents the convergence of two computing worlds. Its efficient yet powerful Arm CPU architecture, unified memory design, and low-power packaging draw heavily on ideas that have already transformed smartphones and Apple Silicon Macs. However, NVIDIA combines those concepts with a Blackwell GPU, CUDA acceleration, and an unusually large memory pool intended for local AI inference and server-level workloads.

The success of the shift to AI workstations will depend on price. While we don’t know how much the first wave of laptops launching this fall will cost, the existing DGX Linux desktop version suggests prices will be very high. Still, the platform looks promising for that small but growing sector of Windows users eager to run their own powerful AI workloads.
