Technical differences between NVIDIA GPUs and the self-developed AI chips of Google and Amazon AWS, and future market trends

In the current wave of generative AI sweeping the globe, the core driving force behind the innovation is a series of high-performance artificial intelligence chips. Over the past decade, NVIDIA sowed the seeds of the AI industrial revolution with its GPUs. Today, its Blackwell GPU, designed for the most advanced AI training and inference, has become standard equipment in global data centers, with shipments reaching six million units last year. In large server racks, NVLink technology can integrate 72 GPUs into a computing unit that behaves like a single giant GPU. But the AI chip market is no longer dominated solely by NVIDIA GPUs; customized ASICs and FPGAs are being adopted by the major tech companies. How do these AI chips differ? How will they shape the development of AI, and could they shake NVIDIA's dominant position? This article is excerpted from a CNBC video segment.

GPU: The Beginning of the Golden Age of AI

GPUs evolved from gaming cards into AI workhorses, a shift traceable to AlexNet in 2012. Its research team was the first to harness the parallel computing power of NVIDIA GPUs for neural network training, winning the ImageNet image-recognition competition by a wide margin and ushering in the era of deep learning.

The core advantage of GPUs comes from their thousands of parallel processing cores, which efficiently perform tensor operations such as matrix multiplication, making them ideal for AI training and inference. Today NVIDIA not only supplies GPUs to OpenAI, governments, and enterprises, but also builds complete server systems. A single rack of Blackwell servers costs as much as 3 million dollars, and NVIDIA has revealed it ships 1,000 of them per week, highlighting the fervent demand for AI computing power. NVIDIA's competitor AMD is accelerating with its Instinct GPUs and an open-source software ecosystem, recently winning support from OpenAI and Oracle and becoming an important force in the AI infrastructure market. The key distinction is that AMD GPUs rely primarily on open-source software, while NVIDIA GPUs are tightly optimized around CUDA, NVIDIA's proprietary software platform.
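To see why these tensor operations parallelize so well, note that a dense neural-network layer reduces to one matrix multiplication, and every output element can be computed independently of the others. A minimal pure-Python sketch (shapes and values are illustrative, not from the article):

```python
# A dense layer computes y[i][j] = sum_k x[i][k] * W[k][j].
# Each (i, j) output cell depends only on one row of x and one column of W,
# so a GPU can assign thousands of cells to thousands of cores at once.
def matmul(x, W):
    rows, inner, cols = len(x), len(W), len(W[0])
    return [[sum(x[i][k] * W[k][j] for k in range(inner))
             for j in range(cols)]   # every cell here is an independent task
            for i in range(rows)]

x = [[1.0, 2.0], [3.0, 4.0]]   # a batch of 2 input vectors
W = [[0.5, 0.0], [0.0, 0.5]]   # a toy weight matrix
print(matmul(x, W))            # [[0.5, 1.0], [1.5, 2.0]]
```

A CPU works through these cells a handful at a time; a GPU's thousands of cores compute large tiles of them simultaneously, which is the whole advantage the article describes.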

ASICs designed for a single purpose have become a new trend

From Google, Amazon, Meta, Microsoft, to OpenAI and Broadcom, major cloud giants are investing in the research and development of customized ASICs (Application-Specific Integrated Circuits). These chips, designed for a single purpose, are expected to become the fastest-growing category of AI chips in the coming years.

As large language models mature, demand for inference is rapidly surpassing demand for training. The cost, energy consumption, and stability of inference have become pain points for cloud platforms, and this is precisely the main battlefield for ASICs. Unlike versatile GPUs, an ASIC is like a dedicated precision tool: its logic is hard-wired for a single type of AI workload, yielding faster speeds and lower power consumption. The downside is a lack of flexibility and an extremely high development threshold, with the design cost of a custom chip often running into the hundreds of millions of dollars, making it affordable only for cloud giants.

Customized AI ASICs are extremely expensive, requiring at least tens of millions, and often hundreds of millions, of dollars to develop. However, for the large cloud service providers that can afford them, customized ASICs can deliver returns, because they are more energy-efficient and reduce dependence on NVIDIA.

Broadcom's ASIC strongly challenges AI market share

Broadcom and chip-design firms such as Marvell are core strategic partners of the hyperscale cloud companies. Google's TPU, Meta's self-developed accelerators, and OpenAI's upcoming ASIC all involve Broadcom deeply: it helps build Google's TPUs and Meta's AI training and inference accelerators, and analysts estimate Broadcom's share of the customized ASIC market could reach 70% to 80%.

FPGA: A flexible option between ASIC and GPU

An FPGA (Field-Programmable Gate Array) sits between ASICs and GPUs. Its biggest advantage is reconfigurability: when companies need to test architectures before hardware designs are finalized, FPGAs offer an option that balances the versatility of GPUs with the high performance of ASICs. Although they cannot match the performance of ASICs, their flexibility keeps them popular in data centers and embedded devices. AMD (which acquired Xilinx) and Intel (which acquired Altera) are the two major players in the FPGA market.

Google TPU

Google was the first major player in ASICs, pioneering custom Application-Specific Integrated Circuits for AI acceleration and coining the term Tensor Processing Unit (TPU) when its first such chip launched in 2015. TPUs also helped Google invent the Transformer architecture in 2017, the foundation of models like ChatGPT and Claude. Today Google has advanced to the 7th-generation TPU, Ironwood, and is helping Anthropic train its Claude series models with millions of TPUs. There are rumors that TPUs outperform NVIDIA's GPUs in certain workloads, but Google has traditionally used them only internally, so the TPU's full potential has yet to be realized.

AWS Trainium: cloud training and inference chips

After acquiring Annapurna Labs, AWS went all-in on its own AI chips. Trainium and Inferentia have become important pillars of AWS's training and inference platform. Trainium is composed of a large number of small tensor engines, is highly flexible, and, according to AWS, offers 30% to 40% better price-performance than other hardware in its cloud. In 2024, Anthropic trained models on 500,000 Trainium 2 chips at AWS's northern Indiana data center without a single NVIDIA GPU, signaling the rising status of ASICs.

NPU (Neural Processing Unit): edge AI chips for phones, computers, and cars

Beyond data centers, AI chips are also spreading to personal devices. An NPU (Neural Processing Unit) is a chip designed to run AI on the device rather than in the cloud, which helps protect personal privacy. NPUs are now integrated into Qualcomm Snapdragon, AMD, Intel, and Apple M-series SoCs and used in smartphones, laptops, smart-home devices, cars, and even robots. On-device AI promises stronger privacy protection, lower latency, and greater user control, making it an important driver of the next wave of AI adoption.

TSMC has become the core of the chip competition

Whether it is NVIDIA's Blackwell, Google's TPU, or AWS's Trainium, most AI chips are ultimately manufactured by TSMC, tightly binding the supply of AI computing power to global geopolitics. The United States is attempting to bring some chip manufacturing capacity back onshore through TSMC's Arizona plant and Intel's 18A process. Meanwhile, Chinese companies such as Huawei and Alibaba are actively developing their own ASICs, seeking domestic alternatives under export controls.

The era of AI chip competition has arrived

Whether it is NVIDIA's GPU dominance, the ASIC programs of Google, AWS, Meta, and OpenAI, or the NPUs pushing edge AI into every smartphone and car, the chip war is still accelerating. Shaking NVIDIA's position will not be easy, but the AI market is vast and new players keep emerging, so the chip landscape of the next decade will undoubtedly be more fiercely contested.

This article discusses the technical differences between NVIDIA GPUs and self-developed AI chips from Google and Amazon AWS, as well as future market trends. It first appeared in Chain News ABMedia.
