Edge AI Evangelist’s Thoughts Vol. 11: An Emerging Trend in Semiconductor Industry – Chiplets

Hello everyone, I’m Haruyuki Tago, Edge Evangelist at HACARUS Tokyo R&D center.

In this series of articles, I will share some insights from my decades of experience in the semiconductor industry and I will comment on various AI industry-related topics from my unique perspective.

In today’s article, I am excited to share information about chiplets and how they have been changing the semiconductor industry since 2011. Chiplets have been instrumental in breaking through many of the traditional limitations surrounding chip manufacturing.

 

Examples of Devices using Chiplets

Let’s begin our chiplet journey by looking at two microprocessors developed by AMD and Intel. Figure 1 shows the interior and exterior designs for AMD’s EPYC microprocessor.

 

A display showing the AMD EPYC microprocessor and its inner architecture on the right.

Figure 1. AMD EPYC Microprocessor Layout

 

The second example, shown in Figure 2, is the Agilex FPGA developed by Intel. This device is special because it integrates multiple chips with various functions into one package using Intel’s EMIB technology. While each company has its own unique product name, in this article I will refer to all multi-chip semiconductor products as either 3D-IC or 2.5D-IC.

 

An architecture layout diagram and explanation of the new chiplet-based architecture and its advantages.

Figure 2. Intel Agilex FPGA [3]

The Origins of 3D-IC Products

The idea of stacking semiconductor chips together isn’t new. In 2011, Xilinx released the Virtex-7 2000T using 3D-IC technology. As shown by the news clippings in Figure 3, this commercial release was the beginning of the first wave of 3D-IC technology.

 

A collection of various news articles talking about the first wave of 3D-IC technology and market data.

Figure 3. Xilinx New 3D-IC News & Large-Scale FPGA Requirements

 

The Virtex-7 was developed to cope with the high levels of network traffic growth. Network traffic was growing by over 34% each year, which was beginning to cause bandwidth issues. For Xilinx, this was a problem since their biggest customers needed high-functioning internet and cellphone base station equipment. These customers created an enormous demand for large-scale FPGAs.

Another issue affecting the industry was that semiconductor capacity couldn’t keep up with the increase in bandwidth and processing requirements. By the way, it is now widely known that AI processing (deep learning) requires high computing power and this feature is the highlight of many new FPGA announcements. Deep learning, however, came into the limelight in 2012, so the 2011 Xilinx FPGA announcement didn’t mention it.

Next, let’s talk about the origins of the 3D-IC design by looking at Figure 4. On the left is an image of the vertical stacking used in a traditional 3D-IC design. This was an industry standard and was thought to be the most efficient method available at the time. On the right, we can see a modern design that combines multiple chips to create a large-scale product whose capacity exceeds what Moore’s Law alone would allow for a single chip.

 

A listing of the advantages of using 3D-IC technology and how it can increase capacity beyond what is predicted by Moore's Law.

Figure 4. Why 3D-IC Technology Should be Used

 

Moving on to some of the technical challenges posed by 3D-IC, let’s look at Figure 5. The first problem is that it is difficult to form many TSVs (through-silicon vias) in the lower layer chip. It is also difficult to produce fine bumps on them to directly bond the lower layer to an upper layer. Second, it is difficult to remove the heat generated by the chips because they are stacked on top of each other. The third challenge is that mechanical stress caused by thermal expansion has nowhere to escape because the chips are tightly bound together.

 

Picture representations for the technical challenges posed by 3D processes and a comparison of 3D and 2.5D technology in table format.

Figure 5. Problems of Stacked 3D-IC and Comparison with 2.5D-IC [4]

To overcome these challenges, a new method was devised in which chip-to-chip connections and wiring to external terminals are made using a silicon interposer with embedded wiring. The name 2.5D-IC may have been coined because the conventional method of laying out ICs on a printed circuit board (PCB) is considered to be 2D, whereas the chips on an interposer are still laid out side by side but with much shorter and more densely packed interconnections.

Figure 6 shows a cross-sectional schematic of the Xilinx Virtex-7 package. Here, the chips (28nm FPGA Slice) are not stacked but instead are laid out horizontally on a silicon interposer. The interposer is a thin silicon substrate with embedded wiring, which serves as the connection between the chips and the wiring to the external pins. The interposer is made of the same silicon material as the chips, so thermal stress is minimized since they have the same thermal expansion coefficient.

 

On the left, there is a diagram showing the different components of 2.5D-IC technology. On the right, is a comparison between monolithic FPGAs and the Virtex-7 2000T.

Figure 6. Xilinx Virtex-7 2000T Comparison for Power Consumption

 

One of the main advantages of the 2.5D-IC is the lower power consumption shown in Figure 6. Compared with the conventional method of placing 4 FPGAs on a printed circuit board, which consumes 112W, the Virtex-7 2000T using 2.5D-IC technology reduces power consumption to 19W.
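To put the two power figures in perspective, here is a minimal Python sketch of the arithmetic. The variable names are my own, and the only inputs are the 112W and 19W values quoted above.

```python
# Power figures quoted in the text for Figure 6.
pcb_power_w = 112.0    # four monolithic FPGAs on a printed circuit board
ic25d_power_w = 19.0   # Virtex-7 2000T built with 2.5D-IC technology

saving_w = pcb_power_w - ic25d_power_w     # absolute saving in watts
reduction = saving_w / pcb_power_w         # fraction of power removed
ratio = pcb_power_w / ic25d_power_w        # how many times lower the power is

print(f"Saving: {saving_w:.0f} W")
print(f"Reduction: {reduction:.1%}")
print(f"Ratio: {ratio:.1f}x")
```

In other words, the 2.5D-IC version draws roughly one sixth of the power of the four-chip board.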

This 2.5D package has also been used in the A64FX CPU of the supercomputer Fugaku, the world’s highest-performing supercomputer. Figure 7 shows the inside of the package’s metal lid. The CPU chip is in the center and the 4 HBM2 memories are positioned on the right and left. A silicon interposer is used to connect the CPU to the HBM2 memory over the shortest possible distance. This results in high memory bandwidth and low latency. In cases like this, 2.5D-ICs are gradually expanding in high-end areas, where high performance is essential but cost constraints are relatively loose. 

 

The left shows a picture of the A64FX's 2.5D package and the right image shows the configuration of a 2.5D package.

Figure 7. Interior Image of the Fugaku Supercomputer CPU A64FX (Left), and Cross-Sectional Schematic Diagram (Right)

 

The Importance of Chiplet Technology

As technology continues to become more advanced, it becomes increasingly difficult to imagine a life without it. As the demand for computers, GPUs, and FPGAs increases, so will the workloads of both businesses and consumers. All of these machines require high computing power to function. Figure 8 gives a prediction of future market growth in this sector, specifically looking at AI processing. 

 

A graphical depiction of the AI compute hardware TAM by year and the distribution of hardware type by year.

Figure 8. Global AI Computer Hardware Market Size (TAM) Source: Goldman Sachs 2018 [6]

An Increase in Semiconductor Manufacturing Costs Due to Miniaturization 

I have mentioned above that 2.5D-IC technology has been commercially developed as far back as 2011. Since then, the number of applications has been rapidly increasing thanks to the introduction of the 7nm semiconductor manufacturing process. The 7nm process is revolutionary, but it also comes with a high price tag. One negative side effect of this process is the rapid increase in chip manufacturing costs due to miniaturization [7]. Figure 9 does a good job of showing the relationship between manufacturing costs and the process size. 

 

A graphical representation of efficiency gains for 7nm computing. The x axis represents process size and the y axis represents the normalized cost.

Figure 9. Trends in Manufacturing Cost per Chip [6]

Let’s take a closer look at Figure 9 to understand what exactly is happening. The data show the normalized cost per good chip when manufacturing a chip with an area of 250 square mm. As the process size decreases, the cost increases only gradually until the size approaches the 16/10nm rule. After this point, the normalized cost jumps to 3.7 at 7nm and 4.9 at 5nm.

Next, if we compare the cost of making a good chip, the 7nm process costs 1.68 times (3.7/2.2) as much as the 16nm process. This large increase is most likely due to the technological requirements needed to produce finer features. EUV lithography equipment required for the 7nm process can cost more than ten billion yen per unit. While the chip area can be reduced by approximately 43% using this process, the increase in cost makes it difficult for manufacturers.
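To make the arithmetic concrete, here is a small Python sketch. Please treat it as an illustration under two assumptions of mine: the 16nm normalized cost of 2.2 is inferred from the 1.68 ratio quoted above rather than read from the figure, and the cost of a design is assumed to scale linearly with its area. The variable names are hypothetical.

```python
# Normalized costs per good chip (same 250 square mm area).
cost_16nm = 2.2   # ASSUMPTION: inferred from the 1.68 ratio in the text
cost_7nm = 3.7    # quoted in the text
cost_5nm = 4.9    # quoted in the text

# Cost of a same-area chip at 7nm relative to 16nm.
ratio_7_vs_16 = cost_7nm / cost_16nm

# If the same design shrinks by about 43% in area at 7nm, and cost is
# assumed to scale linearly with area, the shrink only roughly offsets
# the higher cost per area.
area_shrink = 0.43
same_design_ratio = ratio_7_vs_16 * (1.0 - area_shrink)

print(f"7nm vs 16nm, same area:   {ratio_7_vs_16:.2f}x")
print(f"7nm vs 16nm, same design: {same_design_ratio:.2f}x")
```

Under these assumptions, moving a design to 7nm leaves its cost almost unchanged despite the smaller area, which is why miniaturization alone no longer delivers a clear cost benefit.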

Another issue is the current trend of increasing the core count of CPUs in order to achieve higher processing performance. Of course, requiring a larger number of cores within a smaller chip will be very costly.

 

2.5D-IC and Chiplet Technology

One solution to the miniaturization problems mentioned above is chiplet and 2.5D-IC technology. Figure 10 gives a hypothetical example in which the cost of producing a monolithic 32-core CPU chip is 1.0x. However, if 4 smaller 8-core CPU chiplets are produced instead, the cost drops to 0.59x.

 

The image is comparing the manufacturing advantages of chiplets over monolithic dies because they are more flexible, cost less, and have a higher performance.

Figure 10. Differences Between Monolithic, Chiplets, and Chip Manufacturing Costs [1]
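One way to get an intuition for why smaller chiplets are cheaper is a simple yield model. The sketch below is my own illustration and not the model behind Figure 10: it assumes a Poisson yield model, a hypothetical defect density of 0.1 defects per square cm, and a hypothetical 600 square mm monolithic die. It also ignores packaging cost and the extra area needed for chip-to-chip interfaces, so it will not reproduce the 0.59x figure.

```python
import math

def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of good dies under a Poisson defect model (an assumption)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)

def cost_per_good_die(area_mm2: float, defects_per_cm2: float) -> float:
    """Relative silicon cost: area divided by the fraction of good dies."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_cm2)

def chiplet_cost_ratio(total_area_mm2: float, n_chiplets: int,
                       defects_per_cm2: float) -> float:
    """Cost of n equal chiplets relative to one monolithic die of the same total area."""
    monolithic = cost_per_good_die(total_area_mm2, defects_per_cm2)
    chiplets = n_chiplets * cost_per_good_die(total_area_mm2 / n_chiplets,
                                              defects_per_cm2)
    return chiplets / monolithic

# Hypothetical values chosen only for illustration.
ratio = chiplet_cost_ratio(total_area_mm2=600.0, n_chiplets=4,
                           defects_per_cm2=0.1)
print(f"4 chiplets cost {ratio:.2f}x of the monolithic die")
```

The point of the sketch is the trend: a defect ruins only one small chiplet instead of a whole large die, so the larger the original die or the higher the defect density, the bigger the saving.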

We can take this one step further and look at the real-world case of AMD and its EPYC microprocessor. Figure 11 outlines three types of chips that are combined to suit different applications. The first type is an 8-core CPU chip that is designed and manufactured using the most advanced 7nm process available. The second type is a server IO die and the third type is a client IO die.

The latter two chips are made using a less advanced process that maintains connection compatibility with peripheral circuits (operating voltage, etc.). Their operating frequency can also be lower than that of the CPU cores, which reduces manufacturing costs. Of course, the package substrate and interposer need to be designed and manufactured for each product, but this is much faster and less expensive than chip development.

 

The image shows the different types of chips that can be produced for a variety of uses including PCs, servers, and premium devices.

Figure 11. Using Chiplets to Increase Product Variety [7]
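The mix-and-match idea can be sketched as a small data structure. This is only an illustration: the 8 cores per CPU chiplet and the two IO die types come from the text above, while the class name, product names, and chiplet counts are hypothetical and do not describe AMD's actual product lineup.

```python
from dataclasses import dataclass

CORES_PER_CPU_CHIPLET = 8  # from the text: each CPU chiplet has 8 cores

@dataclass(frozen=True)
class Package:
    name: str          # hypothetical product name
    io_die: str        # "client" or "server" IO die
    cpu_chiplets: int  # number of 8-core CPU chiplets in the package

    @property
    def cores(self) -> int:
        return self.cpu_chiplets * CORES_PER_CPU_CHIPLET

# Hypothetical lineup built from the same three chip types.
lineup = [
    Package("hypothetical desktop part", "client", 1),
    Package("hypothetical high-end desktop part", "client", 2),
    Package("hypothetical server part", "server", 8),
]

for p in lineup:
    print(f"{p.name}: {p.io_die} IO die + {p.cpu_chiplets} CPU chiplet(s) = {p.cores} cores")
```

Only the package changes between products; the expensive 7nm CPU chiplet is designed once and reused everywhere.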

The History of 2.5D & 3D-IC Technology

The history of 2.5D and 3D-IC product development is quite fascinating, and Figure 12 gives an overview. Grouped by product type, (A) is the 2.5D-IC with the longest history, which TSMC calls CoWoS. It uses a silicon interposer and offers high performance, but at a high cost, because forming TSVs in the silicon interposer requires expensive equipment and the hole-drilling step has low throughput.

Based on (A), development branched in two main directions. One was to reduce the amount of interposer used and thereby lower the cost, resulting in (C) and (D). The packages of the iPhone’s SoC “A10” and later are similar to (B) but differ from it because of size constraints.

 

A visual representation of different products from 2010 to 2021.

Figure 12. History of 2.5D-IC / 3D-IC

 

In mass production of 2.5D-ICs, it is necessary to increase the yield of the final product after assembly and to make it easier to isolate defects when they occur. For example, in Figure 12 (A), it is common nowadays to divide the responsibility between Company A for manufacturing logic chips (red), Company B for manufacturing DRAM chips (blue), and Company C for assembly (packaging). 

When a failure occurs, it is also necessary to find out whether the cause lies in the process of Company A, Company B, or Company C. This leads to many questions about the order of chip mounting and wiring, as well as about methods for testing operation during assembly. Innovative approaches to questions like these have led to the expansion of 2.5D-IC products.

 

IC Technology Outlook

Figure 13 shows the outlook for 2.5D, 3D-IC, and SoIC technology by TSMC [9]. The table compares three technologies that use different methods to connect their chips. The 2.5D column is the current commercialization level, where chips are connected using interposers. The 3D-IC column joins the chips together face to face via microbumps (µbumps). Finally, the SoIC column directly bonds the chips together.

 

A table showing the comparisons of various IC technologies for a wide range of categories.

Figure 13. TSMC SoIC provides better interconnect performance for 3D integration (source: ISSCC 2021) [9]

Finally, let me briefly discuss the current state of 3D chiplet technology. At Computex Taipei 2021, AMD previewed its AMD 3D chiplet technology. This technology demonstrated a 15% increase in frame rate for 3D games [10] and is scheduled for mass production in 2021. Based on this showing, the SoIC bonding method in Figure 13 may have already been realized.

 

Summary

To finish this article, I want to recap the main points covered today. Thank you for taking the time to read this installment of my web series and I hope you found it interesting.

  • The introduction of 2.5D-IC and 3D-IC technology, which combines multiple smaller chips (chiplets) into a single package, has greatly impacted the industry.
  • The first wave of 3D-IC products began in 2011, when Xilinx became the first company to commercially launch a product utilizing 2.5D package technology with its Virtex-7 2000T FPGA. This FPGA made use of a silicon-based interposer design instead of a chip-stacked 3D-IC.
  • With the introduction of the 7nm semiconductor process node, the manufacturing cost per chip has been rapidly increasing. In terms of cost and product deployment, it is advantageous to use multiple smaller chiplets in a package using the 2.5D-IC technology instead of conventional large-area single chip methods.
  • This technology has been commercialized in the A64FX CPU for the Fugaku supercomputer, AMD’s EPYC processor, and many more. Similar 2.5D-IC technology that does not use silicon interposers has also been commercialized for mobile devices where cost and size are critical. Prototypes of CPUs with improved performance using direct chip bonding technology are also being developed.
  • 3D-IC technology really is a hot topic right now that has the potential to greatly influence the industry. I will be observing closely as this new technology unfolds in front of our eyes!

 

References

[1] 後藤弘茂のWeekly海外ニュース, “ZEN2ベースの64コアCPU「Rome」はなぜCPUとI/Oを分離したのか”
https://pc.watch.impress.co.jp/docs/column/kaigai/1156455.html

[2] AMD Infinity Architecture
https://www.amd.com/en/technologies/infinity-architecture

[3] 小島 郁太郎, 日経クロステック/日経エレクトロニクス, “Intelの10nm FPGAがようやく量産、ノートPC用MPUと同じプロセス”
https://xtech.nikkei.com/atcl/nxt/column/18/01537/00066/

[4] Ivo Bolsens, CTO Xilinx, “2.5D ICs: Just a Stepping Stone or a Long Term Alternative to 3D?”
https://www.xilinx.com/publications/about/3-D_Architectures.pdf

[5]  『半導体業界の第一人者,AI業界を行く!』 Vol.7:独自設計CPUで世界一 富岳の秘密
 https://hacarus.com/ja/ai-lab/20210331-fugaku/

[6] Amkorの2.5DパッケージとHDFO – アドバンスド ヘテロジニアス パッケージング ソリューション
https://c44f5d406df450f4a66b-1b94a87d576253d9446df0a9ca62e142.ssl.cf2.rackcdn.com/2018/12/Amkor_2.5D_Package_and_HDFO_Technical_Article_JP.pdf

[7] Samuel Naffziger, AMD, ISSCC2020, “AMD Chiplet Architecture for High-Performance Server and Desktop Products”
https://www.slideshare.net/AMD/amd-chiplet-architecture-for-highperformance-server-and-desktop-products

[8] 根津 禎,日経クロステック,いざ7nm世代の製造プロセスへ、TSMCやIBMらが発表
https://xtech.nikkei.com/dm/atcl/event/15/112800090/120700011/

[9] Don Scansen,  02.26.2021, “AMD TSMC & Imec Show Their Chiplet Playbooks at ISSCC”
https://www.eetimes.com/amd-tsmc-imec-show-their-chiplet-playbooks-at-isscc/

[10] Lisa Su, et al., AMD at Computex 2021 (keynote video)
https://www.youtube.com/watch?v=gqAYMx34euU

 
