Hello everyone, I’m Haruyuki Tago, Edge Evangelist at HACARUS Tokyo R&D center.
In this series of articles, I will share some insights from my decades of experience in the semiconductor industry and I will comment on various AI industry-related topics from my unique perspective.
In today’s article, I am excited to share information about chiplets and how they have been changing the semiconductor industry since 2011. Chiplets have been instrumental in breaking through many of the traditional limitations surrounding chip manufacturing.
Examples of Devices Using Chiplets
Let’s begin our chiplet journey by looking at two microprocessors developed by AMD and Intel. Figure 1 shows the interior and exterior designs for AMD’s EPYC microprocessor.
The second example, shown in Figure 2, is the Agilex FPGA developed by Intel. This device is notable because it integrates multiple chips with different functions into one package, connected using Intel’s EMIB technology. While each company has its own product names, in this article I will refer to all multi-chip semiconductor products as either 3D-IC or 2.5D-IC.
The Origins of 3D-IC Products
The idea of stacking semiconductor chips together isn’t new. In 2011, Xilinx released the Virtex-7 2000T using 3D-IC technology. As shown by the news clippings in Figure 3, this commercial release was the beginning of the first wave of 3D-IC technology.
The Virtex-7 was developed to cope with rapid growth in network traffic. Traffic was growing by over 34% each year, which was beginning to cause bandwidth issues. This was a problem for Xilinx because its biggest customers needed high-performance internet and cellphone base station equipment. These customers created an enormous demand for large-scale FPGAs.
Another issue affecting the industry was that semiconductor capacity couldn’t keep up with the increase in bandwidth and processing requirements. By the way, it is now widely known that AI processing (deep learning) requires high computing power and this feature is the highlight of many new FPGA announcements. Deep learning, however, came into the limelight in 2012, so the 2011 Xilinx FPGA announcement didn’t mention it.
Next, let’s talk about the origins of the 3D-IC design by looking at Figure 4. On the left is an image of vertical stacking in a traditional 3D-IC design. This was the conventional approach and was thought to be the most efficient method available at the time. On the right, we can see a modern design for combining chips that illustrates how to create a large-scale product exceeding the chip scale expected under Moore’s Law.
Moving on to the technical challenges posed by 3D-IC, let’s look at Figure 5. The first problem is that it is difficult to form many TSVs (through-silicon vias) in the lower-layer chip, and also difficult to produce fine bumps on them to bond the lower layer directly to an upper layer. Second, it is difficult to remove the heat generated by the chips because they overlap. Third, there is nowhere for the mechanical stress caused by thermal expansion to escape because the chips are tightly bound together.
To overcome these challenges, a new method was devised in which chip-to-chip connections and wiring to external terminals are made using a silicon interposer with embedded wiring. The name 2.5D-IC may have been coined because this approach sits between the two: chips are laid out side by side, as in the conventional 2D method of placing ICs on a printed circuit board (PCB), but the interconnections are much shorter and more densely packed.
Figure 6 shows a cross-sectional schematic of the Xilinx Virtex-7 package. Here, the chips (28nm FPGA Slice) are not stacked but are instead laid out horizontally on a silicon interposer. The interposer is a thin silicon substrate with embedded wiring, which connects the chips to each other and to the external pins. Because the interposer is made of the same silicon material as the chips, they share the same thermal expansion coefficient and thermal stress is minimized.
One of the main advantages of the 2.5D-IC is lower power consumption, as shown in Figure 6. The Virtex-7 2000T, using 2.5D-IC technology, consumes 19W, down from the 112W needed by the conventional method of mounting 4 FPGAs on a printed circuit board.
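To put the two power figures in perspective, here is a minimal sketch that turns them into a percentage reduction and a ratio. The function name `power_saving` is my own, chosen only for illustration; the two wattages are the ones quoted above.

```python
# Power figures quoted in the text for Figure 6.
CONVENTIONAL_W = 112.0  # four FPGAs on a printed circuit board
INTERPOSER_W = 19.0     # Virtex-7 2000T using 2.5D-IC technology


def power_saving(before_w: float, after_w: float) -> tuple[float, float]:
    """Return (fractional reduction, before/after ratio). Illustrative helper."""
    if before_w <= 0 or after_w <= 0:
        raise ValueError("power values must be positive")
    return (before_w - after_w) / before_w, before_w / after_w


reduction, ratio = power_saving(CONVENTIONAL_W, INTERPOSER_W)
print(f"reduction: {reduction:.1%}, ratio: {ratio:.1f}x")
# prints "reduction: 83.0%, ratio: 5.9x"
```

In other words, the 2.5D-IC version needs roughly one sixth of the power of the four-FPGA board.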
This 2.5D package has also been used in the A64FX CPU of the supercomputer Fugaku, the world’s highest-performing supercomputer. Figure 7 shows the inside of the package with the metal lid removed. The CPU chip is in the center and the 4 HBM2 memories are positioned on the right and left. A silicon interposer connects the CPU to the HBM2 memory over the shortest possible distance, which results in high memory bandwidth and low latency. As this example shows, 2.5D-ICs are gradually spreading in high-end areas, where high performance is essential but cost constraints are relatively loose.
The Importance of Chiplet Technology
As technology continues to become more advanced, it becomes increasingly difficult to imagine a life without it. The workloads of both businesses and consumers keep growing, and with them the demand for computers, GPUs, and FPGAs, all of which need high computing power. Figure 8 gives a prediction of future market growth in this sector, specifically looking at AI processing.
An Increase in Semiconductor Manufacturing Costs Due to Miniaturization
I mentioned above that 2.5D-IC technology has been in commercial use since 2011. Since then, the number of applications has been rapidly increasing, driven by the introduction of the 7nm semiconductor manufacturing process. The 7nm process is revolutionary, but it also comes with a high price tag. One negative side effect is the rapid increase in chip manufacturing costs due to miniaturization. Figure 9 shows the relationship between manufacturing cost and process size.
Let’s take a closer look at Figure 9 to understand what exactly is happening. The data shows the relative cost per good chip when manufacturing a chip with an area of 250 square mm. As the process size decreases, the cost increases only gradually until it approaches the 16/10nm rule. After this point, the relative cost jumps to 3.7 at 7nm and 4.9 at 5nm.
Next, if we compare the cost ratios for making a good chip, the 7nm process costs 1.68 times (3.7/2.2) as much as the 16nm process. This large increase is most likely due to the technology required to produce a smaller chip. The EUV lithography equipment required for the 7nm process can cost more than ten billion yen per unit. While the chip area can be reduced by approximately 43% using this process, the increase in cost makes it difficult for manufacturers.
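The ratio can be checked with a few lines of arithmetic. The sketch below uses the relative costs quoted for Figure 9; the 16nm value of 2.2 is inferred from the quoted ratio (1.68 = 3.7 / 2.2). The second function rests on an assumption of mine that is not stated in the text: it treats the figures as cost per unit area, so that a shrunk chip’s cost scales with its remaining area.

```python
# Relative cost values as quoted in the text for Figure 9.
# The 16nm value (2.2) is inferred from the quoted ratio 1.68 = 3.7 / 2.2.
RELATIVE_COST = {"16nm": 2.2, "7nm": 3.7, "5nm": 4.9}
AREA_REDUCTION_7NM = 0.43  # approximate area shrink quoted in the text


def cost_ratio(new_node: str, old_node: str) -> float:
    """Relative cost of new_node divided by that of old_node."""
    return RELATIVE_COST[new_node] / RELATIVE_COST[old_node]


def shrunk_chip_cost_ratio(ratio_per_area: float, area_reduction: float) -> float:
    """ASSUMPTION: costs are per unit area, so chip cost scales with area."""
    return ratio_per_area * (1.0 - area_reduction)


ratio_7_vs_16 = cost_ratio("7nm", "16nm")
print(f"7nm vs 16nm: {ratio_7_vs_16:.2f}x")  # prints "7nm vs 16nm: 1.68x"
print(f"after 43% shrink: {shrunk_chip_cost_ratio(ratio_7_vs_16, AREA_REDUCTION_7NM):.2f}x")
```

Under that assumption the area saving is roughly cancelled out by the higher cost, which is one way to read why the shrink alone no longer pays for itself.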
Another issue is the current trend of increasing the core count of CPUs in order to achieve higher processing performance. Of course, requiring a larger number of cores within a smaller chip will be very costly.
2.5D-IC and Chiplet Technology
One solution to the miniaturization problems mentioned above is chiplet and 2.5D-IC technology. Figure 10 gives a hypothetical example in which the cost of producing a monolithic 32-core CPU chip is 1.0x. If 4 smaller 8-core CPU chiplets are produced instead, the cost drops to 0.59x.
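Why would four small chips cost less than one large chip? A common textbook explanation is yield: a single defect ruins a whole die, so large dies are discarded far more often than small ones. The sketch below uses a simple Poisson yield model to illustrate the effect. The die area and defect density are hypothetical values I picked for illustration; they are not taken from Figure 10 or from AMD, and the model ignores the IO die, packaging, and wafer-edge losses, so it should not be read as a reproduction of the 0.59x figure.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of defect-free dies under a simple Poisson model."""
    return math.exp(-area_mm2 * defects_per_mm2)


def cost_per_good_die(area_mm2: float, defects_per_mm2: float) -> float:
    """Silicon cost of one good die in arbitrary units (area / yield)."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_mm2)


# Hypothetical inputs chosen for illustration only.
MONOLITHIC_AREA_MM2 = 700.0   # one large 32-core die
DEFECTS_PER_MM2 = 0.001       # i.e. 0.1 defects per square cm
NUM_CHIPLETS = 4              # four 8-core chiplets of equal total area

monolithic = cost_per_good_die(MONOLITHIC_AREA_MM2, DEFECTS_PER_MM2)
chiplets = NUM_CHIPLETS * cost_per_good_die(
    MONOLITHIC_AREA_MM2 / NUM_CHIPLETS, DEFECTS_PER_MM2
)
relative_cost = chiplets / monolithic
print(f"chiplet silicon cost relative to monolithic: {relative_cost:.2f}x")
```

With the same total silicon area, the chiplet version comes out cheaper purely because fewer good transistors are thrown away with each defect; the larger the monolithic die or the higher the defect density, the bigger the gap.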
We can take this one step further and look at the real-world case of AMD and its EPYC microprocessor. AMD uses three types of chips, outlined in Figure 11. The first type is an 8-core CPU chip that is designed and manufactured using the most advanced 7nm process available. The second type is a server IO die and the third type is a client IO die.
The latter two chips are made using a less advanced process that maintains connection compatibility with peripheral circuits (operating voltage, etc.). Their operating frequency can also be lower than that of the CPU cores, which reduces manufacturing costs. Of course, the package substrate and interposer need to be designed and manufactured for each product, but this is much faster and less expensive than chip development.
The History of 2.5D & 3D-IC Technology
The history of 2.5D and 3D-IC product development is quite fascinating, and Figure 12 lets us learn a bit more about it. Grouped by product type, (A) is the 2.5D-IC with the longest history, which TSMC calls CoWoS. It uses a silicon interposer and offers high performance, but at high cost: the silicon interposer requires expensive TSV equipment, and drilling the holes has low throughput.
Starting from (A), development split into two main directions. One was to reduce the amount of interposer used and lower the cost, resulting in (C) and (D). The packages for the iPhone’s “A10” SoC and later are similar to (B) but differ from it because of size constraints.
In mass production of 2.5D-ICs, it is necessary to increase the yield of the final product after assembly and to make it easier to isolate defects when they occur. For example, in Figure 12 (A), it is common nowadays to divide the responsibility between Company A for manufacturing logic chips (red), Company B for manufacturing DRAM chips (blue), and Company C for assembly (packaging).
It is also necessary to find out where a failure originated: in Company A’s chips, Company B’s chips, or Company C’s assembly process. This leads to many questions about chip mounting and wiring order, as well as about methods for testing operation during assembly. Innovations in these areas have led to the expansion of 2.5D-IC products.
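The reason final yield matters so much can be sketched with a simple calculation: a package only works if every chip in it works and the assembly itself succeeds, so the individual yields multiply. All numbers below are hypothetical and chosen only to illustrate the point; the function name `package_yield` is my own.

```python
from math import prod


def package_yield(die_yields: list[float], assembly_yield: float) -> float:
    """Probability that every mounted die and the assembly step are good."""
    values = list(die_yields) + [assembly_yield]
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("yields must be between 0 and 1")
    return prod(values)


# Hypothetical package: 1 logic chip plus 4 DRAM chips, as in Figure 12 (A).
# Case 1: chips are mounted without testing them individually first.
untested = package_yield([0.90] + [0.95] * 4, assembly_yield=0.98)

# Case 2: chips are tested before mounting, so only rare escapes remain.
pretested = package_yield([0.999] * 5, assembly_yield=0.98)

print(f"untested chips:  {untested:.1%} of packages work")
print(f"pretested chips: {pretested:.1%} of packages work")
```

Because one bad chip scraps the whole package, including the good chips from the other companies, testing before and during assembly quickly pays for itself as the number of chips per package grows.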
IC Technology Outlook
Figure 13 shows TSMC’s outlook for 2.5D, 3D-IC, and SoIC technology. The table compares three technologies that connect chips in different ways. The 2.5D column is the current commercialization level, where chips are connected using interposers. The 3D-IC column joins the chips face to face via µbumps (micro-bumps). Finally, the SoIC column bonds the chips together directly.
Finally, let me briefly discuss the current state of 3D chiplet technology. At Computex Taipei 2021, AMD previewed its AMD 3D chiplet technology. The demonstration showed a 15% increase in frame rate for 3D games, and the technology is scheduled for mass production in 2021. Based on this showing, the SoIC bonding method in Figure 13 may have already been realized.
Thank you for taking the time to read this installment of my web series; I hope you found it interesting. To finish this article, I want to recap the main points covered today.
- The introduction of 2.5D-IC and 3D-IC technology, which combines multiple smaller chips (chiplets) into a single package, has greatly impacted the industry.
- The first wave of 3D-IC products began in 2011, when Xilinx became the first company to commercially launch a product using 2.5D package technology, the Virtex-7 2000T FPGA. This FPGA used a silicon interposer design instead of a chip-stacked 3D-IC.
- With the introduction of the 7nm semiconductor process node, the manufacturing cost per chip has been rapidly increasing. In terms of cost and product deployment, it is advantageous to use multiple smaller chiplets in a package using the 2.5D-IC technology instead of conventional large-area single chip methods.
- This technology has been commercialized in the A64FX CPU for the Fugaku supercomputer, AMD’s EPYC processor, and many more. Similar 2.5D-IC technology that does not use silicon interposers has also been commercialized for mobile devices where cost and size are critical. Prototypes of CPUs with improved performance using direct chip bonding technology are also being developed.
- 3D-IC technology really is a hot topic right now and has the potential to greatly influence the industry. I will be observing closely as this new technology unfolds in front of our eyes!
後藤弘茂のWeekly海外ニュース, “ZEN2ベースの64コアCPU「Rome」はなぜCPUとI/Oを分離したのか”
AMD Infinity Architecture
小島 郁太郎, 日経クロステック／日経エレクトロニクス, “Intelの10nm FPGAがようやく量産、ノートPC用MPUと同じプロセス”
Ivo Bolsens, CTO Xilinx, “2.5D ICs: Just a Stepping Stone or a Long Term Alternative to 3D?”
『半導体業界の第一人者，AI業界を行く！』 Vol.7：独自設計CPUで世界一 富岳の秘密
Amkorの2.5DパッケージとHDFO – アドバンスド ヘテロジニアス パッケージング ソリューション
Samuel Naffziger, AMD, ISSCC2020, “AMD Chiplet Architecture for High-Performance Server and Desktop Products”
根津 禎, 日経クロステック, “いざ7nm世代の製造プロセスへ、TSMCやIBMらが発表”
Don Scansen, 02.26.2021, “AMD TSMC & Imec Show Their Chiplet Playbooks at ISSCC”
Lisa Su, et al., AMD at Computex 2021, keynote video