Agilex is a blend of the words “agile” and “flexible”, and these two characteristics are the key points of modern FPGA technology. In 2015, Intel promised to provide different heterogeneous architectures according to different customer needs, including: discrete CPU + FPGA, package-integrated CPU + FPGA, and an FPGA integrating Intel CPU / FPGA / Arm.
The reason is obvious. Integration not only reduces latency and improves performance and performance per watt, but also unifies the tool flow between the processor and the FPGA and provides broader architecture support for different performance requirements. Four years later, the Agilex FPGA integrates different process technologies and different logic units through a heterogeneous architecture, achieving a breakthrough in flexibility and customization.
According to Intel’s February benchmark, Agilex increases maximum clock frequency (Fmax) by 40% compared with Stratix 10 while reducing total power consumption by up to 40%. In addition, Agilex delivers DSP performance of up to 40 TFLOPS (FP16 configuration) and 92 TOPS (INT8 configuration). Frankly, the heterogeneous architecture alone cannot account for these figures. So what little-known “black technology” is hidden in Agilex FPGAs?
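As a rough illustration of what the two 40% figures could mean together, the sketch below computes relative performance per watt. It assumes that throughput scales linearly with Fmax and that both figures hold at the same time, neither of which the benchmark states, so the result is an upper-bound estimate rather than a measured value. The function name `relative_perf_per_watt` is hypothetical.

```python
def relative_perf_per_watt(fmax_gain: float, power_reduction: float) -> float:
    """Return performance per watt relative to a baseline of 1.0.

    Assumption (not from the benchmark): throughput scales linearly
    with Fmax, and both changes apply to the same design at once.
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    relative_performance = 1.0 + fmax_gain
    relative_power = 1.0 - power_reduction
    return relative_performance / relative_power


# 40% higher Fmax and up to 40% lower power, as quoted above.
ratio = relative_perf_per_watt(0.40, 0.40)
print(f"{ratio:.2f}x")  # prints 2.33x
```

Under these assumptions the best case is about 2.3 times the baseline performance per watt; a real design would land somewhere below that.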
To ensure consistent performance, the core FPGA fabric die of the Agilex FPGA is built on Intel’s 10 nm process technology, one of the most advanced FinFET process technologies in the world. Agilex also uses Intel’s proprietary Embedded Multi-die Interconnect Bridge (EMIB), a 3D heterogeneous system-in-package (SiP) integration technology that provides a high-performance, low-cost way to integrate chiplets and the FPGA fabric die into the same package.
The fabric die of the Agilex FPGA adopts the second-generation Intel Hyperflex architecture. Like the first generation, it places additional registers, called Hyper-Registers, throughout the core fabric; the second generation also improves overall fabric performance and reduces power consumption as much as possible. The most significant improvement is the addition of a high-speed bypass to the Hyper-Registers.
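The effect of bypassable registers on a routing path can be pictured with a toy cycle-level model. This is only a sketch for intuition, not Intel's implementation: the class `BypassableRegister` and the functions `clock_path` and `latency_cycles` are hypothetical names, and real timing behavior depends on the physical fabric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BypassableRegister:
    """Toy stand-in for a routing register that can be switched out of the path."""
    bypass: bool = True  # True: the signal passes straight through, no clock delay
    state: int = 0       # value held by the register when it is in use


def clock_path(path: List[BypassableRegister], inputs: List[int]) -> List[int]:
    """Drive one input value per clock cycle into the path and collect the outputs."""
    outputs = []
    for value in inputs:
        signal = value
        for reg in path:
            if reg.bypass:
                continue  # combinational pass-through
            # Register in use: emit the stored value and capture the incoming one.
            signal, reg.state = reg.state, signal
        outputs.append(signal)
    return outputs


def latency_cycles(path: List[BypassableRegister]) -> int:
    """Pipeline latency equals the number of registers that are not bypassed."""
    return sum(1 for reg in path if not reg.bypass)


# Four register sites on one route, two of them enabled for pipelining.
path = [
    BypassableRegister(bypass=True),
    BypassableRegister(bypass=False),
    BypassableRegister(bypass=True),
    BypassableRegister(bypass=False),
]
print(latency_cycles(path))            # prints 2
print(clock_path(path, [1, 2, 3, 4]))  # prints [0, 0, 1, 2]
```

In this model, enabling a register adds one cycle of latency but shortens the combinational segment between registers, which is the usual trade-off when pipelining a long route; bypassing it leaves the signal untouched.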
A chiplet is a physical IP block that can be integrated with other dies through package-level integration and standardized interfaces. With chiplets, the number of transceivers is no longer tied to a fixed number of channels. To increase or reduce the number of transceiver channels, designers only need to add the required transceiver chiplets; they do not need to re-lay out the die for each channel count. On this front alone, Intel increased the speed of a single transceiver channel from 58 Gbps to 112 Gbps.
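The scaling argument can be made concrete with a little arithmetic. In the sketch below, only the per-channel rates of 58 Gbps and 112 Gbps come from the text; the number of transceiver chips (chiplets) in a package and the channels per chiplet are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def aggregate_bandwidth_gbps(chiplets: int, channels_per_chiplet: int,
                             rate_gbps: float) -> float:
    """Total raw transceiver bandwidth for a package, in Gbps."""
    if chiplets < 0 or channels_per_chiplet < 0:
        raise ValueError("counts must be non-negative")
    return chiplets * channels_per_chiplet * rate_gbps


OLD_RATE_GBPS = 58.0   # per-channel rate quoted in the text
NEW_RATE_GBPS = 112.0  # per-channel rate quoted in the text

# Hypothetical package: 4 transceiver chiplets with 4 channels each.
print(aggregate_bandwidth_gbps(4, 4, OLD_RATE_GBPS))  # prints 928.0
print(aggregate_bandwidth_gbps(4, 4, NEW_RATE_GBPS))  # prints 1792.0
print(f"{NEW_RATE_GBPS / OLD_RATE_GBPS:.2f}x")        # prints 1.93x
```

Total bandwidth grows along two independent axes here: adding chiplets changes the channel count without touching the FPGA die, while the faster channel nearly doubles the rate of each one.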
Serving as a hardware accelerator for the CPU in the data center is one of the main application scenarios for FPGAs, accelerating workloads such as model training, financial computation, and network function offloading. One of the core problems in this field, however, is cache coherency. In other words, the memory interconnect protocol between the CPU and the hardware accelerator must be clearly defined.
Agilex FPGAs support every level of the memory hierarchy, including embedded memory resources, in-package memory, and off-chip memory accessed through dedicated interfaces. The first level of the hierarchy is embedded on-chip memory, including MLAB, block RAM, and eSRAM. Each memory type offers a different capacity to meet different processing needs. In addition, Intel uses SiP technology to integrate high-bandwidth memory (HBM) directly into Agilex FPGA devices, which helps reduce board size and cost and simplify and lower power requirements.
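One way to picture this hierarchy is as an ordered list of tiers, where a buffer is placed in the closest tier that can hold it. The sketch below is illustrative only: the tier names follow the text, but every capacity is a placeholder chosen for the example, not a device specification, and `MemoryTier` and `pick_tier` are hypothetical names.

```python
from typing import NamedTuple, Optional, Sequence

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


class MemoryTier(NamedTuple):
    name: str
    location: str        # "on-chip", "in-package", or "off-chip"
    capacity_bytes: int  # placeholder value, NOT a real device figure


# Ordered from closest to the logic to farthest away.
TIERS = [
    MemoryTier("MLAB", "on-chip", 1 * KIB),
    MemoryTier("block RAM", "on-chip", 64 * KIB),
    MemoryTier("eSRAM", "on-chip", 4 * MIB),
    MemoryTier("HBM", "in-package", 8 * GIB),
    MemoryTier("external memory", "off-chip", 64 * GIB),
]


def pick_tier(tiers: Sequence[MemoryTier], size_bytes: int) -> Optional[MemoryTier]:
    """Return the first (closest) tier whose capacity fits the buffer, if any."""
    for tier in tiers:
        if size_bytes <= tier.capacity_bytes:
            return tier
    return None


print(pick_tier(TIERS, 512).name)      # prints MLAB
print(pick_tier(TIERS, 2 * MIB).name)  # prints eSRAM
print(pick_tier(TIERS, 1 * GIB).name)  # prints HBM
```

A real placement decision would also weigh bandwidth, latency, and port count, which this sketch deliberately leaves out.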
Another focus is the integration of eASIC technology into the Agilex platform. This eASIC chip customization technology enables migration from FPGA to structured ASIC. In other words, users can take advantage of a custom logic continuum with reusable IP, provided by eASIC, to optimize flexibly across the whole product life cycle and move quickly from FPGA to ASIC.
For every order-of-magnitude performance improvement from a new hardware architecture, software can bring a corresponding improvement of two orders of magnitude. On the new generation of Agilex FPGAs, the supporting Quartus Prime software can shorten compilation time for hardware developers by 30% and improve memory utilization by 15%. The new generation of Agilex FPGAs is also covered by the oneAPI architecture.
The oneAPI software programming framework, to be launched in the fourth quarter of this year, provides a single-source heterogeneous programming environment for software developers. It supports common software development tools such as performance library APIs, Intel VTune, and Intel Advisor, and can match software to the hardware that best accelerates the code. This simplifies the programming interfaces of the various compute engines in a system, including FPGAs, CPUs, GPUs, artificial intelligence accelerators, and other accelerators, reduces development complexity across architectures and workloads, and accelerates the large-scale deployment of the six technical pillars.