As a new type of infrastructure for the intelligent world, AI, 5G, the intelligent edge and cloud computing will accelerate the development of the digital economy and create major growth opportunities for new businesses. Today, AI and data analytics are opening up new opportunities for customers in finance, healthcare, manufacturing, communications, transportation and other industries. IDC predicts that by 2021, 75% of commercial enterprise applications will use artificial intelligence, and that by 2025 about a quarter of all data will be generated in real time, with 95% of that growth coming from Internet of Things (IoT) devices.
Artificial intelligence and data analytics will be the key workloads of the next decade, and deploying them quickly is critical for today's enterprises. Intel has been committed to continuously strengthening the built-in AI acceleration and software optimization of its processors, powering data center and edge solutions worldwide and helping unlock the insight hidden in data.
Recently, Intel announced its latest data platform product portfolio, including the 3rd Gen Intel Xeon Scalable processors with integrated AI acceleration, Intel's first AI-optimized FPGA (the Stratix 10 NX), the second generation of Intel Optane persistent memory, Intel's latest 3D NAND SSDs and related software solutions. Spanning the data center, cloud and intelligent edge, the portfolio helps customers further accelerate the development and deployment of AI, data analytics and other workloads, supports the construction of new intelligent infrastructure, and steers the new wave of the digital economy.
Chen Baoli, vice president of Intel's sales and marketing group and general manager of data center sales in China, said that the data center market is booming and that Intel's CPUs are the only processors in the industry with integrated AI acceleration. At the same time, Intel's comprehensive product portfolio achieves full coverage of the data center field.
Intel has built an unparalleled product portfolio and ecosystem for AI and data analytics. The fully optimized new data platform, together with a thriving partner ecosystem built on Intel AI technology, is helping enterprises of all kinds deploy intelligent AI and data analytics services and turn data into a key corporate asset.
The standout technologies of the 3rd Gen Xeon Scalable processors
Any discussion of the data center market has to mention the Intel Xeon processor platform, which currently takes center stage. Intel has shipped more than 30 million Xeon Scalable processors to date, making it the most widely deployed data center platform in the world, with roughly a 95% market share.
Looking at Intel's roadmap, the Xeon processor line has a history of more than 20 years. The first generation of Xeon Scalable processors launched in 2017, and the second generation, codenamed Cascade Lake, launched in 2019. This year, the 3rd Gen Xeon Scalable processors arrive in two product families, codenamed Cooper Lake and Ice Lake: Cooper Lake targets systems with 4 to 8 processor sockets (multi-socket servers), while Ice Lake targets systems with 1 or 2 sockets. Intel expects to launch the next generation of Xeon Scalable processors, codenamed Sapphire Rapids, in the second half of next year.
The 3rd Gen Xeon Scalable processors are designed for today's data-intensive services, with AI acceleration built in. AI and data-intensive services have been common industry needs in recent years and mark the direction of technology development. For AI, Intel has further upgraded the Intel DL Boost deep learning acceleration technology in this generation, which now supports the innovative bfloat16 data format alongside the VNNI (Vector Neural Network Instructions).
Combining DL Boost with bfloat16, the 3rd Gen Xeon Scalable platform improves image classification performance by up to 1.93x compared with the flagship Cascade Lake Xeon Platinum 8280, a very significant improvement. For compute-intensive applications, the new platform offers more CPU cores, higher frequencies, more memory channels, faster memory and larger memory capacity. With this stronger compute and greater data capacity, compute-intensive applications see up to a 92% performance improvement over the previous 4-socket platform, and the platform supports the second generation of Optane persistent memory.
At the same time, given the diversity of today's cloud and enterprise applications, the 3rd Gen Xeon Scalable processors include the second generation of Intel Speed Select Technology (SST), which gives users more flexibility in configuring systems to meet business needs. SST addresses a pain point many enterprise and Internet customers face: increasingly complex, diversified services place different demands on hardware. Some services want single-thread performance as high as possible but need little thread-level parallelism; others want many threads for concurrent processing and care less about single-thread speed. In the past, these different needs were handled by tailoring the CPU and hardware configuration to the first class of application, and choosing a different CPU and a new machine for the second. That works, but if the business changes, the configuration is very inflexible.
There is another common situation as well. As compute density rises, with more CPU cores, larger memory, higher storage capacity and greater network bandwidth, users deploy multiple services of different priorities on one machine. They often want those services to run side by side with different priorities, so that high-priority services get better performance. On traditional platforms, however, every core of a CPU has the same priority and the same available resources and frequency.
Intel launched SST to address this. An early prototype of SST appeared in the 2nd Gen Xeon Scalable processors, and it has been further expanded in the third generation. SST is a set of capabilities offering four modes, SST-PP, SST-CP, SST-BF and SST-TF, to solve the problems described above.
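SST's per-core priority controls are set in hardware (via the Linux `intel-speed-select` tool on supported platforms), but the mixed-priority scenario above can also be illustrated in software. The sketch below, which assumes a Linux system and a hypothetical split between a latency-sensitive service and batch work, pins the current process to a reserved subset of cores using `os.sched_setaffinity`:

```python
import os

def pin_to_cores(pid, cores):
    """Restrict a process to the given set of CPU cores (Linux only)."""
    os.sched_setaffinity(pid, set(cores))
    return os.sched_getaffinity(pid)

if __name__ == "__main__":
    # Illustrative partitioning: reserve the first quarter of the available
    # cores for a hypothetical high-priority service, leaving the rest for
    # batch work. Real SST configuration goes further by also steering
    # frequency and power budget toward the prioritized cores.
    available = sorted(os.sched_getaffinity(0))
    high_prio = available[: max(1, len(available) // 4)]
    print(pin_to_cores(0, high_prio) == set(high_prio))  # True
```

This only constrains scheduling; SST-BF and SST-TF additionally let the prioritized cores run at a higher base or turbo frequency than the rest.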
At the same time, Intel technology experts told reporters that many architectural innovations have been made in the 3rd Gen Xeon Scalable processors. For example, the number of UPI ports has increased to six per socket, allowing two UPI links between each pair of sockets. The extra UPI link provides higher bandwidth, which helps support more CPU cores, larger memory and higher compute speeds; like a wider highway, it lets data move between sockets and stay synchronized faster. For multi-socket systems, this is a very important architectural innovation.
In terms of memory, the 3rd Gen Xeon Scalable processors support 6 memory channels per socket, each running at up to 3200 MT/s. A 4-socket platform therefore has 24 channels, and an 8-socket platform 48. On capacity, the processors support 16Gb DRAM chips: a single module can reach 64GB with ordinary RDIMMs or 256GB with LRDIMMs. Combined with Optane persistent memory, each socket supports up to 4.5TB, so a 4-socket platform can reach a total of 18TB and an 8-socket platform 36TB. Such large capacity and high memory bandwidth supports data-intensive applications. On I/O, each socket provides 48 PCIe 3.0 lanes, which in multi-socket systems ensures sufficient I/O connectivity, speed and bandwidth for most applications.
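The channel and capacity totals above are simple multiples of the per-socket figures; a quick back-of-the-envelope check (using only numbers quoted in this article):

```python
# Per-socket figures quoted above for 3rd Gen Xeon Scalable processors.
CHANNELS_PER_SOCKET = 6
MAX_TB_PER_SOCKET = 4.5  # DRAM plus Optane persistent memory

def platform_totals(sockets):
    """Total memory channels and maximum capacity for a multi-socket system."""
    return sockets * CHANNELS_PER_SOCKET, sockets * MAX_TB_PER_SOCKET

for s in (4, 8):
    channels, tb = platform_totals(s)
    print(f"{s}-socket platform: {channels} channels, up to {tb} TB")
# 4-socket platform: 24 channels, up to 18.0 TB
# 8-socket platform: 48 channels, up to 36.0 TB
```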
In addition, multi-socket platforms pay close attention to RAS, that is, reliability, availability and serviceability. The 3rd Gen Xeon Scalable processors provide very rich RAS support that can handle possible memory errors, errors on PCIe devices and errors in the CPU cores themselves, achieving error isolation and fault diagnosis.
On AI support: Skylake, the first generation of Xeon Scalable processors, provided the AVX-512 instruction set, usable for deep learning computation in the FP32 data format. Cascade Lake, the second generation released in 2019, added DL Boost, which includes the VNNI vector neural network instructions; VNNI supports the INT8 data format to accelerate deep learning inference. This year, DL Boost is further upgraded in the 3rd Gen Xeon Scalable processors with support for the bfloat16 data format, improving both AI training and inference performance.
Bfloat16 is a compact data format. Compared with today's 32-bit floating-point format (FP32), bfloat16 achieves the same level of model accuracy with only half the bits and only minor software modifications. The newly added bfloat16 support accelerates both AI training and inference on the CPU. Among the supporting toolsets, TensorFlow, PyTorch and MXNet fully support bfloat16 AI training.
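The reason bfloat16 needs so little software change is that it keeps FP32's full 8-bit exponent (so the dynamic range is identical) and simply drops the low 16 mantissa bits. A minimal NumPy sketch of the conversion (truncation shown for clarity; hardware typically rounds):

```python
import numpy as np

def to_bfloat16(x):
    """Emulate bfloat16 by zeroing the low 16 bits of the float32 encoding."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

# bfloat16 keeps the sign, the full 8-bit exponent and the top 7 mantissa
# bits, so values survive with roughly 3 significant decimal digits.
print(float(to_bfloat16(np.pi)))  # 3.140625
print(float(to_bfloat16(1.0)))   # 1.0 (powers of two are exact)
```

That retained range is what lets bfloat16 stand in for FP32 during training without the loss-scaling tricks that narrower formats such as FP16 often require.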
In short, for cloud computing, data analytics and mission-critical workloads, the 3rd Gen Xeon Scalable processors provide more cores, higher frequencies and larger memory. Data analytics performance improves by up to 98% over the previous generation platform. For AI, with the upgraded DL Boost and the bfloat16 format, training performance improves by up to 93% and inference performance by up to 90% over the previous generation. In high-density virtual machine scenarios, each processor offers up to 28 cores, so an 8-socket platform easily supports 224 physical cores, achieving very high density and helping users optimize TCO.
In addition to CPUs, Intel also offers GPUs, FPGAs, dedicated AI chips and other hardware, and develops software solutions with ecosystem partners, giving customers a complete portfolio from chips to solutions that meets their needs for AI and analytics, the hottest applications today and the direction of computing in the future.
Intel Optane persistent memory further accelerates AI and data analytics
Facing the tide of big data, realizing the value of data requires producing, collecting, extracting and computing on it. Going forward, the development of storage will be driven by workload demand: modern storage systems need to be flexible, and storage technology is evolving to meet diverse needs. That means improving storage performance so that more data sits closer to the processor.
To this end, Intel introduced Optane persistent memory, a new technology that keeps more data in memory (as an expansion of, or partial replacement for, DRAM), closer to the CPU and therefore more efficient. In the familiar dual in-line memory module (DIMM) form factor, it provides in-memory computing speeds close to DRAM (dynamic random access memory) at a lower price per GB, greatly reducing cost and ultimately helping enterprises strike a balance between efficiency and cost.
Compared with scaling out across multiple servers of limited memory capacity, Intel Optane persistent memory reduces the number of servers needed, the procurement of key components and the overhead of managing larger server clusters, lowering total cost of ownership on both CAPEX and OPEX.
Optane persistent memory supports larger databases, higher reliability and faster system recovery, and provides ample memory in scenarios where processor performance is in surplus but memory capacity is short, reducing the number of devices, software licensing costs, rack count and energy consumption. By extending existing memory capacity with persistent memory, total cost of ownership (TCO) can be greatly reduced.
Intel Optane persistent memory combines large capacity, low latency, persistence and cost effectiveness, and offers multiple usage modes: App Direct mode (AD) and Memory Mode (MM), supporting a wide range of environments and scenarios.
Intel technology experts said that bringing Optane technology to persistent memory is a major innovation at the memory level. Optane persistent memory adds a new storage tier between DRAM and SSDs, with memory-like ultra-low access latency, endurance and reliability, plus persistent storage and byte-addressable access. In AD mode, applications can directly access the independent persistent memory resources that Optane persistent memory provides.
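In App Direct mode the application typically memory-maps a file on a DAX-enabled filesystem backed by persistent memory and works on it with ordinary loads and stores. A minimal sketch of that access pattern, with the caveat that the DAX mount point (e.g. `/mnt/pmem0`) is hypothetical and a temporary file is used here so the code runs anywhere:

```python
import mmap
import os
import tempfile

def persist_record(path, offset, payload):
    """Write payload into the mapped file at offset and flush it."""
    with open(path, "r+b") as f:
        mm = mmap.mmap(f.fileno(), 0)
        mm[offset:offset + len(payload)] = payload
        mm.flush()  # on DAX-backed persistent memory this makes the write durable
        mm.close()

# On a real Optane system this file would live on a DAX mount such as
# /mnt/pmem0 (hypothetical path); here a temp file stands in for it.
path = os.path.join(tempfile.mkdtemp(), "pmem.dat")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)  # pre-size the file so it can be mapped
persist_record(path, 0, b"hello-pmem")
print(open(path, "rb").read(10))  # b'hello-pmem'
```

Production code would use a library such as PMDK, which issues the cache-flush instructions needed for crash-consistent persistence rather than relying on `flush` alone.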
Intel Optane persistent memory not only reduces system TCO but also eliminates I/O bottlenecks and improves performance, driving new applications that fuse memory and storage, such as hyperconverged infrastructure, databases, artificial intelligence and big data analytics. These are the places where its technical advantages show.
As part of the 3rd Gen Xeon Scalable platform, Intel also released the Intel Optane persistent memory 200 series, providing customers with up to 4.5TB of memory per socket for data-intensive workloads such as in-memory databases, dense virtualization, analytics and high-performance computing. CPU access to persistent data on the Optane persistent memory 200 series is up to 200 times faster than on mainstream NAND SSDs, and average memory bandwidth is 25% higher than on the first-generation products.
The previous generation of Optane persistent memory, paired with 2nd Gen Xeon Scalable processors, supported up to six 512GB modules per socket, i.e. 3TB of persistent memory. The 200 series provides up to 4.5TB per socket on 3rd Gen Xeon Scalable processors, combining 3TB of Optane persistent memory with ordinary DRAM. Per-module capacities are unchanged, with 128GB, 256GB and 512GB models; the modules share standard memory slots with ordinary DRAM and the two kinds of memory can be mixed.
Integrating AI to accelerate the intelligent transformation of industries
From general-purpose CPUs to GPUs, and from FPGAs to ASICs, Intel's data-centric product portfolio continues to expand, supporting customers' intelligent deployments from cloud and network to edge and endpoint, and laying the digital cornerstone for innovation in cloud computing, artificial intelligence, 5G network transformation and the intelligent edge.
Across the whole computing landscape, and especially in the data center, the Intel Xeon platform offers the best versatility and scalability, supporting a wide variety of AI tasks. Across the entire AI data processing pipeline, it provides the most complete support, and customers' AI innovation is well supported on it.
As mentioned earlier, the 3rd Gen Xeon Scalable processors have optimized performance and architecture for current AI workloads. Today's AI places ever higher demands not only on data computation but also on data capacity.
AI processing today is usually combined with big data, and to bring out AI's full performance, Intel Optane persistent memory can play a very important role. Combining the 3rd Gen Xeon Scalable processors with Optane persistent memory brings Intel's storage performance to bear on AI applications: as AI computing and storage are brought together, computing performance keeps improving while TCO is also substantially optimized.
The Intel Xeon platform's support for AI performance is consistent, and Intel cooperates closely with ecosystem partners on innovative applications built on the 3rd Gen Xeon Scalable processors. For example, Intel and Ant Financial develop and deploy AI applications based on a 3D-CNN I3D video deep learning model.
At the same time, Intel has worked with Neusoft, Weining, Yinggu, Huiyi Huiying and other industry partners to put medical AI into practice, introducing AI in application scenarios such as medical imaging diagnosis, pathological slide analysis, and drug research and development, accelerating the integration of healthcare and artificial intelligence.
In the short-video market, Kwai has entered the ranks of top-tier Internet companies, and AI plays an important role in its business. Kwai has built a strong IT system on Intel technology, adopting Intel FPGAs, Intel Optane persistent memory and Intel Xeon Scalable processors in its applications to drive the continued development of its business.
To sum up, the 3rd Gen Xeon Scalable processors are fully optimized for AI and well suited to AI computing tasks. At the same time, relying on the platform's advantages and rich software support across the ecosystem, more and more industry users are innovating with AI on the Intel architecture platform. This innovation now happens not only in the cloud but also at the edge.
For more than 20 years, Intel has continued to drive innovation in the data center. With its combined hardware and software strengths, its ability to operate at scale and its deep cooperation with customers, Intel's flexible, innovative products and solutions have been tested by customers and validated across a wide range of applications.
As the data-centric transformation deepens, Intel will provide a comprehensive XPU chip platform built on its most powerful scalable platform with integrated AI acceleration, joining hands with the industry ecosystem to make "the intelligent even stronger", changing the world with technology and benefiting individuals, enterprises and society.
Editor in charge: PJ