Cloud AI chips are built around big data, and their core task is "training." The cloud is characterized by "big data + cloud computing": drawing on massive data sets, providers can perform thorough data analysis and mining, extract data features, and combine them with artificial intelligence algorithms to deliver a wide range of "AI+" applications on the server side. An AI chip is the hardware that accelerates these complex AI algorithms. Because the computational load is enormous, the CPU architecture has proved unable to keep up with AI algorithms that demand massive parallel computation, so chips better suited to parallel computing are needed. GPUs, FPGAs, TPUs, and other chips emerged to fill this role. In the cloud, AI chips can handle both the "training" and "inference" stages of artificial intelligence.
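To see why these workloads parallelize so well, consider a fully connected neural-network layer: each output neuron is an independent dot product, so all outputs can be computed at once. The sketch below is a hypothetical toy example (the layer shape and weights are made up for illustration) that maps the independent dot products across a thread pool, the same data parallelism that GPU-class chips exploit in hardware.

```python
from concurrent.futures import ThreadPoolExecutor

def neuron_output(weights, inputs):
    """One output of a fully connected layer: a single dot product."""
    return sum(w * x for w, x in zip(weights, inputs))

# Hypothetical toy layer: 4 output neurons, 3 inputs.
W = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
x = [2, 3, 4]

# Each row's dot product is independent of the others, so all four
# can be evaluated concurrently -- no ordering between them is needed.
with ThreadPoolExecutor() as pool:
    y = list(pool.map(lambda w: neuron_output(w, x), W))

print(y)  # [2, 3, 4, 9]
```

Real layers have thousands of such independent products per step, which is why hardware with many parallel execution units outperforms a sequential CPU here.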
Status of cloud chips: the GPU dominates the cloud AI market. At present, ASICs such as the TPU are used only within the closed ecosystems of the giants, while FPGAs are growing rapidly in data-center business. GPUs offer short application development cycles, relatively low cost, and a mature technical ecosystem. The cloud computing centers of major companies worldwide, including Google, Microsoft, Amazon, and Alibaba, all use GPUs for AI computing.
In addition to its extensive use of GPUs, Google has worked to develop its own dedicated AI ASIC. Compared with a GPU, the TPU it launched in May this year cuts power consumption by 60% and chip area by 40%, better meeting its enormous demand for AI computing power. However, because artificial intelligence algorithms iterate rapidly, the TPU was at first used only by Google itself. Later, as TensorFlow matured, the TPU was also supplied externally, but its general applicability still has a long way to go.
Baidu and other manufacturers are also actively using FPGAs for cloud acceleration in their data-center business. The FPGA can be regarded as a key transitional solution between the GPU and the ASIC: compared with a GPU, it allows optimization down to the hardware level, and compared with an ASIC, it offers more flexibility and a shorter development time while today's algorithms are still iterating rapidly. Domain-specific AI chips (ASICs, application-specific integrated circuits) have been shown to deliver better performance and power efficiency, and are expected to become the mainstream direction for artificial intelligence hardware in the future.
The GPU is currently the most widely used accelerator in the cloud, adopted by a large number of companies working on artificial intelligence. According to NVIDIA's official figures, more than 19,000 companies partnered with NVIDIA on deep learning projects in 2016, up from 1,500 in 2014. IT giants such as Baidu, Google, Facebook, and Microsoft use NVIDIA's GPUs to accelerate their AI projects. GPUs see their widest use in cloud AI deep learning scenarios, and thanks to the first-mover advantage of their mature programming environment, they are expected to remain strong in the future.
GPU chip architecture: born for image processing, with strong parallel computing power. The GPU (graphics processing unit), also known as a visual processor, is a microprocessor originally used in personal computers, workstations, game consoles, and mobile devices (tablets, smartphones, etc.) and dedicated to image computation. Like a CPU, it is programmable, but compared with the CPU, the GPU is better suited to complex mathematical and geometric calculations, especially parallel operations. Its highly parallel structure makes it more efficient than a CPU at processing graphics data and complex algorithms.
The GPU differs markedly from the CPU in structure, which makes it better suited to parallel computing. Comparing the two architectures, most of a CPU's die area is devoted to control logic and registers, whereas the GPU devotes more area to ALUs (arithmetic logic units) for data processing rather than to data caching and flow control. This structure suits the parallel processing of dense data. When a CPU core executes a computing task, it processes only one data item at a time, with no real parallelism, while a GPU has many processor cores and can process many data items in parallel at the same time.
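The many-ALU model described above is usually programmed in a kernel style: one lightweight thread per data element, all running the same instruction. A minimal sketch (the kernel and arrays here are hypothetical, and the sequential loop merely stands in for a real parallel launch):

```python
def vector_add_kernel(i, a, b, out):
    """Body of a GPU-style kernel: one 'thread' handles one index i."""
    out[i] = a[i] + b[i]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * 4

# A CPU-style loop touches one element per step. On a GPU, one thread
# would be launched per element, so every iteration runs concurrently;
# the loop below is only a sequential stand-in for that launch.
for i in range(len(a)):
    vector_add_kernel(i, a, b, out)

print(out)  # [11, 22, 33, 44]
```

Because no iteration depends on another, the hardware can spread them across thousands of ALUs, which is exactly the flow-control-light, ALU-heavy layout the paragraph above describes.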
Compared with the CPU, the GPU holds a decisive advantage in AI performance. Training deep neural networks demands high internal parallelism, massive floating-point throughput, and heavy matrix operations, all of which the GPU provides. At the same accuracy, the GPU delivers faster processing, requires less server investment, and consumes less power than the traditional CPU. On May 11, 2017, at the GPU Technology Conference in San Jose, California, NVIDIA released the Tesla V100, built on Volta, then its highest-performance GPU computing architecture; fabricated on TSMC's 12 nm FFN process, it integrates 21 billion transistors and is claimed to match 250 CPUs for deep learning workloads.
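The matrix operations mentioned above dominate training cost because every layer is essentially a matrix multiply, whose multiply-add count grows with the product of all three dimensions. A minimal pure-Python sketch with a hypothetical tiny layer (shapes and values chosen only for illustration) that counts those operations:

```python
def matmul(A, B):
    """Naive matrix multiply, counting multiply-adds as it goes."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]  # one fused multiply-add
                ops += 1
    return C, ops

# Hypothetical tiny layer: a 2x3 activation matrix times 3x2 weights.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C, ops = matmul(A, B)
print(C)    # [[4.0, 5.0], [10.0, 11.0]]
print(ops)  # 12 multiply-adds: n*m*k = 2*2*3
```

Even this toy case needs n*m*k multiply-adds; production layers multiply matrices with thousands of rows and columns, which is why floating-point throughput, not control logic, decides training speed.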