ARM's presence in the server market keeps growing. Whether the product is a cloud-native processor for general-purpose computing or an AI/ML accelerator for inference and training, ARM's participation has become indispensable, giving cloud service providers an attractive, cost-effective option.
However, whether server-grade ARM processors exist is one thing, and whether cloud service providers actually use them is another. ARM has been active in the server field, and new products have been announced publicly in recent years. Judging from what cloud vendors have actually deployed, though, it is still too early for these ARM server chips to take share from x86. In particular, the number of instance options and their scale are still not comparable to traditional x86 servers, and are closer to those of heterogeneous instances built on GPUs, FPGAs, and NPUs.
Amazon sticks with in-house chips
Amazon's AWS appears to have been the first cloud service provider to introduce ARM servers. Amazon set out on the path of in-house development when it acquired Annapurna Labs in 2015, and it has deployed and disclosed three generations of its self-developed Graviton ARM processors since 2018. From Graviton to Graviton3 the clock frequency has risen only modestly, from 2.3GHz to 2.6GHz, yet the improvement in performance is quite considerable.
Graviton3 boost under real workloads / Amazon
The workloads Amazon chose for its performance comparison largely define the application scope of this ARM processor. The gains in NGINX, Node.js, and Redis stand for web server workloads, while the gains in x264 and x265 encoding speed and AES-256 encryption speed stand for media server and encryption workloads. As for the improvement in machine learning, a CPU alone may still be suitable for some inference work, but its overall competitiveness remains inferior to that of general-purpose GPUs.
Microsoft's bumpy ARM road
Microsoft's Azure is a more complicated story. In the past, Microsoft's ARM ecosystem was deeply tied to Qualcomm. Even before servers came into the picture, Microsoft had stumbled with ARM on consumer notebooks: the Qualcomm-based SQ series processors were criticized by users for performance bottlenecks.
In 2017, Microsoft announced the Olympus project, which included two ARM chips: Cavium's ThunderX2 and Qualcomm's Centriq 2400. However, Qualcomm, although it is preparing to keep developing ARM notebook chips after acquiring NUVIA, has withdrawn from the server chip business, and Cavium stopped developing the ThunderX series soon after it was acquired by Marvell.
Faced with this gap, Microsoft seems to have made up its mind to fill it itself. At the end of 2020, reports emerged that Microsoft was developing its own ARM server chips. For a high-margin cloud service business, self-developed processors would cut costs further. However, no official release of the self-developed chip has appeared so far. Instead came the news that Azure is adopting Ampere's ARM processors.
Ampere Altra / Ampere
Microsoft recently announced the general-purpose Dpsv5 and memory-optimized Epsv5 instances based on Ampere Altra ARM processors, with a maximum frequency of 3.0GHz and options of up to 64 vCPUs. Microsoft says the ARM instances are up to 50% more cost-effective than their x86 counterparts.
Alibaba's multi-pronged approach
When it comes to using Ampere's ARM processors, Alibaba Cloud in China is actually a step ahead. As early as last year, Alibaba Cloud opened test applications for Ampere ARM servers. The ARM server instances on Alibaba Cloud include the general-purpose g6r and the compute-optimized c6r, both of which use Ampere Altra processors. Both are built on Alibaba Cloud's own third-generation Shenlong architecture, with options of up to 64 vCPUs.
g6r instance / Alibaba Cloud
According to Alibaba Cloud's official website, the processors in the g6r and c6r instances run at 2.8GHz. Comparing this with Ampere's official data suggests that Alibaba Cloud uses the Ampere Altra Q80-28, which has a TDP of 185W and sits in the third tier by clock speed among the 80-core Ampere Altra parts, although Alibaba Cloud offers at most 64 vCPUs. The highest-specification Ampere Altra runs at 3.3GHz, which is close to the 3.5GHz turbo frequency of the Intel Xeon Platinum 8369B used by Alibaba Cloud's flagship g7 instance. As mentioned earlier, the advantage of ARM processors is cost: compared with Intel x86 instances with the same vCPU and memory configuration, the Ampere Altra ARM instances are priced 30% lower.
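The pricing claim can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, the 30% discount is the figure quoted above, and any x86 price or relative performance figure passed in is a placeholder rather than a published number.

```python
def arm_price(x86_price: float, discount: float = 0.30) -> float:
    """Price of an ARM instance that is `discount` cheaper than its x86 peer."""
    return x86_price * (1.0 - discount)


def price_performance_ratio(relative_perf: float, discount: float = 0.30) -> float:
    """ARM performance per unit cost relative to x86 (1.0 means parity).

    relative_perf is ARM throughput divided by x86 throughput for the same
    vCPU and memory configuration. It is workload dependent and has to be
    measured; nothing here assumes a particular value.
    """
    return relative_perf / (1.0 - discount)


def break_even_relative_perf(discount: float = 0.30) -> float:
    """Smallest relative performance at which ARM is no worse per unit cost."""
    return 1.0 - discount


if __name__ == "__main__":
    # Hypothetical x86 price of 1.00 per hour, purely for illustration.
    print(f"ARM price: {arm_price(1.00):.2f}")
    print(f"Break-even relative performance: {break_even_relative_perf():.2f}")
```

Under these assumptions, an ARM instance only has to deliver about 70% of the x86 instance's throughput on a given workload to match it on cost, which helps explain why lighter workloads are the first to move.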
Yitian 710 / Alibaba
We should not forget the Yitian 710 chip that Alibaba released last year. This chip, based on the ARMv9 architecture, supports up to 128 cores and is clearly aimed at high-performance computing. However, it has not yet been officially deployed on public instances. Considering that the Yitian 710 is built on a 5nm process, the delay is likely due to production capacity issues.
ARM servers are still seen as a cost-effective option
Judging from the deployments of the major cloud service providers, ARM servers are still regarded as a way to reduce costs and energy consumption. Their main focus remains general-purpose computing, and they cannot yet take on the x86 high-performance computing market. Most current ARM servers are used for web servers, application servers, small and medium-sized databases, game servers, and media servers, where computing pressure is not high, while compute-heavy applications such as data analysis and batch computing are still dominated by x86.
In addition, although ARM server processors now support virtualization seamlessly, they lack the simultaneous multithreading found on x86 processors, so one vCPU corresponds to one core. Ampere Altra comes in so many SKUs with different core counts and frequencies because some ARM processors have no dynamic frequency circuitry: the listed frequency is the maximum static clock frequency, and there is no dynamic boost function like Intel Turbo.
Beyond that, there are not many players in ARM server chips at present, and only Ampere is able to win orders from multiple cloud service providers. Self-developed chips such as Amazon's Graviton serve only their maker's own cloud business, and Nvidia's ARM server processors are still some time away. This also shows how difficult it is to be a third-party supplier in this market; otherwise Marvell and Qualcomm would not have withdrawn one after another.
Production capacity is also gradually affecting the pace of deployment. Amazon's Graviton3 was announced in November last year but has not been deployed so far. Ampere has released Altra Max products with up to 128 cores, yet Alibaba Cloud and the newly announced Azure instances still use Altra. Add the Yitian 710 mentioned above, and it is clear that ARM still has some way to go before it stands on an equal footing with x86 in the cloud.
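The vCPU-to-core relationship described above can be illustrated with a minimal sketch. It assumes that an x86 instance exposes two hardware threads per physical core (as with Hyper-Threading enabled) and that an ARM instance such as one based on Ampere Altra exposes one. The function name is hypothetical, and actual thread counts depend on the provider and instance type.

```python
def physical_cores(vcpus: int, threads_per_core: int) -> int:
    """Number of physical cores backing an instance with `vcpus` vCPUs.

    threads_per_core is 1 when each vCPU is a whole core (no SMT) and
    2 under the two-threads-per-core assumption used here for x86.
    """
    if vcpus < 1 or threads_per_core < 1:
        raise ValueError("vcpus and threads_per_core must be positive")
    if vcpus % threads_per_core != 0:
        raise ValueError("vcpus must be a multiple of threads_per_core")
    return vcpus // threads_per_core


if __name__ == "__main__":
    # A 64 vCPU instance under each assumption.
    print("ARM, 1 thread per core:", physical_cores(64, 1), "cores")
    print("x86, 2 threads per core:", physical_cores(64, 2), "cores")
```

Under these assumptions, instances with the same vCPU count are backed by different numbers of physical cores, which is worth keeping in mind when comparing prices per vCPU.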