2020 is an extraordinary year. Profound changes are under way in global health, trade, economics, culture, politics, and science and technology. In the author's own field, it also marks roughly the tenth anniversary of the modern wave of artificial intelligence (AI). Over those first ten years AI technology made great strides, yet many problems remain unsolved. So how will AI develop from here? Drawing on research results from academia and industry, as well as the author's own views, this article explores the future of AI together with readers along four dimensions: computing power, data, algorithms, and engineering.
Let us first analyze trends in data. Data is to AI what ingredients are to a delicious dish. Over the past ten years, the availability of data, in quantity, quality, and variety alike, has increased dramatically, underpinning the development of AI technology. What, then, are the trends at the data level? Consider a set of figures.
First, the number of Internet users worldwide has reached the order of billions, and with the further rollout of the Internet of Things and 5G, both data sources and transmission capacity will keep improving. It is therefore reasonable to predict that the total volume of data will continue to grow rapidly, and at an accelerating rate: from 33 ZB in 2018 (1 ZB = 10^12 GB) to a projected 175 ZB in 2025.
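As a quick check, the annual growth rate implied by those projections can be computed directly (the ZB figures are the forecasts quoted above):

```python
# Rough check of the projected data-volume growth cited above:
# 33 ZB in 2018 growing to a projected 175 ZB in 2025.
def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

growth = cagr(33, 175, 2025 - 2018)
print(f"Implied annual growth rate: {growth:.1%}")  # roughly 27% per year
```

That is, the forecast implies data volume growing by roughly a quarter every year.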
Second, industry forecasts predict that data will mainly be stored centrally, with the share held in public clouds increasing year by year, as shown in Figures 2 and 3.
These trends can be summarized as follows: the volume of data keeps growing; centralized cloud storage dominates; and public-cloud penetration continues to rise. From the perspective of AI technology, a continuous supply of data therefore seems assured.
On the other hand, AI needs not only raw data but also annotated data. Annotation can be divided into three categories: automatic, semi-automatic, and manual.
So what is the future trend for annotated data?
The market for data-annotation tools offers a glimpse. Over the next 5 to 10 years, manual annotation will most likely remain the main source of labeled data, accounting for more than 75% of it.
From this analysis of the data dimension we can conclude that the volume of data itself will not limit AI, but the cost and scale of manual annotation may well constrain it. That pressure will force AI research toward breakthroughs at the algorithmic and technical level that effectively reduce the dependence on data, and on manually annotated data in particular.
Next, computing power. Computing power is the infrastructure supporting AI, just as the kitchen stove supports the cooking of a fine meal.
Computing power here means the hardware needed to run AI systems, and the development of semiconductor chips is its fundamental driver. The good news is that, despite ups and downs and recurring doubts about its sustainability, the semiconductor industry's famous Moore's law has held for more than half a century, and it is widely believed it will continue to hold over the next 5 to 10 years.
It is worth noting, however, that Moore's law has held in computing chips largely because the rapid development of graphics processing units (GPUs) has compensated for the slowdown in general-purpose processors (CPUs). GPU transistor counts have overtaken those of CPUs, and CPU transistor counts have begun to lag behind Moore's law.
Transistor counts reflect the overall trend, but they are not a precise measure of computing power. For AI systems, floating-point throughput and memory bandwidth are more direct indicators. Comparing GPU and CPU performance, as shown in Figure 7, the GPU has far surpassed the CPU in both compute and memory-access speed over the past ten years, alleviating the bottleneck in CPU performance growth.
Moreover, according to data compiled by the Qianzhan Industry Research Institute, GPU chips accounted for about 27% of AI-chip revenue in 2019, while CPU chips accounted for only 17%. The GPU has thus become the standard hardware configuration for the deep-learning-centered field of artificial intelligence. The reason is simple: existing AI algorithms, especially in the model-training stage, demand ever more compute, and GPUs deliver far more of it than CPUs while remaining general-purpose devices only loosely coupled to any particular AI model.
Beyond GPUs and CPUs, other devices such as ASICs, FPGAs, and other emerging AI chips are also developing and deserve the industry's attention. Given that data will still mostly be stored in the cloud, it remains to be seen whether these chips can improve performance and efficiency while staying general-purpose, be deployed at scale by cloud vendors, and win software-ecosystem support.
Now to algorithms. Algorithms are to AI what the chef is to a fine meal. Data and computing power have played important roles in AI's development over the past ten years, but it is undeniably the breakthrough performance of deep learning, combined with its applications, that carried AI to the milestone stage it has reached in 2020.
So what is the future of AI algorithms? This is one of the core questions debated in academia and industry. A common consensus is that the past decade's progress owes much to deep learning, but that the computing costs of continuing down this path are hard to sustain. Consider a chart and a set of figures:
1. According to OpenAI's latest estimates, the compute used to train a large-scale AI model has grown 300,000-fold since 2012, an average of 11.5x per year, while hardware compute under Moore's law grows only about 1.4x per year; on the other hand, improvements in algorithmic efficiency save roughly 1.7x in compute per year. Pursuing ever-better algorithm performance therefore leaves an annual compute deficit of roughly 8.5x, which is worrying. A concrete example is GPT-3, the large pre-trained natural-language model released this year: its training alone is estimated to have cost about 13 million US dollars. Whether this approach is sustainable is worth pondering.
2. The latest MIT research shows that for an over-parameterized AI model (one with more parameters than training samples), a theoretical upper-bound formula holds: in the ideal case, the required compute grows at least as the fourth power of the required performance. Yet analysis of models on the ImageNet dataset since 2012 shows the reality fluctuating around the ninth power, which means existing algorithms and their implementations leave enormous room for efficiency gains.
3. Extrapolating from these data, pushing the error rate of AI algorithms on ImageNet much lower is estimated to require on the order of 10^20 floating-point operations, at a cost of roughly 100 billion US dollars, which is unaffordable.
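The arithmetic behind the figures quoted above can be reproduced in a few lines (all constants are the estimates cited above, and "performance" is treated loosely as the inverse of the error rate):

```python
# Annual compute deficit implied by the OpenAI and Moore's-law figures.
demand_growth = 11.5    # annual growth in training-compute demand (OpenAI estimate)
hardware_growth = 1.4   # annual hardware gain under Moore's law
algo_savings = 1.7      # annual compute saved by algorithmic efficiency

hardware_gap = demand_growth / hardware_growth               # ~8.2x, the deficit cited above
net_gap = demand_growth / (hardware_growth * algo_savings)   # ~4.8x after efficiency gains
print(f"hardware-only deficit: {hardware_gap:.1f}x, net deficit: {net_gap:.1f}x")

# Compute needed to halve the error rate under each scaling exponent.
def compute_multiplier(perf_gain, exponent):
    """Factor by which compute must grow for a given performance gain."""
    return perf_gain ** exponent

print(compute_multiplier(2, 4))  # theoretical bound (4th power): 16x
print(compute_multiplier(2, 9))  # observed trend (9th power): 512x
```

The gap between 16x and 512x for the same performance gain is exactly the optimization headroom the MIT analysis points to.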
Combining this with the earlier analysis of data and computing power, readers will see that the high future costs of annotation and of compute mean the data dividend and the compute dividend are gradually fading; the core driving force of AI will increasingly be breakthroughs and innovation at the algorithm level. Based on the latest academic and industrial results, the author believes future AI algorithms may have the following characteristics:
(1) Combining prior knowledge representation with deep learning
Looking back over artificial intelligence's roughly 70-year history, symbolism, connectionism, and behaviorism are the three schools formed early in the field's development. Today connectionism, represented by deep learning, has been the mainstream of the past decade, while behaviorism has achieved its major breakthrough in reinforcement learning, with the Go-playing AlphaGo's achievements known far and wide.
Notably, the three once-independent schools are beginning to merge, with deep learning as the main line. In 2013, for example, the reinforcement-learning community invented the DQN (Deep Q-Network), which adopted a neural network and opened up the new research field of deep reinforcement learning.
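To make that merger concrete, the sketch below implements the tabular Q-learning update rule on a made-up three-state toy environment; the DQN's innovation was to keep exactly this update while replacing the Q-table with a deep neural network trained on replayed experience:

```python
import random

# Tabular Q-learning on a toy 3-state chain: action 1 advances toward the
# goal state (reward 1), action 0 stays put. DQN keeps this exact TD update
# but replaces the Q-table with a neural network. Environment is invented
# purely for illustration.
random.seed(0)
n_states, n_actions = 3, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, eps = 0.5, 0.9, 0.2

def step(s, a):
    """Toy environment: action 1 advances; state 2 is the rewarded goal."""
    s2 = min(s + 1, n_states - 1) if a == 1 else s
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward, s2 == n_states - 1

for episode in range(200):
    s, done = 0, False
    while not done:
        if random.random() < eps:                  # epsilon-greedy exploration
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[s2])   # TD target
        Q[s][a] += alpha * (target - Q[s][a])            # Q-learning update
        s = s2

# The learned greedy policy prefers "advance" (action 1) in every non-goal state.
print([max(range(n_actions), key=lambda act: Q[s][act]) for s in range(n_states - 1)])  # [1, 1]
```

Deep reinforcement learning's contribution is that the same loop still works when the state space is far too large for a table.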
Will symbolist methods likewise be integrated with deep learning? A popular candidate is graph network technology, which is merging with deep learning to form the research field of deep graph networks. A graph's data structure readily expresses human prior knowledge, and it offers a more general information representation with built-in relational reasoning (a form of inductive bias); it may hold a key to deep learning's data hunger, weak reasoning ability, and lack of explainable output.
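The core operation of a graph network, aggregating information along the edges that encode prior knowledge, can be sketched in a few lines; the toy graph and features below are invented for illustration, and a real graph neural network would learn weighted versions of this aggregation:

```python
# One round of mean-aggregation message passing on a tiny hand-built graph,
# the core operation that graph neural networks stack and attach learned
# weights to.
graph = {            # adjacency list: a triangle (a, b, c) plus pendant node d
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
features = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}  # one scalar feature per node

def message_pass(graph, feats):
    """Update each node with the mean of its neighbors' features (no learned weights)."""
    return {
        node: sum(feats[nb] for nb in nbs) / len(nbs)
        for node, nbs in graph.items()
    }

# e.g. node "a" becomes (2.0 + 3.0) / 2 = 2.5
print(message_pass(graph, features))
```

Stacking such rounds lets information, and hence prior relational knowledge, propagate across the whole graph.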
(2) Model structures inspired by biological science
Deep-learning models are built on feed-forward computation and back-propagation, structures that are extremely simple compared with biological neural networks. Whether deep learning can draw inspiration from advances in biology and neuroscience to find better model structures is a field worth watching. Another potential breakthrough area is adding uncertainty to the model's parameters, so that models can better handle stochastic uncertainty.
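One simple way to attach uncertainty to a model's outputs is Monte Carlo dropout: run many stochastic forward passes and read predictive uncertainty off the spread of the results. The toy "model" below is a fixed linear function invented purely for illustration:

```python
import random
import statistics

# Toy illustration of Monte Carlo dropout: repeat stochastic forward passes
# of the same model and use the spread of outputs as an uncertainty estimate.
random.seed(1)
weights = [0.5, -0.2, 0.8]   # a fixed "trained" linear model (made up)

def forward(x, p_drop=0.5):
    """Linear model with each weight independently dropped with probability p_drop."""
    kept = [w if random.random() > p_drop else 0.0 for w in weights]
    scale = 1.0 / (1.0 - p_drop)   # rescale so the expected output is unchanged
    return scale * sum(w * xi for w, xi in zip(kept, x))

x = [1.0, 2.0, 3.0]
samples = [forward(x) for _ in range(2000)]
mean = statistics.mean(samples)       # close to the deterministic output 2.5
std = statistics.stdev(samples)       # spread = the model's predictive uncertainty
print(f"prediction ~ {mean:.2f} +/- {std:.2f}")
```

A deterministic pass would simply output 2.5; the extra standard deviation is the new information the stochastic passes provide.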
(3) Data generation
AI model training relies on data, which in itself is no longer a problem; its reliance on manually annotated data is the real headache. Using algorithms to eliminate or greatly reduce that dependence is a hot research field. Indeed DARPA, a constant presence behind the development of AI technology, has made this one of the goals of its AI 3.0 program, a sign of its importance.
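A minimal sketch of one family of such algorithms, self-training with pseudo-labels, using a deliberately simple 1-D threshold classifier and made-up data:

```python
# Toy self-training (pseudo-labeling), one common way to reduce dependence on
# manual annotation: fit on the few labeled points, label the unlabeled pool
# with the model's own confident predictions, then refit on both.
labeled = [(0.1, 0), (0.2, 0), (0.9, 1), (1.0, 1)]   # (feature, label), made up
unlabeled = [0.15, 0.05, 0.85, 0.95, 0.5]            # raw data, no labels

def fit_threshold(data):
    """1-D classifier: decision threshold halfway between the class means."""
    mean0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
    mean1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
    return (mean0 + mean1) / 2

threshold = fit_threshold(labeled)
# Pseudo-label only points far from the boundary (the confident predictions).
pseudo = [(x, int(x > threshold)) for x in unlabeled if abs(x - threshold) > 0.2]
threshold = fit_threshold(labeled + pseudo)
print(f"refit threshold: {threshold:.3f}, pseudo-labels used: {len(pseudo)}")
```

Four of the five unlabeled points get machine-generated labels, so the manual annotation burden covers less than half of the final training set.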
(4) Model self-assessment
Existing AI algorithms, whether machine learning or deep learning, are essentially developed and deployed as open-loop systems. Whether they can be evolved into closed-loop systems by designing model self-assessment into them is another research direction. In control theory, closed-loop feedback systems are markedly more robust to disturbance than open-loop ones, and those properties suggest an idea and a method for improving the robustness and adversarial resistance of AI systems.
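The control-theory intuition can be illustrated with a toy plant under a constant disturbance: the open-loop controller executes a fixed plan and inherits the full disturbance, while the closed-loop controller measures its own error and corrects for it (all constants are made up for illustration):

```python
# Open-loop vs closed-loop control of a toy plant under a constant disturbance.
# The closed loop self-corrects by integrating its measured error; the open
# loop cannot. Self-assessing AI systems aim at the same property.
setpoint, disturbance = 10.0, -2.0

def run(closed_loop, steps=50, gain=0.5):
    state, correction = 0.0, 0.0
    for _ in range(steps):
        if closed_loop:
            correction += gain * (setpoint - state)  # integrate the measured error
        state = setpoint + correction + disturbance  # plant: command plus disturbance
    return state

print(f"open-loop final state:   {run(False):.1f}")  # 8.0: stuck below the target
print(f"closed-loop final state: {run(True):.1f}")   # 10.0: disturbance corrected
```

The open loop settles with a permanent error equal to the disturbance, while the feedback loop drives the error to zero; an AI system that can evaluate its own output could, in principle, correct for "disturbances" such as distribution shift in the same way.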
Having surveyed data, computing power, and algorithms, let us finally look at engineering. Engineering is to AI what kitchenware is to a fine meal: the medium that brings data, computing power, and algorithms together.
The essence of engineering is efficiency: making maximum use of resources and minimizing the losses in converting between them. To continue the simple analogy: a delicious dish requires ingredients, a stove, and a chef, but without the right kitchenware the chef cannot exercise his cooking skills (algorithms), process the ingredients (data), or harness the stove's water, electricity, and gas (computing power). It is therefore predictable that engineering progress will be one of the important means of closing the gap between the observed ninth-power relationship between compute and algorithm performance and the theoretical fourth-power bound.
Over the past ten years, AI engineering has formed a clear tool-chain system, and some noticeable changes have appeared recently. The author summarizes the obvious trends as follows:
In summary, AI engineering is forming a complete tool chain, from client to cloud, with Python as its programming language. Its three important features are: remote programming and debugging; GPU acceleration for deep learning and machine learning; and the decoupling of the model-training and inference tool chains. Meanwhile, heavy investment in open source by upstream vendors in the industry chain will deliver a technology dividend to midstream and downstream companies and individuals, lowering their R&D barriers and costs. In the author's view, the open-source tool chains driven mainly by Microsoft, Facebook, and NVIDIA deserve special attention.
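The training/inference decoupling can be illustrated with a toy stand-in: the "training" side fits a one-parameter model and exports it to a portable file (standing in for real interchange formats such as ONNX or TorchScript), and the inference side loads that file without importing any training code:

```python
import json

# Toy illustration of decoupling training from inference: the training side
# serializes learned parameters to a portable format, and a separate, minimal
# inference routine loads them with no dependency on the training code.
def train(samples):
    """Toy training: least-squares fit of y = w * x through the origin."""
    w = sum(x * y for x, y in samples) / sum(x * x for x, _ in samples)
    return {"weight": w}

def export_model(params, path):
    with open(path, "w") as f:
        json.dump(params, f)   # stand-in for an ONNX/TorchScript-style export

def predict(path, x):
    """Inference side: knows only the file format, not how training worked."""
    with open(path) as f:
        params = json.load(f)
    return params["weight"] * x

export_model(train([(1, 2), (2, 4), (3, 6)]), "model.json")
print(predict("model.json", 10))   # 20.0
```

Because the two sides share only the serialized artifact, the inference tool chain can be optimized, deployed, and versioned independently of the training stack, which is the point of the decoupling trend noted above.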
Some attribute the achievements of AI over the past ten years to data and computing power. Looking ahead, the author boldly predicts that algorithms will be the core driving force of AI's development. At the same time, the practical efficiency of algorithm R&D depends not only on algorithm structure itself but also on the designer's mastery of advanced tool chains.
In the next ten years, will the scientific and technological community achieve truly general intelligence with less data and more economical computing power? We shall see.
Editor in charge: Tzh