Academia, industry and the media may hold different views on how far artificial intelligence has actually developed. I often hear it said that AI based on big data and deep learning is a completely novel form of technology, and that its emergence will comprehensively reshape human society because it can "learn" independently and thereby replace a large amount of human labor.
I think there are two misunderstandings here. First, deep learning is not a new technology. Second, the "learning" involved in deep learning is not the same as human learning, because the system cannot really "deeply" understand the information it is given.
Deep learning is not a new technology
From the perspective of the history of technology, the predecessor of deep learning is the "artificial neural network" approach (also known as "connectionism"), which was popular for a time in the 1980s.
The essence of this technology is to build a simple artificial neural network structure through mathematical modeling. A typical structure has three layers: an input unit layer, an intermediate (hidden) unit layer and an output unit layer. After the input layer obtains information from outside, each unit "decides" whether to pass a signal on to the intermediate layer according to its built-in aggregation algorithm and excitation (activation) function. The process resembles what happens when a human neuron receives electrical pulses from other neurons: it "decides" whether to send pulses on to further neurons according to the change in its own electrical potential.
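The "decision" made by a single unit can be sketched in a few lines of Python. This is only an illustrative sketch: the weighted sum as the aggregation step, the sigmoid as the excitation function, and all numeric values are assumptions chosen for the example, not details given in the text.

```python
import math

def aggregate(inputs, weights, bias):
    """Aggregation step: weighted sum of the incoming signals plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def sigmoid(z):
    """Excitation (activation) function: squashes the sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def unit_output(inputs, weights, bias):
    """Signal that the unit passes on to the next layer."""
    return sigmoid(aggregate(inputs, weights, bias))

# Hypothetical values, chosen only for illustration.
strong = unit_output([1.0, 1.0], [2.0, 2.0], -1.0)   # aggregated sum is 3.0: the unit fires strongly
weak = unit_output([1.0, 1.0], [-2.0, -2.0], 1.0)    # aggregated sum is -3.0: the unit stays nearly silent
```

The same inputs produce a strong or a weak outgoing signal depending only on the connection weights, which is why training such a network amounts to adjusting those weights.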
It should be noted that, whether the overall task performed by the system is image recognition or natural language processing, an observer cannot tell what that task is merely from the operating state of a single computing unit. Rather, the system breaks the macro-level recognition task into micro-level information transmission between its components, and it is the overall trend of these micro-level transmissions that simulates the information processing the human mind carries out at the symbolic level.
The basic method by which engineers steer these micro-level information transmissions is as follows. First, let the system process the input with its initial, essentially random, connection weights, and compare the result with the ideal result. If the two do not match well, the system triggers its "back propagation algorithm" to adjust the connection weights between its computing units, so that its next output differs from the previous one. The greater the connection weight between two units, the more likely they are to be excited together ("co-excitation"), and vice versa. The system then compares the actual output with the ideal output again. If the match is still poor, it runs the back propagation algorithm once more, and so on until the actual output is sufficiently close to the ideal output.
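The loop described above can be sketched as a tiny three-layer network in plain Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the toy task (logical OR), the sigmoid units, the squared-error measure, the learning rate, the number of passes, and every function and variable name are hypothetical choices made for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Input layer -> hidden layer -> one output unit (last weight in each row is a bias)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def total_error(samples, w_hidden, w_out):
    """Sum of squared differences between actual and ideal outputs."""
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in samples)

def train(samples, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Start from random connection weights.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    error_before = total_error(samples, w_hidden, w_out)
    for _ in range(epochs):
        for x, target in samples:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output unit: gradient of 0.5 * (output - target)^2
            # with respect to the unit's aggregated input.
            delta_out = (output - target) * output * (1 - output)
            # Propagate the error signal back to each hidden unit.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(n_hidden)]
            # Adjust the connection weights against the error gradient.
            for j, v in enumerate(hidden + [1.0]):
                w_out[j] -= rate * delta_out * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * delta_hidden[j] * v
    return w_hidden, w_out, error_before, total_error(samples, w_hidden, w_out)

# Toy training set (an assumption for illustration): output 1 if either input is on.
or_samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, error_before, error_after = train(or_samples)
```

Comparing `error_before` with `error_after` shows the mismatch between actual and ideal output shrinking as the weights are adjusted. For simplicity the sketch stops after a fixed number of passes rather than testing whether the outputs already coincide.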
A system that has completed this training process can not only classify the training samples accurately, but can also classify, with reasonable accuracy, inputs that are close to the training samples. For example, if a system has been trained to recognize which photos in an existing photo library show Zhang San's face, it can usually recognize a new photo of Zhang San that was never part of the library.
Readers who still find the above technical description unclear may understand the mechanism of artificial neural networks better through the following analogy. Suppose a foreigner who does not understand Chinese goes to the Shaolin Temple to learn martial arts. How should teaching between master and student proceed? There are two cases. In the first, the two can communicate in language (the foreigner knows Chinese, or the Shaolin master knows the foreigner's language), so the master can teach the foreign apprentice directly by "giving rules". This method of instruction can be loosely compared to rule-based artificial intelligence.
In the second case, master and apprentice have no language in common. How can the student learn martial arts then? The only way is this: the apprentice first observes the master's movement and then imitates it. The master tells the apprentice whether the movement is right through simple physical signals (for example, if it is right, the master smiles; if it is wrong, the master strikes the apprentice with a stick). If the master approves of a movement, the apprentice remembers it and goes on learning. If not, the apprentice has to guess what went wrong, try a new movement based on that guess, and wait for the master's feedback again, until the master is finally satisfied. Obviously this way of learning martial arts is very inefficient, because the apprentice wastes a great deal of time guessing where the movements went wrong. Yet "blind guessing" is exactly the essence of how an artificial neural network operates. In short, such an artificial intelligence system does not actually know what its input means. In other words, the designer of the system cannot communicate with it at the symbolic level, just as the master in the example cannot communicate with the apprentice. The reason this inefficient learning can be tolerated in a computer is that the computer has one great advantage over a human being: it can "guess" an enormous number of times in a very short span of physical time and select the more correct solution. Once we see the mechanism clearly, it is not difficult to find that the working principle of the artificial neural network is actually very clumsy.
"Deep learning" should be called "deep-layer learning"
So why has "neural network technology" now acquired the successor name "deep learning"? What does this new name mean?
We have to admit that "deep learning" is a confusing term, because it leads many laymen to think that artificial intelligence systems can "deeply" understand what they learn, as humans do. But the truth is this: by the human standard of "understanding", such a system cannot achieve even the most superficial understanding of the original information.
To avoid such misunderstandings, the author is in favor of calling "deep learning" "deep-layer learning" instead. The real meaning of the English term "deep learning" is an upgrade of the traditional artificial neural network, namely an increase in the number of hidden unit layers. The advantage is that the information processing of the whole system becomes finer-grained, so that more features of the object can be captured across more intermediate layers.
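The sense in which "deep" refers only to the number of hidden layers can be made concrete with a small Python sketch. It is illustrative only: the layer sizes, the random weights, the sigmoid units and the helper names `build_network` and `forward` are assumptions for the example, and the networks here are untrained.

```python
import math
import random

def build_network(layer_sizes, seed=0):
    """Create random weights between consecutive layers (one extra bias weight per unit)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]

def forward(network, x):
    """Pass the input through every layer, keeping each layer's activations."""
    activations = [list(x)]
    for layer in network:
        prev = activations[-1] + [1.0]  # append the constant bias input
        activations.append(
            [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, prev)))) for row in layer]
        )
    return activations

# A traditional three-layer network: 4 inputs, one hidden layer, 1 output.
shallow = build_network([4, 3, 1])
# A "deep" network: the same inputs and output, but four hidden layers.
deep = build_network([4, 6, 5, 4, 3, 1])

hidden_layers_shallow = len(shallow) - 1   # number of hidden layers: 1
hidden_layers_deep = len(deep) - 1         # number of hidden layers: 4

activations = forward(deep, [0.2, 0.4, 0.6, 0.8])
hidden_sizes = [len(a) for a in activations[1:-1]]   # sizes of the intermediate layers
```

Nothing about the individual units changes between the two networks. The only difference is how many intermediate layers the signal passes through, each of which can hold features at a different level of abstraction once the network is trained.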
For example, in a deep learning system for face recognition, a larger number of intermediate layers can handle features at different levels of abstraction more finely, such as raw pixels, the edges of color blocks, combinations of lines, and facial features. Such fine-grained processing can certainly improve the recognition ability of the whole system.
However, it should be noted that the mathematical complexity and data diversity that this added "depth" brings to the system place high demands on computer hardware and on the amount of training data. This also explains why deep learning has become more and more popular since the start of the 21st century. It is the rapid development of computer hardware and the huge amount of data brought by the spread of the Internet in recent decades that have made it possible for deep learning to take root and flourish.
However, two bottlenecks hinder neural network and deep learning technology from becoming more "intelligent":
First, once the system has converged after training, its learning ability declines; that is, the system can no longer adjust its weights in response to new input. This is not our ultimate ideal. Our ideal is this: even if the network has converged prematurely because of the limitations of the training sample database, it should still be able to revise its original input-output mapping independently when it meets new samples, and to make this revision take account of both the old history and the new data. However, existing technology cannot support this seemingly grand idea. What designers can do now is reset the system's historical knowledge to zero, add the new samples to the sample database, and train from scratch. Here we undoubtedly see the chilling "Sisyphus cycle" once again.
Second, as the previous example shows, in pattern recognition with neural networks and deep learning, designers spend a great deal of effort on extracting features from the original samples. Obviously, different designers will extract features from the same original sample in different ways, which leads the modeling in different directions. For human programmers this is a good opportunity to show their creativity, but for the system itself it amounts to being deprived of the chance to carry out creative activity. Imagine: could a neural network or deep learning structure designed in this way observe the original samples by itself, find an appropriate mode of feature extraction, and design its own topology? It seems difficult, because this appears to require a meta-structure behind the structure, one that can give a reflective representation of the structure itself. We are still in a fog about how such a meta-structure should be programmed, because at present it is we ourselves who perform its function.
It is disappointing that, although deep learning has these basic defects, the current mainstream AI community has been "brainwashed" into believing that deep learning is the whole of AI. A more flexible and more general artificial intelligence technology based on small data obviously requires far more effort. From a purely academic point of view, we are still far from this goal.