The text is from official account.



The most important interaction of the future will not be today's human-computer interaction but the interaction between humans and artificial intelligence. The artificial intelligence industry has made considerable progress, and its products, including intelligent voice robots, have gradually entered tens of millions of households and become part of many people's daily lives. In this article, the author takes intelligent customer service as the starting point and draws on their own work experience for a detailed analysis.

On March 4, the Standing Committee of the Political Bureau of the CPC Central Committee held a meeting and called for faster progress on new infrastructure construction, including 5G networks, artificial intelligence and data centers.

Given the important role that AI-enabling technology has played in epidemic prevention and control, together with the new infrastructure policy, AI can be expected to enter a new round of development.

According to the “Research Report on China’s artificial intelligence industry in 2019” released by iResearch, by 2022 China’s intelligent customer service business will exceed 16 billion yuan and the broader (pan) intelligent customer service market will exceed 60 billion yuan, which indicates that this field still has plenty of room to grow.

This article focuses on intelligent customer service, one of the more mature application areas of AI, and analyzes it in detail in light of the author's own work experience.

1、 Background of intelligent customer service

Built on the traditional customer service system, the intelligent voice customer service robot integrates a number of intelligent interaction technologies, such as speech recognition, semantic understanding, knowledge graphs and deep learning. It can accurately understand the user’s intent or question and then give a satisfactory answer drawn from rich content and a large knowledge graph. At present, it is widely used in finance, insurance, automobiles, real estate, e-commerce and government.

Compared with traditional customer service, intelligent customer service reduces costs and increases efficiency, improves the conversion rate of business opportunities, and improves the user experience. It is also more convenient and concise, more mobile and timely, and easier to integrate with social channels.

Intelligent customer service robots have a wide range of application scenarios:

By interaction mode, they fall into two categories: text customer service robots and voice customer service robots;

By scenario and function, they can be divided into question-answering robots, task robots and chat robots.

So how is a voice outbound robot applied in a real scenario?

The rest of this article covers four topics: the workflow of the voice outbound robot, the construction of the outbound call system, an application case, and the key and difficult points of application.

2、 Workflow of intelligent outbound robot

An AI outbound robot is a multi-functional intelligent voice dialogue robot that integrates automatic dialing, multi-round voice interaction, intelligent grading of customer intent, and customizable outbound tasks.

The basic workflow of an intelligent outbound robot is as follows:

As shown in the figure above, a complete intelligent outbound call (without transfer to a human agent) consists of four stages, which the outbound call system chains together in sequence:

User answering: at the beginning of the outbound call workflow, the outbound call system needs to recognize the signal that the user has answered.

Customer service robot response: the key to this stage is the strategy output. The outbound call system needs to identify the user’s intent or action from the user’s response, and reply with the script dictated by the robot’s preset task flow and strategy.

User response: the key to this stage is to identify the user’s intent and actions accurately, and to record the user’s state so that the next step of the strategy can be carried out.

User / customer service robot hang-up: when the robot finishes the task flow it hangs up on its own initiative, or the user hangs up early, and the outbound workflow ends.
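
The four stages above can be read as a small state machine. The following is a minimal sketch under that assumption. The state names, the extra `DIALING` start state and the `advance` helper are all hypothetical and not taken from any real outbound call system.

```python
from enum import Enum, auto


class CallState(Enum):
    DIALING = auto()           # assumed start state, before the user answers
    USER_ANSWERED = auto()
    ROBOT_RESPONDING = auto()
    USER_RESPONDING = auto()
    HUNG_UP = auto()


# Allowed transitions between stages (illustrative only).
# Either side may hang up at any point, so HUNG_UP is reachable from every live state.
TRANSITIONS = {
    CallState.DIALING: {CallState.USER_ANSWERED, CallState.HUNG_UP},
    CallState.USER_ANSWERED: {CallState.ROBOT_RESPONDING, CallState.HUNG_UP},
    CallState.ROBOT_RESPONDING: {CallState.USER_RESPONDING, CallState.HUNG_UP},
    CallState.USER_RESPONDING: {CallState.ROBOT_RESPONDING, CallState.HUNG_UP},
    CallState.HUNG_UP: set(),
}


def advance(current: CallState, nxt: CallState) -> CallState:
    """Move to the next stage, rejecting transitions the workflow does not allow."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {nxt.name}")
    return nxt
```

The robot response and user response stages alternate until one side hangs up, which is what makes the call a multi-round dialogue.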

3、 Design of outbound call system

The workflow above depends on the outbound call system and involves technology from several parties. The following is an overview of the underlying architecture of the outbound call system.

The figure above shows the call system architecture that the author has put together based on the robot's actual business logic. On the whole, the voice outbound call system can be divided into five modules:

1. Communication management module

It consists of the communication lines and the FreeSWITCH telephone system. It transmits signaling and voice streams over the SIP and RTP protocols. The communication lines come from the three major carriers and from various line integrators, and they provide the line resources used to place calls.

FreeSWITCH (FS) is an open source telephone system, used here mainly to handle outbound calls and to carry SIP signaling and voice streams.

2. Voice module

This module is responsible for speech-related operations, including speech recognition (ASR), speech synthesis (TTS), recording and playback.

ASR and TTS generally rely on the services of mature technology providers such as Alibaba Cloud and iFLYTEK, mostly accessed through their interfaces.

3. Central control module

Its main task is to connect the other modules: it passes the text recognized by ASR to the robot module, converts the robot module's strategy instructions into instructions that the telephone system can execute, and synchronizes data to the SaaS background. (The name of the central control module varies from company to company.)

4. Background management module

This module is responsible for launching the robot's outbound tasks and for the related business operations, mainly outbound task creation, call record queries, customer management and data statistics.

5. Robot management module

This is the core AI module of the whole outbound call process. It realizes human-computer interaction through natural language processing (NLP) and dialogue management (DM), which cover understanding the user's intent, tracking the dialogue state, matching the robot's response strategy, and so on.

Because the NLP and DM modules are complex, the author will describe the design of a task robot's dialogue system separately in the next article and will not go into more detail here.

4、 Application cases

Taking the used-car business of 58.com (58 Tongcheng) as the scenario, this section analyzes how the outbound robot works through the outbound call system, and how the system's modules fit together to meet the business requirements.

1. Dialogue management design

Normally, once the outbound call business scenario has been decided, the product team needs to map out the main flow of the task scenario, decide which intents to cover in depth, set up matching QA pairs, define slots, prepare scripts, design dialogue state tracking, design the dialogue strategy, and so on.

This design and configuration of dialogue management takes place in the robot management module of the outbound call system.

For example, the robot scripts in the dialog box above were designed in advance for the used-car return visit business.
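
To make the idea of slots and scripts concrete, here is a minimal sketch of a slot-filling task flow. The slot names, the prompt wording and the `TaskFlow` class are hypothetical illustrations. They are not the actual configuration used in the used-car return visit business.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slot:
    name: str
    prompt: str                  # script the robot plays to ask for this slot
    value: Optional[str] = None  # filled in once the user's answer is understood


@dataclass
class TaskFlow:
    slots: List[Slot] = field(default_factory=list)

    def fill(self, name: str, value: str) -> None:
        """Record the value extracted from the user's reply for one slot."""
        for slot in self.slots:
            if slot.name == name:
                slot.value = value
                return
        raise KeyError(name)

    def next_prompt(self) -> Optional[str]:
        """Return the script for the first unfilled slot, or None when all are filled."""
        for slot in self.slots:
            if slot.value is None:
                return slot.prompt
        return None


def used_car_revisit_flow() -> TaskFlow:
    # Hypothetical slots and wording, for illustration only.
    return TaskFlow([
        Slot("still_interested", "Are you still looking for a used car?"),
        Slot("budget", "What is your approximate budget?"),
    ])
```

In this sketch the main flow is simply the order of the slots. A real configuration would also attach intents, QA pairs and fallback scripts to each step.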

2. Create outbound call task

Once dialogue management has been fully configured, business staff can create the outbound call list in the SaaS background. The communication management module then receives the task instructions and pulls the call list to start dialing.

3. Dialing process

The dialing process involves several modules: the communication management module, the voice module, the central control module and the robot management module.

Following the outbound call task created by the business staff, the carrier lines begin to dial the users one by one;

Once a user answers, the call enters the dialogue processing cycle;

FreeSWITCH (FS) in the communication management module sends the user's voice stream to the voice module, where ASR converts it to text, and the action and text information are then passed on together to the central control module;

The central control module pushes the user's text and action information to the robot module, and converts the strategy instruction returned by the robot into an instruction that the telephone system can execute;

The telephone system works with the voice module to synthesize speech and then carries out the robot's action, such as playing audio, transferring to a human agent or hanging up, after which a new round of the dialogue cycle begins;

After the robot or the user hangs up, the central control module stores the recording files, system information, status information and other data, and synchronizes them to the management background.
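
The dialogue processing cycle described above can be sketched as a simple loop in which the central control module relays data between the other modules. This is only an illustration under stated assumptions: each module is modeled as a plain Python callable, and the names `run_call`, `Asr`, `Robot` and `Tts` and the action strings are hypothetical. A real system would talk to FreeSWITCH and to vendor ASR/TTS services instead.

```python
from typing import Callable, Dict, List, Tuple

Asr = Callable[[bytes], str]              # voice module: user audio -> text
Robot = Callable[[str], Tuple[str, str]]  # robot module: text -> (action, reply text)
Tts = Callable[[str], bytes]              # voice module: reply text -> audio


def run_call(audio_turns: List[bytes], asr: Asr, robot: Robot, tts: Tts) -> List[Dict]:
    """Relay each user turn through ASR, the robot and TTS until the call ends."""
    call_log: List[Dict] = []
    for audio in audio_turns:
        text = asr(audio)                  # speech recognition
        action, reply = robot(text)        # strategy returned by the robot module
        played = tts(reply) if action == "play" else b""
        call_log.append({
            "user_text": text,
            "action": action,
            "reply": reply,
            "audio_bytes": len(played),
        })
        if action in ("hangup", "transfer"):
            break                          # robot ends the call or hands it to a human
    # Running out of audio turns models the user hanging up first.
    return call_log                        # data to synchronize to the management background
```

The returned log stands in for the recordings and status information that the central control module stores after the call.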

5、 Key and difficult points of intelligent outbound robot application

We judge the quality of an outbound robot's calls from two angles: whether the call flows smoothly, and whether the outbound task is completed.

Many factors affect the quality of robot outbound calls. From the product point of view, apart from uncontrollable factors such as how well the target customers are chosen, the environment in which customers answer the phone, and the state the customers are in, quality is mainly limited by the following aspects:

1. Stability of telephone line

Most call failures are caused by unstable lines from the supplier.

To avoid this problem, it is best to apply for lines directly from the basic carriers, or to use certified suppliers from formal channels, so that line quality is guaranteed.

2. Concurrency of FreeSWITCH

FreeSWITCH performance varies greatly with the actual deployment environment. If the FS concurrency limit is set too low because demand was underestimated early on, calls beyond that limit will fail abnormally or suffer choppy audio.

The concurrency limit should be derived from the system's actual business requirements so that FS performance remains stable.
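
One way to derive a concurrency figure from business requirements is Little's law: the average number of calls in progress equals the call arrival rate multiplied by the average call duration. The sketch below applies that rule. The function name, the `headroom` parameter and the example figures are assumptions for illustration, and the result is a planning estimate rather than a FreeSWITCH setting.

```python
import math


def required_concurrency(calls_per_hour: float,
                         avg_call_seconds: float,
                         headroom: float = 0.2) -> int:
    """Estimate concurrent channels: average calls in progress plus a safety margin."""
    if calls_per_hour < 0 or avg_call_seconds < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    # Little's law: average in progress = arrival rate * average duration.
    average_in_progress = calls_per_hour * avg_call_seconds / 3600.0
    # Traffic is bursty, so add headroom above the average (assumed 20% by default).
    return math.ceil(average_in_progress * (1.0 + headroom))
```

For example, 6,000 calls per hour at an average of 45 seconds each (ringing time included) means 75 calls in progress on average, so roughly 90 channels with 20% headroom. The right margin depends on how uneven the dialing schedule is and should be confirmed by load testing.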

3. ASR recognition accuracy

Although many suppliers quote speech recognition rates of 97% or even 98%, those figures assume favorable acoustic conditions.

In real environments, ASR accuracy drops to some extent when there is background noise, accented speech or mixed-language speech.
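
To check a supplier's quoted figure against your own call recordings, you can compare ASR output with manually corrected transcripts. The following is a minimal sketch that computes character-level accuracy from the edit distance. It assumes accuracy is defined as one minus the character error rate, which may differ from the definition a given supplier uses.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference transcript and an ASR hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Return 1 - character error rate, floored at 0."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))
```

Measuring this separately for quiet, noisy and accented samples shows how far real-world accuracy falls below the quoted figure.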

4. Semantic understanding

In a dialogue robot, the natural language understanding (NLU) module mainly consists of intent recognition and slot recognition, and these directly determine the quality of semantic understanding.

In voice scenarios, users often reply with single-word fillers such as “um” or “ah”, or express special intents such as “speak louder”, “speak faster” or “repeat that”. The intent design should account for these special cases and the response strategies that go with them.

The ASR recognition errors mentioned above also affect semantic understanding. One optimization currently available is to add multi-modal learning that fuses audio features to correct the speech recognition results. In verification, this scheme improved the accuracy of the intent recognition module by nearly 2%.
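
The special cases mentioned above (fillers and requests such as “repeat” or “speak louder”) are often handled before the main intent classifier runs. Here is a minimal rule-based sketch of that pre-check. The filler list, the patterns and the intent names are hypothetical examples in English. A production system would use its own language and probably a trained model.

```python
import re
from typing import Optional

# Hypothetical filler words that carry no task-related intent.
FILLERS = {"um", "uh", "ah", "hmm", "oh"}

# Hypothetical patterns for voice-specific intents.
SPECIAL_INTENTS = [
    ("repeat", re.compile(r"\b(repeat|say (that|it) again|pardon)\b")),
    ("speak_louder", re.compile(r"\b(louder|can't hear|cannot hear)\b")),
    ("speak_faster", re.compile(r"\b(faster|hurry up)\b")),
]


def detect_special_intent(utterance: str) -> Optional[str]:
    """Return a voice-specific intent name, or None to fall through to normal NLU."""
    text = utterance.lower().strip(" .,!?")
    if text in FILLERS:
        return "filler"
    for intent, pattern in SPECIAL_INTENTS:
        if pattern.search(text):
            return intent
    return None
```

Each returned intent would map to its own response strategy, for example replaying the last script for `repeat` or waiting briefly after a `filler`.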

5. Rationality of dialogue management module design

How well the robot's dialogue management module is designed directly determines the user experience and the completion rate of the whole call task.

The dialogue management module centers on sound design of dialogue state tracking (DST) and the dialogue policy (DPL), for example how to handle voice-specific situations such as interruptions and silence so that the user experience improves and the outbound call proceeds normally.
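
As an illustration of handling silence and interruption, the sketch below tracks a small dialogue state and picks an action for each voice event. The event names, the action names, the reprompt wording and the limit of two consecutive silences are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_SILENCES = 2  # assumed number of reprompts before giving up


@dataclass
class DialogueState:
    silence_count: int = 0
    last_prompt: str = ""


def next_action(state: DialogueState, event: str) -> Tuple[str, str]:
    """Map a voice event to (action, script), updating the tracked state."""
    if event == "silence":
        state.silence_count += 1
        if state.silence_count > MAX_SILENCES:
            return ("hangup", "Sorry to bother you. Goodbye.")
        # Check the line, then repeat the question the user did not answer.
        return ("play", "Hello, can you hear me? " + state.last_prompt)
    if event == "barge_in":
        # The user started talking over the robot: stop playback and listen.
        return ("stop_playback", "")
    if event == "speech":
        state.silence_count = 0
        return ("continue", "")
    raise ValueError(f"unknown event: {event}")
```

Keeping the silence counter in the tracked state is what lets the policy escalate from a reprompt to a polite hang-up instead of looping forever.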

6. The rationality of script design

Script design is also a very important part of designing a voice task robot, because it directly shapes the user experience.

Scripts can follow these principles:

Match the script to the application scenario;

Keep the main-flow scripts simple and engaging;

Make the wording sound human;

Vary the script according to the dialogue state.

6、 Conclusion

With the continuous progress of AI technology and the further expansion of market demand, intelligent voice robots are performing better and better in practical application scenarios and are gradually becoming capable of handling more business tasks.

Difficulties remain, however. As AI technology continues to advance, the capabilities of intelligent customer service robots should improve further, so that we can enjoy more attentive and more intelligent robot services in our daily lives.



[This article was originally published by @Cen Wei on Everyone Is a Product Manager. Thank you very much!]
