Speech recognition is one of the most natural and concise means of human-computer communication. It enables a machine to automatically recognize and understand what a speaker says and to convert the speech signal into the corresponding text or command. Depending on the application, speech recognition systems can be classified as speaker-dependent or speaker-independent, isolated-word or continuous-speech, and small-vocabulary or unlimited-vocabulary. Considering cost and application scope, this paper adopts a speaker-independent, isolated-word, small-vocabulary speech recognition system based on the TMS320VC5509 DSP. Practical tests show that the DSP-based speech recognition system achieves good real-time performance and a high recognition rate, and that the calculator built on it performs real-time numerical calculation with high accuracy, which largely resolves the difficulty of using a calculator for special user groups and in special settings.
The speech recognition process mainly consists of speech signal preprocessing, feature extraction, and pattern matching. Once the speech signal has been input, it must be digitized and preprocessed before recognition can take place. Feature extraction is an essential step for both training and recognition: the Mel-frequency cepstral coefficients (MFCCs) of each frame are extracted as the feature values of the speech signal. Current pattern matching approaches include dynamic time warping (DTW), hidden Markov models (HMM), and artificial neural networks (ANN). This paper uses the HMM: the feature values extracted during training are stored in the reference pattern library and matched against the feature values of the speech signal to be recognized. The matching calculation is the core of speech recognition. After feature extraction, the speech to be recognized is matched against the templates generated during system training. During recognition, the word whose model is most similar to the input speech is taken as the recognition result.
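The matching step can be illustrated with a minimal sketch. It assumes one small HMM per vocabulary word with diagonal-covariance Gaussian emissions, scores the feature frames of an utterance with the forward algorithm in the log domain, and returns the word whose model gives the highest likelihood. The helper `left_to_right`, the two-state topology, the transition probabilities, and the toy two-dimensional "features" are all hypothetical and chosen only for illustration; they are not the models or parameters used in this paper.

```python
import math

def safe_log(p):
    """Log of a probability, mapping zero to minus infinity."""
    return math.log(p) if p > 0 else -math.inf

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_likelihood(frames, model):
    """Forward algorithm in the log domain: log P(frames | model)."""
    log_pi, log_a, means, variances = model
    n = len(log_pi)
    alpha = [log_pi[j] + log_gauss(frames[0], means[j], variances[j])
             for j in range(n)]
    for x in frames[1:]:
        alpha = [logsumexp([alpha[i] + log_a[i][j] for i in range(n)])
                 + log_gauss(x, means[j], variances[j])
                 for j in range(n)]
    return logsumexp(alpha)

def recognize(frames, models):
    """Return the word whose model scores the utterance highest."""
    return max(models, key=lambda word: log_likelihood(frames, models[word]))

def left_to_right(means, var=1.0):
    """Hypothetical two-state left-to-right word model (illustrative values)."""
    log_pi = [safe_log(1.0), safe_log(0.0)]
    log_a = [[safe_log(0.6), safe_log(0.4)],
             [safe_log(0.0), safe_log(1.0)]]
    variances = [[var] * len(m) for m in means]
    return log_pi, log_a, means, variances

# Toy reference pattern library: one model per word (assumed values).
models = {
    "one": left_to_right([[0.0, 0.0], [1.0, 1.0]]),
    "two": left_to_right([[3.0, 3.0], [4.0, 4.0]]),
}

# Toy "feature frames" standing in for per-frame MFCC vectors.
utterance = [[0.1, -0.2], [0.2, 0.1], [0.9, 1.1], [1.0, 0.8]]
print(recognize(utterance, models))
```

In a real system each frame would be a full MFCC vector and the model parameters would come from training, but the decision rule stays the same: evaluate every word model and keep the best-scoring one.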
System hardware structure
Figure 2 shows the system hardware block diagram. The core device of the system is the TMS320VC5509 fixed-point DSP from Texas Instruments (TI), which serves both as the speech recognition engine and as the arithmetic unit of the calculator. As the processing unit of the system, the TMS320VC5509 has two multiply-accumulate (MAC) units and four accumulators (ACC), together with one 40-bit ALU and one 16-bit ALU, which greatly enhance its computing power. Its instructions are variable in length, up to 48 bits, and its data word length is 16 bits. A program can be loaded into the TMS320VC5509 through the USB interface without an emulator. These features reduce development cost and circuit board area. The interface circuit between the DSP and the TLV320AIC23 audio codec is shown in Figure 3.
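The benefit of pairing 16-bit data with a wide accumulator can be illustrated with a small simulation. The sketch below assumes Q15 fixed-point data (16-bit signed fractions) and an accumulator 40 bits wide, matching the 40-bit ALU, and compares it with a 32-bit accumulator. It is a simplified model written for illustration only: the function names are hypothetical, and it does not reproduce the DSP's saturation or rounding modes.

```python
Q15_ONE = 1 << 15   # scale factor of the Q15 format
ACC_BITS = 40       # assumed accumulator width

def to_q15(x):
    """Quantize a float in [-1, 1) to a 16-bit Q15 integer with saturation."""
    return max(-Q15_ONE, min(Q15_ONE - 1, int(round(x * Q15_ONE))))

def wrap(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mac_q15(samples, coeffs, acc_bits=ACC_BITS):
    """Accumulate Q15 x Q15 products (Q30) in an accumulator of acc_bits bits."""
    acc = 0
    for s, c in zip(samples, coeffs):
        acc = wrap(acc + s * c, acc_bits)
    return acc

def acc_to_float(acc):
    """Interpret a Q30 accumulator value as a floating-point number."""
    return acc / float(1 << 30)

# Eight products of 0.9 * 0.9 sum to about 6.48, beyond the range of a
# 32-bit Q30 accumulator but well within a 40-bit one.
x = [to_q15(0.9)] * 8
h = [to_q15(0.9)] * 8

wide = mac_q15(x, h, 40)
narrow = mac_q15(x, h, 32)
print(acc_to_float(wide))    # close to 6.48
print(acc_to_float(narrow))  # wrapped, no longer meaningful
```

The extra eight bits act as guard bits, so sums of products such as filter outputs or the inner products used in feature extraction can grow well past the 32-bit range before any scaling is needed.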