Robot speech recognition and natural language understanding algorithm based on deep learning
DOI: 10.25236/icceme.2024.003
Corresponding Author
Xingnuo Wang
Abstract
Over the past few decades, technologies based on speech recognition and natural language understanding have advanced enormously. This paper studies deep learning-based algorithms for robot speech recognition and natural language understanding. The work covers system design, data processing, and model training, with the aim of making speech recognition and natural language understanding both effective and accurate. For feature extraction, Mel-frequency cepstral coefficients (MFCCs) are computed from the speech signals. A deep neural network (DNN), combined with convolutional neural network (CNN) and long short-term memory (LSTM) architectures, is trained on a large training set with the aim of attaining high recognition accuracy. A language model and a Transformer model are used to model the text data and make it easier to interpret. The system architecture comprises a data acquisition module, a signal processing module, a deep learning training module, and a result analysis module. In a series of experiments, the system performs well on several performance indicators, with the Transformer model standing out in accuracy, precision, recall, and F1 score. The results show that deep learning models offer significant advantages for speech recognition and natural language understanding, providing a solid foundation for the development of robot voice interaction technology.
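The abstract names the four evaluation indicators (accuracy, precision, recall, F1) but not how they are aggregated. As an illustration only, the sketch below computes them for a multi-class labelling task. It assumes macro-averaging over classes, which the abstract does not state, and the function name `classification_metrics` and the command labels are hypothetical.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption made for this sketch; the paper's
    abstract does not say how per-class scores are combined.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class was wrong
            fn[truth] += 1      # true class was missed

    def safe_div(num, den):
        # A class with no predictions (or no examples) scores 0.0.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical robot command labels, for illustration only.
example = classification_metrics(
    ["move", "stop", "move", "stop"],
    ["move", "stop", "stop", "stop"],
)
```

Here `example["accuracy"]` is 0.75, since three of the four predictions match. With imbalanced command classes, micro- or weighted averaging would give different precision, recall, and F1 values.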
Keywords
deep learning, speech recognition, natural language understanding, robot interaction, Transformer model