Problem statement
Language barriers pose a challenge in communication, especially in contexts where diverse languages are spoken. Existing speech-to-text systems often specialize in a single language, limiting their applicability in multilingual environments.
Abstract
The Speech-to-Text System for Multiple Languages project addresses language diversity by developing a system that converts spoken words into text across multiple languages. Using advanced natural language processing (NLP) techniques, the system aims to provide accurate, real-time transcription and so support effective communication in multilingual settings.
Outcome
The outcome of this project is a versatile and adaptive speech-to-text system that supports multiple languages. Users can seamlessly convert spoken words into text, facilitating communication across language barriers. The system’s accuracy and flexibility make it valuable in diverse contexts, such as international conferences, language learning platforms, and accessibility tools, contributing to improved cross-cultural communication.
Reference
The current work presents a multilingual speech-to-text conversion system. Conversion is based on the information carried in the speech signal. Speech is the most natural and important form of communication for human beings. A Speech-To-Text (STT) system takes a human speech utterance as input and produces a string of words as output. The objective of this system is to extract, characterize, and recognize the information in speech. The proposed system is implemented using the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction technique, with Minimum Distance Classifier and Support Vector Machine (SVM) methods for speech classification. Speech utterances are pre-recorded and stored in a database, which is divided into two parts: training and testing. Samples from the training part are passed through the training phase, where their features are extracted. The features of each sample are combined into a feature vector, which is stored as a reference. A sample from the testing part is then given to the system and its features are extracted. The similarity between these features and each reference feature vector is computed, and the words with maximum similarity are given as output. The system is developed in the MATLAB (R2010a) environment.
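The train-then-match flow described in the reference can be illustrated with a minimal sketch of a minimum distance classifier. This is not the referenced MATLAB implementation: it is written in Python, it assumes MFCC feature vectors have already been extracted, it assumes each word's reference is the mean of its training vectors, and it assumes Euclidean distance as the (inverse) similarity measure. The function names and the toy three-dimensional feature values are hypothetical.

```python
import math
from collections import defaultdict


def train_references(training_samples):
    """Build one reference vector per word.

    Assumption: the reference is the element-wise mean of all training
    feature vectors recorded for that word.
    """
    grouped = defaultdict(list)
    for word, features in training_samples:
        grouped[word].append(features)
    return {
        word: [sum(column) / len(vectors) for column in zip(*vectors)]
        for word, vectors in grouped.items()
    }


def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, references):
    """Return the word whose reference vector is closest to `features`.

    Minimum distance corresponds to maximum similarity.
    """
    return min(
        references,
        key=lambda word: euclidean_distance(features, references[word]),
    )


# Toy vectors standing in for MFCC features (made-up values, not real data).
training_samples = [
    ("hello", [1.0, 2.0, 3.0]),
    ("hello", [1.2, 1.8, 3.1]),
    ("world", [5.0, 0.5, -1.0]),
    ("world", [4.8, 0.7, -1.2]),
]

references = train_references(training_samples)
predicted = classify([1.1, 2.1, 2.9], references)
print(predicted)  # prints "hello"
```

In a real system the toy vectors would be replaced by MFCC features computed from the recorded utterances, and the SVM method mentioned in the reference would be a separate classifier trained on the same feature vectors.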