Speech recognition system for Turkish language with hybrid method

Main Article Content

Cigdem Bakir

Abstract

Currently, technological developments are accompanied by a number of associated problems. Security takes the first place amongst such problems. In particular, biometric systems such as authentication constitute a significant fraction of the security problem. This is because sound recordings having connection with various crimes are required to be analysed for forensic purposes. Authentication systems necessitate transmission, design and classification of biometric data in a secure manner. The aim of this study is to actualise an automatic voice and speech recognition system using wavelet transform, taking Turkish sound forms and properties into consideration. Approximately 3740 Turkish voice samples of words and clauses of differing lengths were collected from 25 males and 25 females. The features of these voice samples were obtained using Mel-frequency cepstral coefficients (MFCCs), Mel-frequency discrete wavelet coefficients (MFDWCs) and linear prediction cepstral coefficient (LPCC). Feature vectors of the voice samples obtained were trained with k-means, artificial neural network (ANN) and hybrid model. The hybrid model was formed by combining with k-means clustering and ANN. In the first phase of this model, k-means performed subsets obtained with voice feature vectors. In the second phase, a set of training and tests were formed from these sub-clusters. Thus, for being trained more suitable data by clustering increased the accuracy. In the test phase, the owner of a given voice sample was identified by taking the trained voice samples into consideration. The results and performance of the algorithms used for classification are also demonstrated in a comparative manner.


Keywords: Speech recognition, hybrid model, k-means, artificial neural network (ANN).

Downloads

Download data is not yet available.

Article Details

Section
Articles