Speaker Recognition: Evaluation for GMM-UBM and 3D Convolutional Neural Networks Systems
[Thesis]
Alghamdi, Mohammad S.
Boult, Terrance
University of Colorado Colorado Springs
2019
73 p.
M.Eng.
Speaker recognition (SR) systems are more accurate than ever at verifying and identifying the human voice, which is one of the most convenient biometric characteristics of human identity. Research and development on speaker recognition techniques has varied widely over the last decade, with the aim of lessening the effects of challenges such as background noise, poor channel conditions, and crosstalk. In this paper, we evaluate two speaker verification (SV) systems, each of which uses an entirely different method to verify speakers: 1) ALIZE 3.0, an open-source platform for SR that has been successfully evaluated in the NIST Speaker Recognition Evaluations (SREs) and implements the Gaussian mixture model-universal background model (GMM-UBM) speaker modeling approach; and 2) a 3D Convolutional Neural Network (3D-CNN) architecture, which uses a novel neural-network-based method for speaker verification. This paper investigates how challenging it is to implement applications that handle speaker verification tasks.