Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech

Authors

  • Krerksak Likitsupin, Chulalongkorn University
  • Proadpran Punyabukkana, Chulalongkorn University
  • Chai Wutiwiwatchai, National Electronics and Computer Technology Center
  • Atiwong Suchato, Chulalongkorn University

DOI:

https://doi.org/10.4186/ej.2016.20.2.179

Keywords:

Segment-based speech recognition, distinctive features, distinctive features-based speech recognition, speech recognition.

Abstract

Segment-based speech recognition has been shown to be a competitive alternative to state-of-the-art HMM-based techniques. Its accuracy relies heavily on the quality of the segment graph in which the recognizer searches for the most likely recognition hypotheses. To increase the inclusion rate of actual segments in the graph, it is important to recover segments missed by the segment-based segmentation algorithm. One aspect of this research focuses on recovering segments that are missing because segment boundaries went undetected. Acoustic discontinuities, together with manner-distinctive features, are used to recover the missing segments. Another improvement to our segment-based framework addresses the limited amount of training speech data, which prevents the use of more complex covariance matrices in the acoustic models. Dimensionality reduction of the features by Principal Component Analysis (PCA) is applied to enable the training of full covariance matrices, and it results in improved segment-based phoneme recognition. Furthermore, because the segment-based approach allows the integration of phonetic knowledge, we incorporate into the scoring of the segment graphs the probability that each segment belongs to a class of sound units sharing a common manner of articulation. Our experiments show that, with the proposed improvements, our segment-based framework increases phoneme recognition accuracy by approximately 25% relative to the accuracy obtained from the baseline segment-based speech recognizer.
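
The role of PCA in enabling full covariance matrices can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensions (60 reduced to 20), the synthetic data, and all function names are assumptions chosen for illustration. The point is only that a full covariance matrix over d dimensions has d(d+1)/2 free parameters, so reducing d makes it feasible to estimate from limited training data.

```python
import numpy as np

def fit_pca(X, k):
    """Return the data mean and the top-k principal directions (columns)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return mean, eigvecs[:, order]

def project(X, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (X - mean) @ components

def fit_full_gaussian(Z):
    """Estimate the mean and full covariance of one class of segments."""
    return Z.mean(axis=0), np.cov(Z, rowvar=False)

def log_likelihood(z, mu, sigma):
    """Log-density of z under a full-covariance Gaussian."""
    d = mu.shape[0]
    diff = z - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def n_full_cov_params(d):
    """Number of free parameters in a symmetric d x d covariance matrix."""
    return d * (d + 1) // 2

# Synthetic stand-in for segment feature vectors (assumed: 200 segments, 60 dims).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60)) @ rng.normal(size=(60, 60))

# PCA is fitted on all data; the Gaussian is fitted on one hypothetical class.
mean, comps = fit_pca(X, 20)
Z = project(X, mean, comps)
mu, sigma = fit_full_gaussian(Z[:100])

print(n_full_cov_params(60), n_full_cov_params(20))
print(log_likelihood(Z[0], mu, sigma))
```

In this sketch the covariance parameter count per model drops from 1830 to 210, which is the kind of reduction that makes full covariance estimation practical with a small training set.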

Author Biographies

Krerksak Likitsupin

Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand

Proadpran Punyabukkana

Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand

Chai Wutiwiwatchai

Human Language Technology Laboratory, National Electronics and Computer Technology Center, Pathum Thani 12120, Thailand

Atiwong Suchato

Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand

Published

Vol 20 No 2, May 18, 2016

How to Cite

[1]
K. Likitsupin, P. Punyabukkana, C. Wutiwiwatchai, and A. Suchato, “Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech”, Eng. J., vol. 20, no. 2, pp. 179-197, May 2016.