Premananda B.S.; Uma B.V.

Dominant Frequency Enhancement of Speech Signal to Improve Intelligibility and Quality

PP. 29-37

Author(s)

Premananda B.S. 1,*, Uma B.V. 2

1. Department of Telecommunication, R.V. College of Engineering, Bengaluru, India

2. Department of Electronics & Communication, R.V. College of Engineering, Bengaluru, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2015.06.04

Received: 25 Dec. 2014 / Revised: 27 Feb. 2015 / Accepted: 3 Apr. 2015 / Published: 8 May 2015

Index Terms

Dominant, Near-end noise, Psychoacoustics, Speech enhancement, Speech intelligibility, Speech quality

Abstract

In mobile devices, the perceived speech signal deteriorates significantly in the presence of near-end noise, because the signal reaches the listener's ears in a noisy environment. The clarity and quality of the received speech signal therefore need to be increased in noisy environments, which is accomplished by incorporating speech enhancement algorithms at the receiver end. The objective is to improve the intelligibility and quality of the speech signal by dynamically enhancing it when the near-end noise dominates. This paper proposes a speech enhancement approach that incorporates the threshold of hearing and the auditory masking properties of the human ear. Using the masking properties, the speech samples that are audible can be identified. In low-SNR environments, selected audible samples are enhanced to improve the clarity of the signal, rather than enhancing every loud sample. The intelligibility and quality of the enhanced speech signal are measured using the Speech Intelligibility Index (SII) and Perceptual Evaluation of Speech Quality (PESQ). Experimental results indicate that the proposed method improves the intelligibility and quality of the speech signal over the unprocessed far-end speech signal. This approach is efficient in overcoming the deterioration of speech signals in a noisy environment.
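
The two ingredients named in the abstract and title, the threshold of hearing and the dominant frequency, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes Terhardt's widely used closed-form approximation of the threshold in quiet and a plain DFT peak search, and the function names `threshold_in_quiet_db`, `dft_magnitudes` and `dominant_frequency` are hypothetical, chosen for illustration only.

```python
import cmath
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL.

    Assumption: Terhardt's closed-form approximation, not a formula
    taken from the paper.
    """
    k = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, fine for short frames)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC DFT bin in one frame."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])  # skip DC
    return peak_bin * sample_rate_hz / len(samples)


# Illustrative frame: a strong 1 kHz tone plus a weaker 2.5 kHz tone.
fs = 8000
frame = [math.sin(2 * math.pi * 1000 * i / fs)
         + 0.3 * math.sin(2 * math.pi * 2500 * i / fs)
         for i in range(256)]
print(dominant_frequency(frame, fs))
print(round(threshold_in_quiet_db(1000), 2))
```

In an enhancement scheme of the kind the abstract describes, a component found this way would be boosted only if it lies above the hearing and masking thresholds. The masking model and the gain rule are omitted here because the abstract does not specify them.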

Cite This Paper

Premananda B.S., Uma B.V., "Dominant Frequency Enhancement of Speech Signal to Improve Intelligibility and Quality", IJIGSP, vol. 7, no. 6, pp. 29-37, 2015. DOI: 10.5815/ijigsp.2015.06.04

International Journal of Image, Graphics and Signal Processing (IJIGSP)