Low-frequency Compensated Synthetic Impulse Responses for Improved Far-field Speech Recognition


Abstract

We propose a method for generating low-frequency compensated synthetic impulse responses that improve the performance of far-field speech recognition systems trained on artificially augmented datasets. We design linear-phase filters that adapt the simulated impulse responses to equalization distributions corresponding to real-world captured impulse responses. The filtered synthetic impulse responses are then used to augment clean speech data from the LibriSpeech dataset [1]. We evaluate the performance of our method on the real-world LibriSpeech test set. In practice, our low-frequency compensated synthetic dataset can reduce the word error rate by up to 8.8% for far-field speech recognition.
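
The pipeline described in the abstract (design a linear-phase filter, apply it to a synthetic impulse response, then convolve the result with clean speech) can be illustrated with a minimal sketch. This is not the released implementation: the function names, the use of SciPy's `firwin2`, the filter length, and the gain curve in the usage note are all assumptions made for illustration. In the paper, the target equalization is drawn from distributions fitted to real-world impulse responses, which this sketch does not reproduce.

```python
import numpy as np
from scipy.signal import firwin2, fftconvolve


def design_linear_phase_eq(freqs_hz, gains_db, fs, numtaps=255):
    """Design a symmetric (linear-phase) FIR filter from a target gain curve.

    freqs_hz must start at 0 and end at the Nyquist frequency (fs / 2).
    numtaps is kept odd so the gain at Nyquist may be non-zero.
    """
    gains = 10.0 ** (np.asarray(gains_db, dtype=float) / 20.0)  # dB -> linear
    return firwin2(numtaps, freqs_hz, gains, fs=fs)


def compensate_ir(ir, eq_filter):
    """Apply the equalization filter to a synthetic impulse response."""
    return fftconvolve(ir, eq_filter, mode="full")


def augment(speech, ir):
    """Convolve clean speech with an impulse response and peak-normalize."""
    reverberant = fftconvolve(speech, ir, mode="full")[: len(speech)]
    peak = np.max(np.abs(reverberant))
    return reverberant / peak if peak > 0 else reverberant
```

A hypothetical usage, assuming 16 kHz audio and an arbitrary low-frequency gain curve, would be `eq = design_linear_phase_eq([0, 100, 500, 8000], [-12, -6, 0, 0], fs=16000)`, followed by `augment(speech, compensate_ir(ir, eq))`. The filter is linear-phase because its coefficients are symmetric, so it reshapes the magnitude response of the impulse response while adding only a constant delay.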

Paper

Low-frequency Compensated Synthetic Impulse Responses for Improved Far-field Speech Recognition, ICASSP 2020.
Zhenyu Tang, Hsien-Yu Meng, and Dinesh Manocha

@inproceedings{9054454,
  author={Z. {Tang} and H. {Meng} and D. {Manocha}},
  booktitle={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={Low-Frequency Compensated Synthetic Impulse Responses For Improved Far-Field Speech Recognition},
  year={2020},
  pages={6974-6978},
}

Code

Git repo