Abstract
We present an efficient and realistic geometric sound simulation approach for generating and augmenting training data in speech-related machine learning tasks. Our physically based acoustic simulation method can model occlusion as well as specular and diffuse reflections of sound in complex acoustic environments, whereas the classical image method can only model specular reflections in simple room settings. We show that models trained on our synthetic data achieve significant performance improvements on real test sets in both speech recognition and keyword spotting tasks, without any fine-tuning on real data.
Paper
Improving Reverberant Speech Training Using Diffuse Acoustic Simulation, ICASSP 2020.
Zhenyu Tang, Lianwu Chen, Bo Wu, Dong Yu, and Dinesh Manocha
@inproceedings{9052932,
author={Z. {Tang} and L. {Chen} and B. {Wu} and D. {Yu} and D. {Manocha}},
booktitle={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Improving Reverberant Speech Training Using Diffuse Acoustic Simulation},
year={2020},
pages={6969-6973},
}
Data
RIRs generated using geometric sound propagation can be downloaded here.
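A room impulse response (RIR) is typically used for augmentation by convolving it with clean speech to produce a reverberant training utterance. The following is a minimal sketch of that step, not the authors' pipeline: the function name `reverberate`, the 16 kHz toy signals, and the choice to treat the strongest RIR peak as the direct path are all assumptions made for illustration. In practice the speech and RIR would be loaded from audio files at a matching sample rate.

```python
import numpy as np


def reverberate(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve clean speech with an RIR; return a signal as long as the input.

    Hypothetical helper for illustration only.
    """
    # Full linear convolution has len(speech) + len(rir) - 1 samples.
    wet = np.convolve(speech, rir, mode="full")

    # Assumption: the strongest RIR peak is the direct path. Dropping the
    # samples before it keeps the output time-aligned with the clean input.
    onset = int(np.argmax(np.abs(rir)))
    wet = wet[onset:onset + len(speech)]

    # Rescale to the peak level of the clean input to avoid clipping.
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet * (np.max(np.abs(speech)) / peak)
    return wet


# Toy stand-ins (assumed 16 kHz): 1 s of noise as "speech" and a synthetic
# RIR made of a direct-path impulse followed by an exponentially decaying tail.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)

rir = np.zeros(4000)
rir[40] = 1.0  # direct path after a short propagation delay
tail = np.arange(3959)
rir[41:] = 0.1 * rng.standard_normal(3959) * np.exp(-tail / 800.0)

reverberant = reverberate(speech, rir)
```

Trimming at the direct-path onset keeps frame-level labels (for example, keyword alignments) valid for the reverberant copy. If the direct path is occluded and a later reflection is stronger, a different onset estimate would be needed.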