The Athena-RC system for speech activity detection and speaker localization in the DIRHA smart home

Giannoulis, P.; Tsiami, A.; Rodomagoulakis, I.; Katsamanis, A.; Potamianos, G.; Maragos, P.

Abstract

We present our system for speech activity detection and speaker localization inside a smart home with multiple rooms equipped with microphone arrays of known geometry and placement. The smart home was developed as part of the European-funded DIRHA project, which provides both simulated and real data for system development and evaluation under extremely challenging conditions of noise, reverberation, and speech overlap. Our proposed approach first performs speech activity detection by employing multi-microphone decision fusion on traditional statistical models and acoustic features within a Viterbi decoding framework, further assisted by heuristics based on signal-energy and model log-likelihood thresholds. It then performs speaker localization using traditional time-difference-of-arrival estimation between properly selected microphone pairs, further assisted by a dereverberation component. The system achieves very low detection errors, namely less than 4% (5%) for speech activity detection on the simulated (real) DIRHA corpus, and less than 10% (12%) for joint speech detection and speaker localization. © 2014 IEEE.
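The abstract does not say which time-difference-of-arrival estimator the system uses. As a rough illustration of the idea only, the sketch below estimates the delay between one microphone pair with GCC-PHAT, a common choice that is assumed here and not taken from the paper. The function name `gcc_phat_tdoa` and its parameters are hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds.

    Positive values mean `sig` lags `ref`. GCC-PHAT is an assumed
    estimator chosen for illustration, not the method of the paper.
    """
    n = len(sig) + len(ref)  # zero-pad so the correlation is linear, not circular
    spec_sig = np.fft.rfft(sig, n=n)
    spec_ref = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep phase, discard magnitude.
    cross = spec_sig * np.conj(spec_ref)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs
```

For a microphone pair with known spacing, `max_tau` would typically be set to the spacing divided by the speed of sound, so that peaks outside the feasible range are ignored.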

URI

http://hdl.handle.net/11615/27943

Collections

Publications in journals, conferences, book chapters, etc. [19735]