Show simple item record
Room-localized speech activity detection in multi-microphone smart homes
DC Field | Value | Language |
---|---|---|
dc.creator | Giannoulis P., Potamianos G., Maragos P. | en |
dc.date.accessioned | 2023-01-31T07:42:13Z | |
dc.date.available | 2023-01-31T07:42:13Z | |
dc.date.issued | 2019 | |
dc.identifier | 10.1186/s13636-019-0158-8 | |
dc.identifier.issn | 1687-4714 | |
dc.identifier.uri | http://hdl.handle.net/11615/72375 | |
dc.description.abstract | Voice-enabled interaction systems in domestic environments have attracted significant interest recently, being the focus of smart home research projects and commercial voice assistant home devices. Within the multi-module pipelines of such systems, speech activity detection (SAD) constitutes a crucial component, providing input to their activation and speech recognition subsystems. In typical multi-room domestic environments, SAD may also convey spatial intelligence to the interaction, in addition to its traditional temporal segmentation output, by assigning speech activity at the room level. Such room-localized SAD can, for example, disambiguate user command referents, allow localized system feedback, and enable parallel voice interaction sessions by multiple subjects in different rooms. In this paper, we investigate a room-localized SAD system for smart homes equipped with multiple microphones distributed in multiple rooms, significantly extending our earlier work. The system employs a two-stage algorithm, incorporating a set of hand-crafted features specially designed to discriminate room-inside vs. room-outside speech at its second stage, refining SAD hypotheses obtained at its first stage by traditional statistical modeling and acoustic front-end processing. Both algorithmic stages exploit multi-microphone information, combining it at the signal, feature, or decision level. The proposed approach is extensively evaluated on both simulated and real data recorded in a multi-room, multi-microphone smart home, significantly outperforming alternative baselines. Further, it remains robust to reduced microphone setups, while also comparing favorably to deep learning-based alternatives. © 2019, The Author(s). | en |
dc.language.iso | en | en |
dc.source | EURASIP Journal on Audio, Speech, and Music Processing | en |
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85071630220&doi=10.1186%2fs13636-019-0158-8&partnerID=40&md5=57b484838f86afd53c59ec2bf9e83a70 | |
dc.subject | Automation | en |
dc.subject | Deep learning | en |
dc.subject | Intelligent buildings | en |
dc.subject | Microphones | en |
dc.subject | Refining | en |
dc.subject | Speech | en |
dc.subject | Active room selection | en |
dc.subject | Microphone arrays | en |
dc.subject | Multi channel | en |
dc.subject | Smart homes | en |
dc.subject | Speech activity detections | en |
dc.subject | Speech recognition | en |
dc.subject | Springer International Publishing | en |
dc.title | Room-localized speech activity detection in multi-microphone smart homes | en |
dc.type | journalArticle | en |
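The abstract describes a two-stage pipeline: a first stage that produces per-microphone SAD hypotheses, and a second stage that refines them into room-level decisions by combining multi-microphone information at the decision level. As a rough illustration only (not the authors' method; the energy threshold, framing parameters, and majority-vote fusion below are all hypothetical simplifications), such a pipeline could be sketched as:

```python
import numpy as np

def frame_energies(signal, frame_len=400, hop=160):
    """Short-time log-energy per frame (simplified acoustic front end)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        energies[i] = 10.0 * np.log10(np.sum(frame ** 2) + 1e-12)
    return energies

def stage1_sad(signal, threshold_db=-30.0):
    """Stage 1 (hypothetical): per-microphone speech/non-speech decision
    per frame, via a simple energy threshold standing in for the paper's
    statistical modeling."""
    return frame_energies(signal) > threshold_db

def stage2_room_fusion(room_decisions):
    """Stage 2 (hypothetical): decision-level fusion. For each frame,
    compute the fraction of microphones in each room voting 'speech',
    then assign the frame to the room with the strongest support,
    or to no room if support is weak."""
    rooms = sorted(room_decisions)
    # support[r, t] = fraction of room r's microphones voting speech at frame t
    support = np.stack(
        [np.mean(np.stack(room_decisions[r]).astype(float), axis=0) for r in rooms]
    )
    best = np.argmax(support, axis=0)
    speech = support[best, np.arange(support.shape[1])] > 0.5
    return [rooms[b] if s else None for b, s in zip(best, speech)]
```

A caller would run `stage1_sad` on each microphone signal, group the resulting decision arrays by room, and pass that dict to `stage2_room_fusion` to obtain a per-frame active-room label; the paper's actual second stage instead uses hand-crafted features to discriminate room-inside vs. room-outside speech.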
Files in this item
Files | Size | Format | View |
---|---|---|---|
There are no files associated with this item.