Browse by subject "European Signal Processing Conference, EUSIPCO"
Showing items 1-7 of 7
An Audiovisual Child Emotion Recognition System for Child-Robot Interaction Applications
(2021) We present an audiovisual emotion recognition system tailored to child-robot interaction scenarios. Our proposed system is based on deep learning and the Temporal Segment Networks framework, receives input from both the ...
Fingerspelled alphabet sign recognition in upper-body videos
(2019) Fingerspelling is a crucial part of sign-based communication; however, its recognition remains a challenging and mostly overlooked computer vision problem. To address it, this paper presents a system that recognizes the 24 ...
A fully convolutional sequence learning approach for cued speech recognition from videos
(2021) Cued Speech constitutes a sign-based communication variant for the speech and hearing impaired, which involves visual information from lip movements combined with hand positional and gestural cues. In this paper, we consider ...
H-V shadow detection based on electromagnetism-like optimization
(2021) Shadow detection is useful in a variety of image analysis applications, as it can improve scene understanding. Most of the recent shadow detection approaches use near-infrared (NIR) cameras and deep learning to provide ...
Multi-channel non-negative matrix factorization for overlapped acoustic event detection
(2018) In this paper, we propose two multi-channel extensions of non-negative matrix factorization (NMF) for acoustic event detection. The first method performs decision fusion on the activation matrices produced from independent ...
Overlapped Sound Event Classification via Multi-Channel Sound Separation Network
(2021) Overlapped sound event classification (SEC) can be a challenging task, especially in scenarios where the number of possible event classes or the number of simultaneous events occurring (polyphony level) are large. In such ...
Resource-efficient TDNN Architectures for Audio-visual Speech Recognition
(2021) In this paper, we consider the problem of resource-efficient architectures for audio-visual automatic speech recognition (AVSR). Specifically, we complement our earlier work that introduced efficient convolutional neural ...