Browse by Subject "Audio-visual automatic speech recognition"

Items 1-3 of 3

Audio-visual speech recognition incorporating facial depth information captured by the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2012)

We investigate the use of facial depth data of a speaking subject, captured by the Kinect device, as an additional speech-informative modality to incorporate into a traditional audiovisual automatic speech recognizer. We ...

Resource-efficient TDNN Architectures for Audio-visual Speech Recognition

Koumparoulis, A.; Potamianos, G.; Thomas, S.; da Silva Morais, E. (2021)

In this paper, we consider the problem of resource-efficient architectures for audio-visual automatic speech recognition (AVSR). Specifically, we complement our earlier work that introduced efficient convolutional neural ...

Robust multi-modal speech recognition in two languages utilizing video and distance information from the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2013)

We investigate the performance of our audio-visual speech recognition system in both English and Greek under the influence of audio noise. We present the architecture of our recently built system that utilizes information ...