Browsing by Subject "Automatic speech recognition"

Audio-visual speech recognition incorporating facial depth information captured by the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2012)

We investigate the use of facial depth data of a speaking subject, captured by the Kinect device, as an additional speech-informative modality to incorporate into a traditional audio-visual automatic speech recognizer. We ...

Deep View2View Mapping for View-Invariant Lipreading

Koumparoulis A., Potamianos G. (2019)

Recently, visual-only and audio-visual speech recognition have made significant progress thanks to deep-learning-based, trainable visual front-ends (VFEs), with most research focusing on frontal or near-frontal face videos. ...

Detecting audio-visual synchrony using deep neural networks

Marcheret E., Potamianos G., Vopicka J., Goel V. (2015)

In this paper, we address the problem of automatically detecting whether the audio and visual speech modalities in frontal pose videos are synchronous or not. This is of interest in a wide range of applications, for example ...

Resource-efficient TDNN Architectures for Audio-visual Speech Recognition

Koumparoulis A., Potamianos G., Thomas S., da Silva Morais E. (2021)

In this paper, we consider the problem of resource-efficient architectures for audio-visual automatic speech recognition (AVSR). Specifically, we complement our earlier work that introduced efficient convolutional neural ...

Robust multi-modal speech recognition in two languages utilizing video and distance information from the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2013)

We investigate the performance of our audio-visual speech recognition system in both English and Greek under the influence of audio noise. We present the architecture of our recently built system that utilizes information ...