Browsing by Subject "Audio-visual"

Audio-visual speech recognition using depth information from the Kinect in noisy video conditions

Galatas G., Potamianos G., Makedon F. (2012)

In this paper, we build on our recent work, where we successfully incorporated facial depth data of a speaker, captured by the Microsoft Kinect device, as a third data stream in an audio-visual automatic speech recognizer. ...

ChildBot: Multi-robot perception and interaction with children

Efthymiou N., Filntisis P.P., Koutras P., Tsiami A., Hadfield J., Potamianos G., Maragos P. (2022)

In this paper, we present an integrated robotic system capable of participating in and performing a wide range of educational and entertainment tasks in collaboration with one or more children. The system, called ChildBot, ...

Detecting audio-visual synchrony using deep neural networks

Marcheret E., Potamianos G., Vopicka J., Goel V. (2015)

In this paper, we address the problem of automatically detecting whether the audio and visual speech modalities in frontal pose videos are synchronous or not. This is of interest in a wide range of applications, for example ...

Resource-efficient TDNN Architectures for Audio-visual Speech Recognition

Koumparoulis A., Potamianos G., Thomas S., da Silva Morais E. (2021)

In this paper, we consider the problem of resource-efficient architectures for audio-visual automatic speech recognition (AVSR). Specifically, we complement our earlier work that introduced efficient convolutional neural ...

Scattering vs. Discrete Cosine Transform Features in Visual Speech Processing

Marcheret E., Potamianos G., Vopicka J., Goel V. (2015)

Appearance-based feature extraction constitutes the dominant approach for visual speech representation in a variety of problems, such as automatic speechreading, visual speech detection, and others. To obtain the necessary ...