Browse by Author "Potamianos, G."

Advances in Large Vocabulary Continuous Speech Recognition in Greek: Modeling and nonlinear features

Rodomagoulakis, I.; Potamianos, G.; Maragos, P. (2013)

The main goal of this work is the development of an improved Large Vocabulary Continuous Speech Recognition (LVCSR) framework in Greek. Language modeling is carried out in a collection of journalistic text and in the ...

The Athena-RC system for speech activity detection and speaker localization in the DIRHA smart home

Giannoulis, P.; Tsiami, A.; Rodomagoulakis, I.; Katsamanis, A.; Potamianos, G.; Maragos, P. (2014)

We present our system for speech activity detection and speaker localization inside a smart home with multiple rooms equipped with microphone arrays of known geometry and placement. The smart home is developed as part of ...

ATHENA: A Greek multi-sensory database for home automation control

Tsiami, A.; Rodomagoulakis, I.; Giannoulis, P.; Katsamanis, A.; Potamianos, G.; Maragos, P. (2014)

In this paper we present a Greek speech database with real multi-modal data in a smart home two-room environment. In total, 20 speakers were recorded in 240 one-minute long sessions. The recordings include utterances of ...

Audio-visual speech recognition incorporating facial depth information captured by the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2012)

We investigate the use of facial depth data of a speaking subject, captured by the Kinect device, as an additional speech-informative modality to incorporate into a traditional audiovisual automatic speech recognizer. We ...

Audio-visual speech recognition using depth information from the Kinect in noisy video conditions

Galatas, G.; Potamianos, G.; Makedon, F. (2012)

In this paper we build on our recent work, where we successfully incorporated facial depth data of a speaker captured by the Microsoft Kinect device, as a third data stream in an audio-visual automatic speech recognizer. ...

Database and baseline system for detecting degraded traffic signs in urban environments

Floros, G.; Kyritsis, K.; Potamianos, G. (2015)

We present a small database of 'noisy' traffic signs in cluttered urban environments that exhibit various forms of degradation, including vandalism and fading (discoloration). The database contains five types of international ...

Experiments in acoustic source localization using sparse arrays in adverse indoors environments

Tsiami, A.; Katsamanis, A.; Maragos, P.; Potamianos, G. (2014)

In this paper we experiment with 2-D source localization in smart homes under adverse conditions using sparse distributed microphone arrays. We propose some improvements to deal with problems due to high reverberation, ...

Experiments on far-field multichannel speech processing in smart homes

Rodomagoulakis, I.; Giannoulis, P.; Skordilis, Z. I.; Maragos, P.; Potamianos, G. (2013)

In this paper, we examine three problems that arise in the modern, challenging area of far-field speech processing. The developed methods for each problem, namely (a) multichannel speech enhancement, (b) voice activity ...

Multi-microphone fusion for detection of speech and acoustic events in smart spaces

Giannoulis, P.; Potamianos, G.; Katsamanis, A.; Maragos, P. (2014)

In this paper, we examine the challenging problem of detecting acoustic events and voice activity in smart indoors environments, equipped with multiple microphones. In particular, we focus on channel combination strategies, ...

Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing

Katsamanis, A.; Rodomagoulakis, I.; Potamianos, G.; Maragos, P.; Tsiami, A. (2014)

The paper presents our approach to speech-controlled home automation. We are focusing on the detection and recognition of spoken commands preceded by a key-phrase as recorded in a voice-enabled apartment by a set of multiple ...

Robust multi-modal speech recognition in two languages utilizing video and distance information from the Kinect

Galatas, G.; Potamianos, G.; Makedon, F. (2013)

We investigate the performance of our audio-visual speech recognition system in both English and Greek under the influence of audio noise. We present the architecture of our recently built system that utilizes information ...