  • Institutional Repository of the University of Thessaly
  • Scientific Publications of University of Thessaly Members (ΕΔΠΘ)
  • Publications in journals, conferences, book chapters, etc.

A fully convolutional sequence learning approach for cued speech recognition from videos

Author
Papadimitriou K., Potamianos G.
Date
2021
Language
en
DOI
10.23919/Eusipco47968.2020.9287365
Keyword
Audition
Convolution
Convolutional neural networks
Decoding
Deep learning
Signal processing
Speech
Speech communication
Visual communication
Automatic recognition
Block structures
British English
Convolutional decoders
Convolutional encoders
Hearing impaired
Sequence learning
Visual information
Speech recognition
European Signal Processing Conference, EUSIPCO
Abstract
Cued Speech constitutes a sign-based communication variant for the speech and hearing impaired, which involves visual information from lip movements combined with hand positional and gestural cues. In this paper, we consider its automatic recognition in videos, introducing a deep sequence learning approach that consists of two separately trained components: an image learner based on convolutional neural networks (CNNs) and a fully convolutional encoder-decoder. Specifically, handshape and lip visual features extracted from a 3D-CNN feature learner, as well as hand position embeddings obtained by a 2D-CNN, are concatenated and fed to a time-depth separable (TDS) block structure, followed by a multi-step attention-based convolutional decoder for phoneme prediction. To our knowledge, this is the first work where recognition of cued speech is addressed using a common modeling approach based entirely on CNNs. The introduced model is evaluated on a French and a British English cued speech dataset in terms of phoneme error rate, and it is shown to significantly outperform alternative modeling approaches. © 2021 European Signal Processing Conference, EUSIPCO. All rights reserved.
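The abstract reports results as phoneme error rate (PER) but does not spell out the scoring protocol. The following is a minimal sketch of the conventional definition, assuming PER is the Levenshtein (edit) distance between predicted and reference phoneme sequences, summed over utterances and divided by the total number of reference phonemes. The function names and the toy phoneme strings are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] is the distance between the reference prefix processed so far
    # and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference phonemes into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference phoneme
                curr[j - 1] + 1,         # insertion of a hypothesis phoneme
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level PER: total edit errors / total reference phonemes."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of utterances")
    total_phonemes = sum(len(r) for r in refs)
    if total_phonemes == 0:
        raise ValueError("reference transcripts contain no phonemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_phonemes


# Toy example (hypothetical phoneme labels): one deletion over 7 reference phonemes.
refs = [["b", "o", "n"], ["s", "a", "l", "y"]]
hyps = [["b", "o", "n"], ["s", "a", "y"]]
per = phoneme_error_rate(refs, hyps)
```

Pooling errors over the whole test set, rather than averaging per-utterance rates, keeps short utterances from dominating the score. Whether the authors pooled or averaged is not stated in the abstract.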
URI
http://hdl.handle.net/11615/77585
Collections
  • Publications in journals, conferences, book chapters, etc. [19735]

Related items

Showing items related by title, author, creator and subject.

  • Multimodal fusion and sequence learning for cued speech recognition from videos

    Papadimitriou K., Parelli M., Sapountzaki G., Pavlakos G., Maragos P., Potamianos G. (2021)
    Cued Speech (CS) constitutes a non-vocal mode of communication that relies on lip movements in conjunction with hand positional and gestural cues, in order to disambiguate phonetic information and make it accessible to the ...
  • Spatio-temporal graph convolutional networks for continuous sign language recognition

    Parelli M., Papadimitriou K., Potamianos G., Pavlakos G., Maragos P. (2022)
    We address the challenging problem of continuous sign language recognition (CSLR) from RGB videos, proposing a novel deep-learning framework that employs spatio-temporal graph convolutional networks (ST-GCNs), which operate ...
  • Look-behind fully convolutional neural network for computer-aided endoscopy

    Diamantis D.E., Iakovidis D.K., Koulaouzidis A. (2019)
    In this paper, we propose a novel Fully Convolutional Neural Network (FCN) architecture aiming to aid the detection of abnormalities, such as polyps, ulcers and blood, in gastrointestinal (GI) endoscopy images. The proposed ...