Multimodal sign language recognition via temporal deformable convolutional sequence learning

Papadimitriou K., Potamianos G.

dc.creator	Papadimitriou K., Potamianos G.	en
dc.date.accessioned	2023-01-31T09:42:21Z
dc.date.available	2023-01-31T09:42:21Z
dc.date.issued	2020
dc.identifier	10.21437/Interspeech.2020-2691
dc.identifier.issn	2308-457X
dc.identifier.uri	http://hdl.handle.net/11615/77586
dc.description.abstract	In this paper we address the challenging problem of sign language recognition (SLR) from videos, introducing an end-to-end deep learning approach that relies on the fusion of a number of spatio-temporal feature streams, as well as a fully convolutional encoder-decoder for prediction. Specifically, we examine the contribution of optical flow, human skeletal features, as well as appearance features of handshapes and mouthing, in conjunction with a temporal deformable convolutional attention-based encoder-decoder for SLR. To our knowledge, this is the first use in this task of a fully convolutional multi-step attention-based encoder-decoder employing temporal deformable convolutional block structures. We conduct experiments on three sign language datasets and compare our approach to existing state-of-the-art SLR methods, demonstrating its superiority. © 2020 ISCA	en
dc.language.iso	en	en
dc.source	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	en
dc.source.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85098115814&doi=10.21437%2fInterspeech.2020-2691&partnerID=40&md5=e02589f0f9e04137218118f5947adef4
dc.subject	Computer hardware description languages	en
dc.subject	Decoding	en
dc.subject	Deep learning	en
dc.subject	Deformation	en
dc.subject	Optical flows	en
dc.subject	Signal encoding	en
dc.subject	Speech communication	en
dc.subject	Block structures	en
dc.subject	Convolutional encoders	en
dc.subject	Encoder-decoder	en
dc.subject	Learning approach	en
dc.subject	Sequence learning	en
dc.subject	Sign Language recognition	en
dc.subject	Spatio temporal features	en
dc.subject	State of the art	en
dc.subject	Convolution	en
dc.subject	International Speech Communication Association	en
dc.title	Multimodal sign language recognition via temporal deformable convolutional sequence learning	en
dc.type	conferenceItem	en

This item appears in the following Collection(s)

Publications in journals, conferences, book chapters, etc. [19705]
