
dc.creator: Thermos S., Daras P., Potamianos G.
dc.date.accessioned: 2023-01-31T10:08:11Z
dc.date.available: 2023-01-31T10:08:11Z
dc.date.issued: 2020
dc.identifier: 10.1109/ICASSP40776.2020.9054167
dc.identifier.isbn: 9781509066315
dc.identifier.issn: 1520-6149
dc.identifier.uri: http://hdl.handle.net/11615/79695
dc.description.abstract: Learning to understand and infer object functionalities is an important step towards robust visual intelligence. Significant research efforts have recently focused on segmenting the object parts that enable specific types of human-object interaction, the so-called object affordances. However, most works treat this as a static semantic segmentation problem, focusing solely on object appearance and relying on strong supervision and object detection. In this paper, we propose a novel approach that exploits the spatio-temporal nature of human-object interaction for affordance segmentation. In particular, we design an autoencoder that is trained using ground-truth labels of only the last frame of the sequence, and is able to infer pixel-wise affordance labels in both videos and static images. Our model obviates the need for object labels and bounding boxes by using a soft-attention mechanism that enables the implicit localization of the interaction hotspot. For evaluation purposes, we introduce the SOR3D-AFF corpus, which consists of human-object interaction sequences and supports 9 types of affordances in terms of pixel-wise annotation, covering typical manipulations of tool-like objects. We show that our model achieves competitive results compared to strongly supervised methods on SOR3D-AFF, while being able to predict affordances for similar unseen objects in two affordance image-only datasets. © 2020 IEEE.
dc.language.iso: en
dc.source: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089238483&doi=10.1109%2fICASSP40776.2020.9054167&partnerID=40&md5=8daf6328e04fe130a2ed5a45b05118e7
dc.subject: Audio signal processing
dc.subject: Image segmentation
dc.subject: Object detection
dc.subject: Pixels
dc.subject: Semantics
dc.subject: Speech communication
dc.subject: Attention mechanisms
dc.subject: Human-object interaction
dc.subject: Learning approach
dc.subject: Object appearance
dc.subject: Research efforts
dc.subject: Static semantics
dc.subject: Supervised methods
dc.subject: Visual intelligence
dc.subject: Deep learning
dc.subject: Institute of Electrical and Electronics Engineers Inc.
dc.title: A Deep Learning Approach to Object Affordance Segmentation
dc.type: conferenceItem
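
The soft-attention mechanism mentioned in the abstract can be sketched generically: a score is computed per spatial location of a feature map, a softmax turns the scores into a spatial weighting, and the weighted sum pools the features, implicitly localizing a hotspot without bounding boxes. The following is a minimal NumPy sketch under those assumptions; the function names and the learned score vector `w` are hypothetical, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(features, w):
    # features: C x H x W feature map; w: hypothetical learned
    # C-dim score vector (stands in for the model's scoring layer).
    c, h, wd = features.shape
    flat = features.reshape(c, h * wd)   # C x (H*W)
    scores = w @ flat                    # one score per spatial location
    weights = softmax(scores)            # spatial attention map, sums to 1
    attended = flat @ weights            # C-dim attended descriptor
    return attended, weights.reshape(h, wd)

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
w = rng.standard_normal(8)
vec, attn = soft_attention(feats, w)
```

Because the attention map is normalized and non-negative, its peak can be read off as the implicitly localized interaction hotspot, which is how such mechanisms sidestep explicit object detection.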


Files in this item

There are no files associated with this item.
