Sentiment analysis from sound spectrograms via soft BOVW and temporal structure modelling

Pikramenos G., Smyrnis G., Vernikos I., Konidaris T., Spyrou E., Perantonis S.

dc.creator	Pikramenos G., Smyrnis G., Vernikos I., Konidaris T., Spyrou E., Perantonis S.	en
dc.date.accessioned	2023-01-31T09:50:07Z
dc.date.available	2023-01-31T09:50:07Z
dc.date.issued	2020
dc.identifier.isbn	9789897583971
dc.identifier.uri	http://hdl.handle.net/11615/78219
dc.description.abstract	Monitoring and analysis of human sentiments is currently one of the most active research topics in human-computer interaction, with many applications. However, to be practical in daily life, sentiment recognition techniques should analyze data collected unobtrusively. For this reason, analyzing audio signals of human speech (as opposed to, say, biometrics) is considered key to potential emotion recognition systems. In this work, we expand upon previous efforts to analyze speech signals using computer vision techniques on their spectrograms. In particular, we compute ORB descriptors at keypoints distributed on a regular grid over the spectrogram to obtain an intermediate representation. First, a technique similar to Bag-of-Visual-Words (BoVW) is used: a visual vocabulary is created by clustering keypoint descriptors, but a soft candidacy score, rather than a hard assignment, is used to construct the histogram descriptors of the signal. Second, a technique that takes into account the temporal structure of the spectrograms is examined, allowing for effective model regularization. Both techniques are evaluated on several popular emotion recognition datasets, with results indicating an improvement over the simple BoVW method. Copyright © 2020 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.	en
dc.language.iso	en	en
dc.source	ICPRAM 2020 - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods	en
dc.source.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85082991209&partnerID=40&md5=f3d5bcafa39a4b500acb0d517fc492f0
dc.subject	Audio systems	en
dc.subject	Human computer interaction	en
dc.subject	Sentiment analysis	en
dc.subject	Spectrographs	en
dc.subject	Speech analysis	en
dc.subject	Speech communication	en
dc.subject	Bag-of-visual-words	en
dc.subject	Computer vision techniques	en
dc.subject	Emotion recognition	en
dc.subject	Intermediate representations	en
dc.subject	Monitoring and analysis	en
dc.subject	Research topics	en
dc.subject	Temporal structures	en
dc.subject	Visual vocabularies	en
dc.subject	Speech recognition	en
dc.subject	SciTePress	en
dc.title	Sentiment analysis from sound spectrograms via soft BOVW and temporal structure modelling	en
dc.type	conferenceItem	en
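
The abstract describes replacing the hard word assignment of BoVW with a soft candidacy score but does not give the formula. The following is a minimal sketch of one common way to do this, assuming a softmax over negative squared Euclidean distances to the vocabulary centers. The function name `soft_bovw_histogram` and the sharpness parameter `beta` are hypothetical and not taken from the paper; real ORB descriptors are binary, so an actual implementation might well use Hamming distance instead.

```python
import math


def soft_bovw_histogram(descriptors, vocabulary, beta=1.0):
    """Return a normalized soft-assignment histogram over visual words.

    descriptors: list of equal-length numeric vectors (one per keypoint).
    vocabulary:  list of cluster centers (the visual words).
    beta:        assumed sharpness parameter; larger values approach
                 the hard assignment of plain BoVW.
    """
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every word.
        dists = [sum((a - b) ** 2 for a, b in zip(d, w)) for w in vocabulary]
        # Softmax over negative distances; shifting by the minimum
        # keeps the exponentials numerically stable.
        m = min(dists)
        weights = [math.exp(-beta * (x - m)) for x in dists]
        total = sum(weights)
        # Each descriptor spreads one unit of mass across all words.
        for k, wgt in enumerate(weights):
            hist[k] += wgt / total
    n = len(descriptors)
    # Divide by the number of descriptors so the histogram sums to 1.
    return [h / n for h in hist] if n else hist


if __name__ == "__main__":
    vocabulary = [[0.0, 0.0], [1.0, 1.0]]
    descriptors = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
    print(soft_bovw_histogram(descriptors, vocabulary))
```

With a soft score, a descriptor lying between two visual words contributes to both bins instead of being forced into one, which is the usual motivation for this kind of assignment.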

This document appears in the following collection(s)

Publications in journals, conferences, book chapters, etc. [19705]
