
dc.creator: Pikramenos G., Smyrnis G., Vernikos I., Konidaris T., Spyrou E., Perantonis S. (en)
dc.date.accessioned: 2023-01-31T09:50:07Z
dc.date.available: 2023-01-31T09:50:07Z
dc.date.issued: 2020
dc.identifier.isbn: 9789897583971
dc.identifier.uri: http://hdl.handle.net/11615/78219
dc.description.abstract: Monitoring and analysis of human sentiments is currently one of the hottest research topics in the field of human-computer interaction, having many applications. However, in order to become practical in daily life, sentiment recognition techniques should analyze data collected in an unobtrusive way. For this reason, analyzing audio signals of human speech (as opposed to say biometrics) is considered key to potential emotion recognition systems. In this work, we expand upon previous efforts to analyze speech signals using computer vision techniques on their spectrograms. In particular, we utilize ORB descriptors on keypoints distributed on a regular grid over the spectrogram to obtain an intermediate representation. Firstly, a technique similar to Bag-of-Visual-Words (BoVW) is used, where a visual vocabulary is created by clustering keypoint descriptors, but instead a soft candidacy score is used to construct the histogram descriptors of the signal. Furthermore, a technique which takes into account the temporal structure of the spectrograms is examined, allowing for effective model regularization. Both of these techniques are evaluated in several popular emotion recognition datasets, with results indicating an improvement over the simple BoVW method. Copyright © 2020 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved. (en)
dc.language.iso: en (en)
dc.source: ICPRAM 2020 - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods (en)
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85082991209&partnerID=40&md5=f3d5bcafa39a4b500acb0d517fc492f0
dc.subject: Audio systems (en)
dc.subject: Human computer interaction (en)
dc.subject: Sentiment analysis (en)
dc.subject: Spectrographs (en)
dc.subject: Speech analysis (en)
dc.subject: Speech communication (en)
dc.subject: Bag-of-visual-words (en)
dc.subject: Computer vision techniques (en)
dc.subject: Emotion recognition (en)
dc.subject: Intermediate representations (en)
dc.subject: Monitoring and analysis (en)
dc.subject: Research topics (en)
dc.subject: Temporal structures (en)
dc.subject: Visual vocabularies (en)
dc.subject: Speech recognition (en)
dc.subject: SciTePress (en)
dc.title: Sentiment analysis from sound spectrograms via soft BOVW and temporal structure modelling (en)
dc.type: conferenceItem (en)
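
The abstract mentions building histogram descriptors from a soft score rather than hard BoVW assignment. The record does not give the formula, so the following is only an illustrative sketch under stated assumptions: the function name `soft_bovw_histogram`, the `temperature` parameter, the softmax over negative squared Euclidean distances, and the use of real-valued descriptors (ORB descriptors are binary and would normally be compared with Hamming distance) are all hypothetical choices, not the authors' method.

```python
import numpy as np

def soft_bovw_histogram(descriptors, vocabulary, temperature=1.0):
    """Soft-assignment bag-of-visual-words histogram (illustrative only).

    descriptors: (n, d) array of keypoint descriptors from one spectrogram.
    vocabulary:  (k, d) array of visual words (e.g. cluster centres).
    temperature: hypothetical smoothing parameter; larger values spread
                 each descriptor's weight over more visual words.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    vocabulary = np.asarray(vocabulary, dtype=float)

    # Squared Euclidean distance from every descriptor to every word: (n, k).
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    # Assumed soft score: softmax over negative distances, per descriptor.
    logits = -sq_dists / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Accumulate the soft votes and normalise to a unit-sum histogram.
    hist = weights.sum(axis=0)
    return hist / hist.sum()
```

With hard assignment, each descriptor would add a single vote to its nearest visual word; here each descriptor distributes one unit of weight across all words, so descriptors lying between two clusters contribute to both.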


Files in this item

There are no files associated with this item.

This item appears in the following collection(s)
