dc.creator | Akritidis L., Alamaniotis M., Fevgas A., Tsompanopoulou P., Bozanis P. | en |
dc.date.accessioned | 2023-01-31T07:30:36Z | |
dc.date.available | 2023-01-31T07:30:36Z | |
dc.date.issued | 2022 | |
dc.identifier | 10.1142/S0218213022500348 | |
dc.identifier.issn | 02182130 | |
dc.identifier.uri | http://hdl.handle.net/11615/70351 | |
dc.description.abstract | This paper focuses on the popular problem of short text clustering. Since short text documents typically exhibit high degrees of data sparseness and dimensionality, the problem is generally considered more challenging than traditional clustering scenarios. Our proposed solution, named VEPH, is based on a novel algorithm that was published recently with the aim of optimally clustering short text documents. VEPH includes two stages. During the first stage, the original text vectors are projected onto a lower dimensional space, and the documents whose projection vectors lie on the same dimensional space are grouped in the same cluster. The second stage is a refinement process that attempts to improve the quality of the clusters generated during the first stage. The quality of a cluster is determined by its homogeneity and completeness, and these are the two primary design criteria of this stage. Initially, VEPH cleanses the clusters by removing all dissimilar elements, and then it iteratively merges similar clusters in a hierarchical agglomerative manner. The proposed algorithm has been experimentally evaluated in terms of F1 and NMI on three datasets with diverse attributes. The results demonstrated its superiority over other state-of-the-art works in the relevant literature. © 2022 World Scientific Publishing Company. | en |
dc.language.iso | en | en |
dc.source | International Journal on Artificial Intelligence Tools | en |
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85136201674&doi=10.1142%2fS0218213022500348&partnerID=40&md5=925e43ceb1900432afdde91235ebb3ba | |
dc.subject | Cluster analysis | en |
dc.subject | Iterative methods | en |
dc.subject | Clusterings | en |
dc.subject | Data dimensionality | en |
dc.subject | Data sparseness | en |
dc.subject | Feature learning | en |
dc.subject | Machine-learning | en |
dc.subject | Short text clustering | en |
dc.subject | Short texts | en |
dc.subject | Short-text documents | en |
dc.subject | Text Clustering | en |
dc.subject | Traditional clustering | en |
dc.subject | Vector spaces | en |
dc.subject | World Scientific | en |
dc.title | Improving Hierarchical Short Text Clustering through Dominant Feature Learning | en |
dc.type | journalArticle | en |