
dc.creator: Akritidis L., Fevgas A., Bozanis P., Makris C. [en]
dc.date.accessioned: 2023-01-31T07:30:38Z
dc.date.available: 2023-01-31T07:30:38Z
dc.date.issued: 2020
dc.identifier: 10.1007/s10462-020-09807-8
dc.identifier.issn: 02692821
dc.identifier.uri: http://hdl.handle.net/11615/70358
dc.description.abstract: The continuous growth of the e-commerce industry has rendered the problem of product retrieval particularly important. As more enterprises move their activities on the Web, the volume and the diversity of the product-related information increase quickly. These factors make it difficult for the users to identify and compare the features of their desired products. Recent studies proved that the standard similarity metrics cannot effectively identify identical products, since similar titles often refer to different products and vice-versa. Other studies employ external data sources to enrich the titles; these solutions are rather impractical, since the process of fetching external data is inefficient. In this paper we introduce UPM, an unsupervised algorithm for matching products by their titles that is independent of any external sources. UPM consists of three stages. During the first stage, the algorithm analyzes the titles and extracts combinations of words out of them. These combinations are evaluated in stage 2 according to several criteria, and the most appropriate of them are selected to form the initial clusters. The third phase is a post-processing verification stage that refines the initial clusters by correcting the erroneous matches. This stage is designed to operate in combination with all clustering approaches, especially when the data possess properties that prevent the co-existence of two data points within the same cluster. The experimental evaluation of UPM with multiple datasets demonstrates its superiority against the state-of-the-art clustering approaches and string similarity metrics, in terms of both efficiency and effectiveness. © 2020, Springer Nature B.V. [en]
dc.language.iso: en [en]
dc.source: Artificial Intelligence Review [en]
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85079724922&doi=10.1007%2fs10462-020-09807-8&partnerID=40&md5=d3e8dd6619dbd3e36dfea8185d71579e
dc.subject: Artificial intelligence [en]
dc.subject: Unsupervised learning [en]
dc.subject: Clustering [en]
dc.subject: Clustering approach [en]
dc.subject: Entity matching [en]
dc.subject: Entity resolutions [en]
dc.subject: Experimental evaluation [en]
dc.subject: External data sources [en]
dc.subject: Product matching [en]
dc.subject: Unsupervised algorithms [en]
dc.subject: Data mining [en]
dc.subject: Springer [en]
dc.title: A self-verifying clustering approach to unsupervised matching of product titles [en]
dc.type: journalArticle [en]


Files in this item

There are no files associated with this item.

This item appears in the following collections
