
dc.creator | Akritidis L., Bozanis P. | en
dc.date.accessioned | 2023-01-31T07:30:37Z
dc.date.available | 2023-01-31T07:30:37Z
dc.date.issued | 2018
dc.identifier | 10.1109/INISTA.2018.8466294
dc.identifier.isbn | 9781538651506
dc.identifier.uri | http://hdl.handle.net/11615/70352
dc.description.abstract | The problem of matching product titles is of particular interest to both users and marketers. The former frequently search the Web with the aim of comparing prices and characteristics, or obtaining and aggregating information provided by other users. The latter often require wide knowledge of competitive policies, prices and features to organize a promotional campaign about a group of products. To address this problem, recent studies have attempted to enrich the product titles by exploiting Web search engines. More specifically, these methods suggest that a query should be submitted for each product title. After the results have been collected, the most important words that appear in the results are identified and appended to the titles. Subsequently, each word is assigned an importance score and, finally, a similarity measure is applied to identify whether two or more titles refer to the same product. Nonetheless, these methods have multiple problems, including scalability, slow retrieval of the required additional search results, and lack of flexibility. In this paper, we present a different approach which addresses all these issues and is based on the morphological analysis of the titles of the products. In particular, our method operates in two phases. In the first phase, we compute the combinations of the words of the titles and record several statistics such as word proximity and frequency values. In the second phase, we use this information to assign a score to each combination. The highest-scoring combination is then declared the label of the cluster which contains each product. The experimental evaluation of the algorithm on a real-world dataset demonstrated that, compared to three popular string similarity metrics, our approach achieves up to 36% better matching performance and at least 13 times faster execution. © 2018 IEEE. | en
dc.language.iso | en | en
dc.source | 2018 IEEE (SMC) International Conference on Innovations in Intelligent Systems and Applications, INISTA 2018 | en
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85055473467&doi=10.1109%2fINISTA.2018.8466294&partnerID=40&md5=717826b6aaecb27ecdcccc14e8eea38d
dc.subject | Algorithms | en
dc.subject | Costs | en
dc.subject | Intelligent systems | en
dc.subject | Search engines | en
dc.subject | Unsupervised learning | en
dc.subject | Entity matching | en
dc.subject | Experimental evaluation | en
dc.subject | Matching performance | en
dc.subject | Morphological analysis | en
dc.subject | products matching | en
dc.subject | Promotional campaign | en
dc.subject | Similarity measure | en
dc.subject | String similarity | en
dc.subject | Data mining | en
dc.subject | Institute of Electrical and Electronics Engineers Inc. | en
dc.title | Effective Unsupervised Matching of Product Titles with k-Combinations and Permutations | en
dc.type | conferenceItem | en
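
The two-phase idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `max_k` limit, the gap-based proximity statistic, and the scoring formula are all assumptions chosen for illustration, since the record does not give the paper's actual scoring function.

```python
from collections import Counter, defaultdict
from itertools import combinations


def title_combinations(title, max_k=3):
    """Yield (combination, gap) for every k-combination of the title's words.

    `gap` counts the words skipped between the first and last chosen word,
    so 0 means the chosen words are adjacent (an assumed proximity measure).
    """
    tokens = list(dict.fromkeys(title.lower().split()))  # dedupe, keep order
    for k in range(1, min(max_k, len(tokens)) + 1):
        for idxs in combinations(range(len(tokens)), k):
            gap = idxs[-1] - idxs[0] + 1 - k
            yield tuple(tokens[i] for i in idxs), gap


def cluster_titles(titles, max_k=3):
    # Phase 1: collect frequency and proximity statistics per combination.
    freq = Counter()
    gap_sum = Counter()
    for title in titles:
        for combo, gap in title_combinations(title, max_k):
            freq[combo] += 1
            gap_sum[combo] += gap

    # Phase 2: score each combination (hypothetical formula: longer, more
    # frequent, more tightly packed combinations score higher).
    def score(combo):
        avg_gap = gap_sum[combo] / freq[combo]
        return len(combo) ** 2 * freq[combo] / (1 + avg_gap)

    # The highest-scoring combination of a title becomes its cluster label.
    clusters = defaultdict(list)
    for title in titles:
        candidates = [combo for combo, _ in title_combinations(title, max_k)]
        if not candidates:  # skip empty titles
            continue
        best = max(candidates, key=lambda c: (score(c), c))  # ties by combo
        clusters[" ".join(best)].append(title)
    return dict(clusters)


if __name__ == "__main__":
    sample = [
        "Apple iPhone 8 64GB Gold",
        "apple iphone 8 64gb smartphone",
        "Samsung Galaxy S9 black",
        "samsung galaxy s9 64GB dual sim",
    ]
    for label, members in cluster_titles(sample).items():
        print(label, "->", members)
```

Because every title is labeled in a single pass over precomputed statistics, this kind of approach avoids pairwise string comparisons between titles, which is consistent with the speed advantage the abstract reports over string similarity metrics.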


Files in this item

There are no files in this item.

This item appears in the following collections
