Show simple item record

dc.creator: Moutafis P., García-García F., Mavrommatis G., Vassilakopoulos M., Corral A., Iribarne L. [en]
dc.date.accessioned: 2023-01-31T09:02:13Z
dc.date.available: 2023-01-31T09:02:13Z
dc.date.issued: 2021
dc.identifier: 10.1007/s10619-020-07317-8
dc.identifier.issn: 09268782
dc.identifier.uri: http://hdl.handle.net/11615/76815
dc.description.abstract: Given two datasets of points (called Query and Training), the Group K Nearest-Neighbor (GKNN) query retrieves the K points of the Training dataset with the smallest sum of distances to every point of the Query dataset. This spatial query has been studied in recent years, and several performance-improving techniques and pruning heuristics have been proposed. In previous work, we presented the first MapReduce algorithm, consisting of alternating local and parallel phases, which can be used to effectively process the GKNN query when the Query dataset fits in memory while the Training dataset belongs to the Big Data category. In this paper, we present a significantly improved algorithm that incorporates a new high-performance refining method, a fast way to calculate distance sums for pruning purposes, and several other minor coding and algorithmic improvements. Moreover, we transform this algorithm (which has been implemented in the Hadoop framework) to SpatialHadoop (a popular distributed framework that is dedicated to spatial processing), using a novel two-level partitioning method. Using real-world and synthetic datasets, we also present a thorough experimental study of the Hadoop and SpatialHadoop versions of the algorithm, including a backstage analysis of the algorithm’s performance, using metrics that highlight its internal functioning. Finally, we present an experimental comparison of the Hadoop version, the SpatialHadoop version, and the version from our previous work, showing that both improved versions clearly outperform the previous one, with the SpatialHadoop version being faster than its Hadoop counterpart. © 2020, Springer Science+Business Media, LLC, part of Springer Nature. [en]
dc.language.iso: en [en]
dc.source: Distributed and Parallel Databases [en]
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85095711107&doi=10.1007%2fs10619-020-07317-8&partnerID=40&md5=8bb94a248b662bdd254622330d7975fc
dc.subject: Information systems [en]
dc.subject: Software engineering [en]
dc.subject: Distributed framework [en]
dc.subject: Experimental comparison [en]
dc.subject: Improving techniques [en]
dc.subject: Internal functioning [en]
dc.subject: K nearest neighbor queries [en]
dc.subject: Partitioning methods [en]
dc.subject: Spatial processing [en]
dc.subject: Synthetic datasets [en]
dc.subject: Nearest neighbor search [en]
dc.subject: Springer [en]
dc.title: Algorithms for processing the group K nearest-neighbor query on distributed frameworks [en]
dc.type: journalArticle [en]
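
The GKNN query defined in the abstract can be illustrated with a minimal single-machine sketch. This is a brute-force baseline only, not the MapReduce, Hadoop, or SpatialHadoop algorithm of the article: it uses none of the pruning, refining, or partitioning techniques the abstract mentions. The function name `gknn_brute_force`, the choice of Euclidean distance, and the sample points are all assumptions made for illustration.

```python
import heapq
import math


def gknn_brute_force(query, training, k):
    """Return the k Training points with the smallest sum of
    Euclidean distances to every point of the Query dataset.

    Hypothetical helper: scans every Training point, with no pruning.
    """
    def dist_sum(p):
        # Sum of distances from candidate p to all Query points.
        return sum(math.dist(p, q) for q in query)

    # nsmallest returns the k candidates in ascending order of dist_sum.
    return heapq.nsmallest(k, training, key=dist_sum)


# Illustrative data (not from the article).
query = [(0.0, 0.0), (2.0, 0.0)]
training = [(1.0, 0.0), (5.0, 5.0), (0.0, 1.0), (1.0, 3.0)]
result = gknn_brute_force(query, training, 2)
print(result)
```

The scan costs one distance computation per (Training point, Query point) pair, which is the kind of work that pruning heuristics and distributed processing aim to reduce when the Training dataset is very large.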


Files in this item

Files    Size    Type    View

There are no files associated with this item.

This item appears in the following collections
