In-memory k nearest neighbor GPU-based query processing

The k Nearest Neighbor (k-NN) algorithm is widely used for classification in several application domains (medicine, economy, entertainment, etc.). Let a group of query points, for each of which we need to compute the k-NNs within a reference dataset to derive the dominating feature class. When the reference points volume is extremely big, it can be proved challenging to deliver low latency results. Furthermore, when the query points are originating from streams, the need for new methods arises to address the computational overhead. We propose and implement two in-memory GPU-based algorithms for the k-NN query, using the CUDA API and the Thrust library. The first one is based on a Brute Force approach and the second one is using heuristics to minimize the reference points near a query point. We also present an extensive experimental comparison against existing algorithms, using synthetic and real datasets. The results show that both of our algorithms outperform these algorithms, in terms of execution time as well as total volume of in-memory reference points that can be handled. © 2020 by SCITEPRESS - Science and Technology Publications, Lda.All rights reserved.

URI

http://hdl.handle.net/11615/80564

Collections

Δημοσιεύσεις σε περιοδικά, συνέδρια, κεφάλαια βιβλίων κλπ. [19705]