SliceNBound: Solving closest pairs and distance join queries in Apache Spark
Date
2017
Language
en
Abstract
The (K) Closest-Pair(s) Query (KCPQ) consists of finding the (K) closest pair(s) of objects between two spatial datasets. Recently, several systems that enhance Apache Spark with spatial awareness have been presented, providing a variety of queries for spatial computation, but not the KCPQ. Since queries differ in nature and no single processing technique fits all cases, we need specialized algorithms for specific queries that exploit the power of parallel systems such as Apache Spark. This paper addresses the problem of answering the KCPQ in Apache Spark by presenting such a specialized, fast algorithm that can easily be imported into any Spark-based system, whether spatially oriented or general-purpose. Furthermore, it presents a variant of this algorithm that solves the Distance Join Query. Experiments and comparison with other solutions indicate that our method is fast and efficient. © 2017, Springer International Publishing AG.
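To make the two query definitions concrete, the following is a minimal single-machine sketch in Python. It is a brute-force reference only, not the SliceNBound algorithm and not Spark code. The function names `kcpq_brute_force` and `distance_join_brute_force` are hypothetical, points are assumed to be coordinate tuples under Euclidean distance, and the Distance Join Query is assumed to mean "all pairs within a distance threshold `eps`", which the abstract does not spell out.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def kcpq_brute_force(P, Q, k):
    """Return the k closest (distance, p, q) triples with p in P and q in Q.

    Illustrative O(|P| * |Q|) baseline; every cross-dataset pair is examined.
    """
    pairs = ((dist(p, q), p, q) for p, q in product(P, Q))
    return heapq.nsmallest(k, pairs)


def distance_join_brute_force(P, Q, eps):
    """Return all (distance, p, q) triples with distance <= eps, sorted.

    Assumed definition of the distance join: a threshold instead of a count.
    """
    return sorted(
        (d, p, q) for p, q in product(P, Q) if (d := dist(p, q)) <= eps
    )


# Two tiny example datasets (made up for illustration).
P = [(0, 0), (5, 5)]
Q = [(1, 0), (5, 7), (10, 10)]

closest_two = kcpq_brute_force(P, Q, 2)
within_two = distance_join_brute_force(P, Q, 2.0)
```

A baseline like this is mainly useful for checking the output of a faster, partitioned implementation on small inputs, since it examines every pair.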
Related items
Showing items related by title, author, creator and subject.
-
Extended OQL for object oriented parallel query processing
Fountoukis, S. G.; Bekakos, M. P. (2007) Herein, an extension to the object query language (OQL) for incorporating binary relational expressions is investigated. The extended query language is suitable for query submissions to an object oriented database, whose ...
-
The K group nearest-neighbor query on non-indexed RAM-resident data
Roumelis G., Vassilakopoulos M., Corral A., Manolopoulos Y. (2016) Data sets that are used for answering a single query only once (or just a few times) before they are replaced by new data sets appear frequently in practical applications. The cost of building indexes to accelerate query ...
-
Nearest Neighbor Algorithms using xBR-Trees
Roumelis, G.; Vassilakopoulos, M.; Corral, A. (2011) One of the common queries in spatial databases is the (K) Nearest Neighbor Query that discovers the (K) closest objects to a query object. Processing of spatial queries, in most cases, is accomplished by indexing spatial ...