PANENE: A Progressive Algorithm for Indexing and Querying Approximate k-Nearest Neighbors
- PMID: 30222575
- DOI: 10.1109/TVCG.2018.2869149
Abstract
We present PANENE, a progressive algorithm for approximate nearest neighbor indexing and querying. Although the use of k-nearest neighbor (KNN) libraries is common in many data analysis methods, most KNN algorithms can only be queried when the whole dataset has been indexed, i.e., they are not online. Even the few online implementations are not progressive: the time to index incoming data is not bounded, so they cannot satisfy the latency requirements of progressive systems. This long latency has significantly limited the use of many machine learning methods, such as t-SNE, in interactive visual analytics. PANENE is a novel algorithm for Progressive Approximate k-NEarest NEighbors, enabling fast KNN queries while continuously indexing new batches of data. Following the progressive computation paradigm, PANENE operations can be bounded in time, allowing analysts to access running results within an interactive latency. PANENE can also incrementally build and maintain a cache data structure, a KNN lookup table, to enable constant-time lookups for KNN queries. Finally, we present three progressive applications of PANENE: regression, density estimation, and responsive t-SNE, opening up new opportunities to use complex algorithms in interactive systems.
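The idea of indexing data in budgeted batches while keeping a KNN lookup table up to date can be illustrated with a minimal sketch. This is not the PANENE algorithm or its API: the class name `ProgressiveKNN`, the methods `step` and `neighbors`, and the `ops` budget are all hypothetical, and exact brute-force search stands in for the approximate index. The budget here is a number of points per step, a rough proxy for a time bound; with brute force the cost of a step still grows with the number of points already indexed, which is the kind of cost a truly time-bounded method has to avoid.

```python
import heapq
import math


class ProgressiveKNN:
    """Toy progressive KNN lookup table (brute force, NOT the PANENE index)."""

    def __init__(self, data, k):
        self.data = data      # full dataset: list of coordinate tuples
        self.k = k            # number of neighbors cached per point
        self.n_indexed = 0    # points [0, n_indexed) are indexed so far
        self.table = {}       # point id -> sorted list of (distance, neighbor id)

    def step(self, ops):
        """Index up to `ops` more points and return how many were added."""
        end = min(self.n_indexed + ops, len(self.data))
        for i in range(self.n_indexed, end):
            dists = []
            for j in range(i):
                d = math.dist(self.data[i], self.data[j])
                dists.append((d, j))
                # The new point may displace an entry in j's cached row.
                row = self.table[j]
                row.append((d, i))
                row.sort()
                del row[self.k:]
            # Cache the k nearest already-indexed points for the new point.
            self.table[i] = heapq.nsmallest(self.k, dists)
        added = end - self.n_indexed
        self.n_indexed = end
        return added

    def neighbors(self, i):
        """Look up the cached neighbor ids of an already-indexed point i."""
        return [j for _, j in self.table[i]]
```

A caller would invoke `step` repeatedly from an interactive loop and read running results from `neighbors` between calls. The cached rows change as more data arrives: a point's neighbors after an early step are only its nearest neighbors among the points indexed so far.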
