SimoneZeng/awesome-vector-ANN-search-papers

awesome-vector-ANN-search-papers

A list of papers in the field of approximate nearest neighbor search on high-dimensional vectors.

We organize papers by category and by research group. Additional reading references and notes for ANN study are coming soon.

Let's dive into ANN search and vector databases!

Introduction of ANN search

The Pinecone and Zilliz websites host many great articles on vector databases, covering core components, deep dives, use cases, and ML foundations.

If you read Chinese and want to learn the introductory concepts and techniques of ANN search over vector datasets, here is a blog that covers most of the concepts a beginner needs.

Papers Refined with Categories

This section lists papers grouped by category. Where a method has a common abbreviation, it is given after the title, marked with Abbr.

0. Upcoming papers

Here are the papers on vector search accepted at SIGMOD2025. We look forward to following these papers and categorizing them correctly once they are public!

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| SymphonyQG: towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search | SIGMOD2025 | Gou et al. | |
| DEG: Efficient Hybrid Vector Search Using the Dynamic Edge Navigation Graph | SIGMOD2025 | Yin et al. | |
| Tribase: A Vector Data Query Engine for Reliable and Lossless Pruning Compression using Triangle Inequalities | SIGMOD2025 | Xu et al. | |
| Graph-Based Vector Search: An Experimental Evaluation of the State-of-the-Art | SIGMOD2025 | Azizi et al. | |
| Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search | SIGMOD2025 | Cai et al. | |
| Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search | SIGMOD2025 | Wei et al. | |

1. Graph-based

This category collects papers that propose purely graph-based methods, that is, methods that do not combine a graph with any of the other three categories (tree-based, hash-based, quantization-based).

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| Approximate nearest neighbor algorithm based on navigable small world graphs (Abbr. NSW) | IS2014 | Malkov et al. | link |
| Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs (Abbr. HNSW) | TPAMI2018 | Malkov et al. | link |
| Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph (Abbr. NSG) | VLDB2019 | Fu et al. | link |
| High Dimensional Similarity Search with Satellite System Graph: Efficiency, Scalability, and Unindexed Query Compatibility (Abbr. NSSG) | TPAMI2021 | Fu et al. | link |
| Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination | SIGMOD2020 | Li et al. | link |
| Learning to Route in Similarity Graphs | ICML2019 | Baranchuk et al. | link |
| Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search | 2024 | Lu et al. | link |
| Reinforcement Routing on Proximity Graph for Efficient Recommendation | TOIS2023 | Feng et al. | link |
| Steiner-Hardness: A Query Hardness Measure for Graph-Based ANN Indexes | VLDB2025 | Wang et al. | link |
| RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search | VLDB2024 | Chen et al. | link |
| ARKGraph: All-Range Approximate K-Nearest-Neighbor Graph | VLDB2023 | Zuo et al. | link |
| EFANNA: An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph | 2016 | Fu et al. | link |
| FANNG: Fast Approximate Nearest Neighbour Graphs | CVPR2016 | Harwood et al. | link |

2. Combining graph and other categories

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| ELPIS: Graph-Based Similarity Search for Scalable Data Science | VLDB2023 | Azizi et al. | link |
| Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces (Abbr. LSH-APG) | VLDB2023 | Zhao et al. | link |
| HVS: hierarchical graph structure based on voronoi diagrams for solving approximate nearest neighbor search | VLDB2021 | Lu et al. | link |
| Routing-Guided Learned Product Quantization for Graph-Based Approximate Nearest Neighbor Search | ICDE2024 | Yue et al. | link |

3. Partition-based and Distributed

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| Learning Space Partitions for Nearest Neighbor Search (Abbr. Neural LSH) | ICLR2020 | et al. | link |
| BLISS: A Billion scale Index using Iterative Re-partitioning | SIGKDD2022 | Gupta et al. | link |
| Learning Balanced Tree Indexes for Large-Scale Vector Retrieval | SIGKDD2023 | Li et al. | link |
| Learned Probing Cardinality Estimation for High-Dimensional Approximate NN Search | ICDE2023 | Zheng et al. | link |
| Learning-based query optimization for multi-probe approximate nearest neighbor search | VLDBJ2023 | Zhang et al. | link |
| SOAR: Improved Indexing for Approximate Nearest Neighbor Search | NeurIPS2023 | Sun et al. | link |
| Optimizing the Number of Clusters for Billion-Scale Quantization-Based Nearest Neighbor Search | TKDE2024 | Fu et al. | link |
| DIMS: Distributed Index for Similarity Search in Metric Spaces | TKDE2024 | Zhu et al. | link |
| Efficient Distributed Approximate k-Nearest Neighbor Graph Construction by Multiway Random Division Forest | KDD2023 | Kim et al. | link |
| Odyssey: A Journey in the Land of Distributed Data Series Similarity Search | VLDB2023 | Chatzakis et al. | link |

4. Quantization-based

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search | SIGMOD2024 | Gao et al. | link |
| Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search | 2024 | Gao et al. | link |
| Similarity Search in the Blink of an Eye with Compressed Indices | VLDB2023 | Aguerrebere et al. | link |
| Model-enhanced Vector Index | NeurIPS2023 | Zhang et al. | link |
| Knowledge Distillation for High Dimensional Search Index | NeurIPS2024 | Lu et al. | link |
| Accelerating Large-Scale Inference with Anisotropic Vector Quantization (Abbr. ScaNN) | ICML2020 | Guo et al. | link |

5. Hash-based

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| DB-LSH: Locality-Sensitive Hashing with Query-based Dynamic Bucketing | ICDE2023 | Tian et al. | link |
| DB-LSH 2.0: Locality-Sensitive Hashing With Query-Based Dynamic Bucketing | TKDE2023 | Tian et al. | link |
| MP-RW-LSH: an efficient multi-probe LSH solution to ANNS-L1 | VLDB2021 | Wang et al. | link |
| PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search | VLDB2022 | Zheng et al. | link |
| Query-aware locality-sensitive hashing for approximate nearest neighbor search | VLDB2015 | Huang et al. | link |
| LIDER: an efficient high-dimensional learned index for large-scale dense passage retrieval | VLDB2022 | Wang et al. | link |
| DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search | VLDB2024 | Wei et al. | link |

6. Tree-based

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| DIDS: Double Indices and Double Summarizations for Fast Similarity Search | VLDB2024 | Hu et al. | link |
| Adaptive Indexing in High-Dimensional Metric Spaces | VLDB2023 | Lampropoulos et al. | link |
| Hercules Against Data Series Similarity Search | VLDB2022 | Echihabi et al. | link |
| Scalable Nearest Neighbor Algorithms for High Dimensional Data | TPAMI2014 | Muja et al. | link |
| iSAX: indexing and mining terabyte sized time series | SIGKDD2008 | Shieh et al. | link |

7. Disk available

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node | NeurIPS2019 | Subramanya et al. | link |
| DiskANN++: Efficient Page-based Search over Isomorphic Mapped Graph Index using Query-sensitivity Entry Vertex | 2023 | Ni et al. | link |
| Filtered-DiskANN: Graph Algorithms for Approximate Nearest Neighbor Search with Filters | Web2023 | Gollapudi et al. | link |
| FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search | 2021 | Singh et al. | link |
| SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search | NeurIPS2021 | Chen et al. | link |
| SPFresh: Incremental In-Place Update for Billion-Scale Vector Search | SOSP2023 | Xu et al. | link |

8. Survey and benchmark

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search | VLDB2021 | Wang et al. | link |
| ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms | PPoPP2024 | Manohar et al. | link |
| Deep Learning for Approximate Nearest Neighbour Search: A Survey and Future Directions | TKDE2022 | Li et al. | link |
| Survey of Vector Database Management Systems | VLDB2024 | Pan et al. | link |

9. Hybrid query

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search | SIGMOD2024 | Zuo et al. | link |
| ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data | SIGMOD2024 | Patel et al. | link |
| Filtered-DiskANN: Graph Algorithms for Approximate Nearest Neighbor Search with Filters | Web2023 | Gollapudi et al. | link |
| High-Throughput Vector Similarity Search in Knowledge Graphs | SIGMOD2023 | Mohoney et al. | link |
| Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints | 2022 | Wang et al. | link |
| AnalyticDB-V: a hybrid analytical engine towards query fusion for structured and unstructured data | VLDB2020 | Wei et al. | link |

10. Computation acceleration

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations (Abbr. ADSampling) | SIGMOD2023 | Gao et al. | link |
| Accelerating Graph-based Vector Search via Delayed-Synchronization Traversal | 2024 | Jiang et al. | link |
| Juno: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping | ASPLOS2024 | Liu et al. | link |
| FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search | Web2023 | Chen et al. | link |
| Relative NN-Descent: A Fast Index Construction for Graph-Based Approximate Nearest Neighbor Search | MM2023 | Ono et al. | link |
| AdANNS: A Framework for Adaptive Semantic Search | NeurIPS2023 | Rege et al. | link |

11. Vector database system

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| Milvus: A Purpose-Built Vector Data Management System | SIGMOD2021 | Wang et al. | link |
| Manu: A Cloud Native Vector Database Management System | VLDB2022 | Guo et al. | link |
| Vexless: A Serverless Vector Data Management System Using Cloud Functions | SIGMOD2024 | Su et al. | link |
| Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment | SIGMOD2024 | Wang et al. | link |
| Efficient Approximate Nearest Neighbor Search in Multi-dimensional Databases | SIGMOD2023 | Peng et al. | link |
| VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity | OSDI2023 | Zhang et al. | link |
| LANNS: a web-scale approximate nearest neighbor lookup system | VLDB2021 | Doshi et al. | link |
| SingleStore-V: An Integrated Vector Database System in SingleStore | VLDB2024 | Chen et al. | link |

12. Theoretical

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| Graph-based Nearest Neighbor Search: From Practice to Theory | ICML2020 | Prokhorenkova et al. | link |

13. Multi-metric spaces

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces | ICDE2024 | Zhu et al. | link |

14. Reverse kANN

| Title | Venue | Authors | Link |
| --- | --- | --- | --- |
| Efficient Reverse k Approximate Nearest Neighbor Search Over High-Dimensional Vectors | ICDE2024 | Song et al. | link |

Papers Refined with Research Groups

This section lists papers grouped by research group.

  • Yury Malkov, OpenAI

    1. Approximate nearest neighbor algorithm based on navigable small world graphs (IS2014), Malkov et al. [link]
    2. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs (TPAMI2018), Malkov et al. [link]
  • Zilliz

    1. Milvus: A Purpose-Built Vector Data Management System (SIGMOD2021), Wang et al. [link]
    2. Manu: A Cloud Native Vector Database Management System (VLDB2022), Guo et al. [link]
    3. Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment (SIGMOD2024), Wang et al. [link]
  • Qi Chen, Microsoft Research

    1. SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search (NeurIPS2021), Chen et al. [link]
    2. SPFresh: Incremental In-Place Update for Billion-Scale Vector Search (SOSP2023), Xu et al. [link]
    3. VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity (OSDI2023), Zhang et al. [link]
  • Dong Deng, Rutgers University

    1. SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search (SIGMOD2024), Zuo et al. [link]
    2. ARKGraph: All-Range Approximate K-Nearest-Neighbor Graph (VLDB2023), Zuo et al. [link]
  • Cong Fu, Zhejiang University

    1. EFANNA: An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph (2016), Fu et al. [link]
    2. Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph (VLDB2019), Fu et al. [link]
  • Kejing Lu, Nagoya University

    1. Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search (2024), Lu et al. [link]
    2. HVS: hierarchical graph structure based on voronoi diagrams for solving approximate nearest neighbor search (VLDB2021), Lu et al. [link]
  • Defu Lian, University of Science and Technology of China

    1. Knowledge Distillation for High Dimensional Search Index (NeurIPS2024), Lu et al. [link]
    2. Reinforcement Routing on Proximity Graph for Efficient Recommendation (TOIS2023), Feng et al. [link]
    3. Learning Balanced Tree Indexes for Large-Scale Vector Retrieval (SIGKDD2023), Li et al. [link]
  • Themis Palpanas, LIPADE, Université Paris Cité

    1. Steiner-Hardness: A Query Hardness Measure for Graph-Based ANN Indexes (VLDB2025), Wang et al. [link]
    2. ELPIS: Graph-Based Similarity Search for Scalable Data Science (VLDB2023), Azizi et al. [link]
    3. Hercules Against Data Series Similarity Search (VLDB2022), Echihabi et al. [link]
    4. DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search (VLDB2024), Wei et al. [link]
  • Guoliang Li, Tsinghua University

    1. Survey of Vector Database Management Systems (VLDB2024), Pan et al. [link]
  • Bin Cui, Peking University

    1. Model-enhanced Vector Index (NeurIPS2023), Zhang et al. [link]
  • Xi Zhao, Xiaofang Zhou, Hong Kong University of Science and Technology

    1. Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces (VLDB2023), Zhao et al. [link]
    2. DB-LSH: Locality-Sensitive Hashing with Query-based Dynamic Bucketing (ICDE2023), Tian et al. [link]
    3. DB-LSH 2.0: Locality-Sensitive Hashing With Query-Based Dynamic Bucketing (TKDE2023), Tian et al. [link]
  • Xiaoliang Xu, Yuxiang Wang, Hangzhou Dianzi University

    1. Routing-Guided Learned Product Quantization for Graph-Based Approximate Nearest Neighbor Search (ICDE2024), Yue et al. [link]
    2. DiskANN++: Efficient Page-based Search over Isomorphic Mapped Graph Index using Query-sensitivity Entry Vertex (2023), Ni et al. [link]
    3. Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints (2022), Wang et al. [link]
    4. Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment (SIGMOD2024), Wang et al. [link]
  • Jianyang Gao, Cheng Long, Nanyang Technological University

    1. RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search (SIGMOD2024), Gao et al. [link]
    2. High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations (SIGMOD2023), Gao et al. [link]
  • Wei Wang, Fudan University

    1. Steiner-Hardness: A Query Hardness Measure for Graph-Based ANN Indexes (VLDB2025), Wang et al. [link]
  • Wei Wang, Hong Kong University of Science and Technology (Guangzhou)

    1. Deep Learning for Approximate Nearest Neighbour Search: A Survey and Future Directions (TKDE2022), Li et al. [link]
  • Pengcheng Zhang, Bin Yao, Shanghai Jiao Tong University

    1. Learning-based query optimization for multi-probe approximate nearest neighbor search (VLDBJ2023), Zhang et al. [link]
  • Bolong Zheng, Huazhong University of Science and Technology

    1. Learned Probing Cardinality Estimation for High-Dimensional Approximate NN Search (ICDE2023), Zheng et al. [link]
    2. PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search (VLDB2022), Zheng et al. [link]
  • Jianguo Wang, Purdue University

    1. Vexless: A Serverless Vector Data Management System Using Cloud Functions (SIGMOD2024), Su et al. [link]
  • Lu Chen, Zhejiang University

    1. HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces (ICDE2024), Zhu et al. [link]
  • Bin Yao, Shanghai Jiao Tong University

    1. Efficient Reverse k Approximate Nearest Neighbor Search Over High-Dimensional Vectors (ICDE2024), Song et al. [link]
