Libraries, databases, and models for generating, storing, and searching dense vector embeddings — the backbone of semantic search, RAG pipelines, and similarity-based retrieval.
Resources
| Field | Category | Date | Link | Notes |
|---|---|---|---|---|
| Embedding Models | Libraries | 2026 | gte-pure-C | Pure C implementation of the GTE-small text embedding model (dependency-free, 384-dim) focused on semantic similarity and search |
| Embedding Models | Libraries | 2024 | txtai | Embeddings database for semantic search, LLM orchestration, and RAG pipelines |
| Search Retrieval | Libraries | 2026 | nndex | High-performance Rust nearest-neighbour vector search with Python bindings, SIMD/rayon CPU backend and wgpu GPU backend |
| Search Retrieval | Tools | 2026 | doppelgangers | Clusters GitHub issues and PRs with embeddings and UMAP for visual triage |
| Vector Databases | Libraries | 2026 | Zvec | In-process vector database built on Alibaba’s Proxima engine for low-latency similarity search |
| Vector Databases | Libraries | 2025 | OctaneDB | Lightweight Python vector database with HNSW indexing and flexible storage options |
| Vector Databases | Libraries | 2024 | tinkerbird | Vector database atop IndexedDB for in-browser semantic search |
| Vector Databases | Libraries | 2023 | pgvector-python | Python client for using Postgres as a vector database via pgvector |