Vector stores | 🦜️🔗 LangChain

📄️ Activeloop Deep Lake

Activeloop Deep Lake is a multi-modal vector store that stores embeddings and their metadata, including text, JSON, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search over embeddings and their attributes.
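To make the idea of searching over both embeddings and their attributes concrete, here is a minimal pure-Python sketch: it first filters records by metadata, then ranks the survivors by cosine similarity. The record layout and the `filtered_search` helper are hypothetical and are not the Deep Lake or LangChain API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def filtered_search(query, records, where, k=2):
    """Keep records whose metadata matches `where`, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(
        key=lambda r: cosine_similarity(query, r["embedding"]),
        reverse=True,
    )
    return candidates[:k]


# Toy data: two-dimensional embeddings with a single "type" attribute.
records = [
    {"id": "img-1", "embedding": [1.0, 0.0], "metadata": {"type": "image"}},
    {"id": "txt-1", "embedding": [0.9, 0.1], "metadata": {"type": "text"}},
    {"id": "txt-2", "embedding": [0.0, 1.0], "metadata": {"type": "text"}},
]

results = filtered_search([1.0, 0.0], records, {"type": "text"}, k=1)
print([r["id"] for r in results])
```

Here `img-1` is the closest vector overall, but the attribute filter excludes it, so the text record nearest to the query is returned instead.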

📄️ Aerospike

Aerospike Vector Search (AVS) is an

📄️ Alibaba Cloud OpenSearch

Alibaba Cloud OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises.

📄️ AnalyticDB

AnalyticDB for PostgreSQL is a massively parallel processing (MPP) data warehousing service that is designed to analyze large volumes of data online.

📄️ Annoy

Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.

📄️ Apache Doris

Apache Doris is a modern data warehouse for real-time analytics.

📄️ Astra DB

This page provides a quickstart for using Astra DB as a Vector Store.

📄️ Atlas

Atlas is a platform by Nomic made for interacting with both small and internet scale unstructured datasets. It enables anyone to visualize, search, and share massive datasets in their browser.

📄️ AwaDB

AwaDB is an AI Native database for the search and storage of embedding vectors used by LLM Applications.

📄️ Azure Cosmos DB Mongo vCore

This notebook shows you how to leverage this integrated vector database to store documents in collections, create indexes, and perform vector search queries using approximate nearest neighbor algorithms with distance metrics such as COS (cosine distance), L2 (Euclidean distance), and IP (inner product) to locate documents close to the query vectors.
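As a rough illustration of the three measures named above (COS, L2, IP), here is a minimal pure-Python sketch. The function names and sample vectors are hypothetical and are not part of the Azure Cosmos DB or LangChain APIs; a real index computes these internally.

```python
import math


def cosine_distance(a, b):
    """COS: 1 minus the cosine of the angle between a and b (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """L2: straight-line (Euclidean) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """IP: dot product of a and b (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


query = [1.0, 0.0]
doc = [3.0, 4.0]

print(cosine_distance(query, doc))
print(l2_distance(query, doc))
print(inner_product(query, doc))
```

Note that cosine and L2 are distances (smaller is closer), while inner product is a similarity score (larger is closer), so the sort order differs when ranking results.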

📄️ Azure Cosmos DB NoSQL

📄️ Azure AI Search

Azure AI Search (formerly known as Azure Search and Azure Cognitive Search) is a cloud search service that gives developers infrastructure, APIs, and tools for information retrieval of vector, keyword, and hybrid queries at scale.

📄️ Bagel

Bagel (Open Inference platform for AI) is like GitHub for AI data.

📄️ BagelDB

BagelDB (Open Vector Database for AI) is like GitHub for AI data.

📄️ Baidu Cloud ElasticSearch VectorSearch

Baidu Cloud VectorSearch is a fully managed, enterprise-level distributed search and analysis service which is 100% compatible with open source. Baidu Cloud VectorSearch provides low-cost, high-performance, and reliable retrieval and analysis platform-level product services for structured/unstructured data. As a vector database, it supports multiple index types and similarity distance methods.

📄️ Baidu VectorDB

Baidu VectorDB is a robust, enterprise-level distributed database service, meticulously developed and fully managed by Baidu Intelligent Cloud. It stands out for its exceptional ability to store, retrieve, and analyze multi-dimensional vector data. At its core, VectorDB operates on Baidu's proprietary "Mochow" vector database kernel, which ensures high performance, availability, and security, alongside remarkable scalability and user-friendliness.

📄️ Apache Cassandra

This page provides a quickstart for using Apache Cassandra® as a Vector Store.

📄️ Chroma

Chroma is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.

📄️ Clarifai

Clarifai is an AI Platform that provides the full AI lifecycle ranging from data exploration, data labeling, model training, evaluation, and inference. A Clarifai application can be used as a vector database after uploading inputs.

📄️ ClickHouse

ClickHouse is the fastest and most resource-efficient open-source database for real-time apps and analytics with full SQL support and a wide range of functions to assist users in writing analytical queries. Recently added data structures and distance search functions (like L2Distance) as well as approximate nearest neighbor search indexes enable ClickHouse to be used as a high-performance and scalable vector database to store and search vectors with SQL.

📄️ Couchbase

Couchbase is an award-winning distributed NoSQL cloud database that delivers unmatched versatility, performance, scalability, and financial value for all of your cloud, mobile, AI, and edge computing applications. Couchbase embraces AI with coding assistance for developers and vector search for their applications.

📄️ DashVector

DashVector is a fully-managed vectorDB service that supports high-dimension dense and sparse vectors, real-time insertion and filtered search. It is built to scale automatically and can adapt to different application requirements.

📄️ Databricks Vector Search

Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.

📄️ DingoDB

DingoDB is a distributed multi-mode vector database, which combines the characteristics of data lakes and vector databases, and can store data of any type and size (Key-Value, PDF, audio, video, etc.). It has real-time low-latency processing capabilities to achieve rapid insight and response, and can efficiently conduct instant analysis and process multi-modal data.

📄️ DocArray HnswSearch

DocArrayHnswSearch is a lightweight Document Index implementation provided by DocArray that runs fully locally and is best suited for small- to medium-sized datasets. It stores vectors on disk in hnswlib, and stores all other data in SQLite.

📄️ DocArray InMemorySearch

DocArrayInMemorySearch is a document index provided by DocArray that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.

📄️ Amazon DocumentDB

Amazon DocumentDB (with MongoDB Compatibility) makes it easy to set up, operate, and scale MongoDB-compatible databases in the cloud.

📄️ DuckDB

This notebook shows how to use DuckDB as a vector store.

📄️ China Mobile ECloud ElasticSearch VectorSearch

China Mobile ECloud VectorSearch is a fully managed, enterprise-level distributed search and analysis service. China Mobile ECloud VectorSearch provides low-cost, high-performance, and reliable retrieval and analysis platform-level product services for structured/unstructured data. As a vector database, it supports multiple index types and similarity distance methods.

📄️ Elasticsearch

Elasticsearch is a distributed, RESTful search and analytics engine, capable of performing both vector and lexical search. It is built on top of the Apache Lucene library.

📄️ Epsilla

Epsilla is an open-source vector database that leverages advanced parallel graph traversal techniques for vector indexing. Epsilla is licensed under GPL-3.0.

📄️ Faiss

Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.

📄️ Faiss (Async)

📄️ Google AlloyDB for PostgreSQL

AlloyDB is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. AlloyDB is 100% compatible with PostgreSQL. Extend your database application to build AI-powered experiences leveraging AlloyDB's LangChain integrations.

📄️ Google BigQuery Vector Search

Google Cloud BigQuery Vector Search lets you use GoogleSQL to do semantic search, using vector indexes for fast approximate results, or using brute force for exact results.
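The trade-off between exact and approximate results can be seen from what brute force has to do: score every stored vector against the query. The sketch below shows that exhaustive scan in pure Python. It is an illustration of the concept only, not BigQuery's implementation (which is driven through GoogleSQL); the `exact_top_k` helper and the sample data are hypothetical.

```python
import heapq
import math


def exact_top_k(query, vectors, k):
    """Compare the query with every stored vector and return the k closest.

    `vectors` maps a document id to its embedding. The cost grows linearly
    with the number of stored vectors, which is why large collections usually
    rely on an approximate index instead.
    """
    scored = ((math.dist(query, vec), doc_id) for doc_id, vec in vectors.items())
    return [(doc_id, distance) for distance, doc_id in heapq.nsmallest(k, scored)]


vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
}

top = exact_top_k([0.9, 1.0], vectors, k=2)
print([doc_id for doc_id, _ in top])
```

An approximate index skips most of these comparisons, returning results faster at the cost of occasionally missing a true nearest neighbor.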

📄️ Google Cloud SQL for MySQL

Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers PostgreSQL, MySQL, and SQL Server database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's LangChain integrations.

📄️ Google Cloud SQL for PostgreSQL

Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers PostgreSQL, MySQL, and SQL Server database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's LangChain integrations.

📄️ Firestore

Firestore is a serverless document-oriented database that scales to meet any demand. Extend your database application to build AI-powered experiences leveraging Firestore's LangChain integrations.

📄️ Google Memorystore for Redis

Google Memorystore for Redis is a fully managed service that is powered by the Redis in-memory data store to build application caches that provide sub-millisecond data access. Extend your database application to build AI-powered experiences leveraging Memorystore for Redis's LangChain integrations.

📄️ Google Spanner

Spanner is a highly scalable database that combines unlimited scalability with relational semantics, such as secondary indexes, strong consistency, schemas, and SQL, providing 99.999% availability in one easy solution.

📄️ Google Vertex AI Vector Search

This notebook shows how to use functionality related to the Google Cloud Vertex AI Vector Search vector database.

📄️ Hippo

Transwarp Hippo is an enterprise-level cloud-native distributed vector database that supports storage, retrieval, and management of massive vector-based datasets. It efficiently solves problems such as vector similarity search and high-density vector clustering. Hippo features high availability, high performance, and easy scalability. It has many functions, such as multiple vector search indexes, data partitioning and sharding, data persistence, incremental data ingestion, vector scalar field filtering, and mixed queries. It can effectively meet the high real-time search demands of enterprises for massive vector data.

📄️ Hologres

Hologres is a unified real-time data warehousing service developed by Alibaba Cloud. You can use Hologres to write, update, process, and analyze large amounts of data in real time.

📄️ Infinispan

Infinispan is an open-source key-value data grid that can run as a single node or as a distributed cluster.

📄️ Jaguar Vector Database

It is a distributed vector database.

📄️ KDB.AI

KDB.AI is a powerful knowledge-based vector database and search engine that allows you to build scalable, reliable AI applications, using real-time data, by providing advanced search, recommendation and personalization.

📄️ Kinetica

Kinetica is a database with integrated support for vector similarity search.

📄️ LanceDB

LanceDB is an open-source database for vector search built with persistent storage, which greatly simplifies retrieval, filtering, and management of embeddings. Fully open source.

📄️ Lantern

Lantern is an open-source vector similarity search for Postgres.

📄️ LLMRails

LLMRails is an API platform for building GenAI applications. It provides an easy-to-use API for document indexing and querying that is managed by LLMRails and is optimized for performance and accuracy.

📄️ ManticoreSearch VectorStore

ManticoreSearch is an open-source search engine that offers fast, scalable, and user-friendly capabilities. Originating as a fork of Sphinx Search, it has evolved to incorporate modern search engine features and improvements. ManticoreSearch distinguishes itself with its robust performance and ease of integration into various applications.

📄️ Marqo

This notebook shows how to use functionality related to the Marqo vectorstore.

📄️ Meilisearch

Meilisearch is an open-source, lightning-fast, and hyper-relevant search engine. It comes with great defaults to help developers build snappy search experiences.

📄️ Milvus

Milvus is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.

📄️ Momento Vector Index (MVI)

MVI: the most productive, easiest to use, serverless vector index for your data. To get started with MVI, simply sign up for an account. There's no need to handle infrastructure, manage servers, or be concerned about scaling. MVI is a service that scales automatically to meet your needs.

📄️ MongoDB Atlas

This notebook covers how to use MongoDB Atlas vector search in LangChain, using the langchain-mongodb package.

📄️ MyScale

MyScale is a cloud-based database optimized for AI applications and solutions, built on the open-source ClickHouse.

📄️ Neo4j Vector Index

Neo4j is an open-source graph database with integrated support for vector similarity search.

📄️ NucliaDB

You can use a local NucliaDB instance or use Nuclia Cloud.

📄️ OpenSearch

OpenSearch is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0. OpenSearch is a distributed search and analytics engine based on Apache Lucene.

📄️ Oracle AI Vector Search: Vector Store

Oracle AI Vector Search is designed for Artificial Intelligence (AI) workloads and allows you to query data based on semantics, rather than keywords.

📄️ Pathway

Pathway is an open data processing framework. It allows you to easily develop data transformation pipelines and Machine Learning applications that work with live data sources and changing data.

📄️ Postgres Embedding

Postgres Embedding is an open-source vector similarity search for Postgres that uses Hierarchical Navigable Small Worlds (HNSW) for approximate nearest neighbor search.

📄️ PGVecto.rs

This notebook shows how to use functionality related to the Postgres vector database (pgvecto.rs).

📄️ PGVector

An implementation of LangChain vectorstore abstraction using postgres as the backend and utilizing the pgvector extension.

📄️ Pinecone

Pinecone is a vector database with broad functionality.

📄️ Qdrant

Qdrant (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points, that is, vectors with an additional payload. Qdrant is tailored to extended filtering support. This makes it useful for all sorts of neural network or semantic-based matching, faceted search, and other applications.

📄️ Redis

Redis vector database introduction and LangChain integration guide.

📄️ Relyt

Relyt is a cloud native data warehousing service that is designed to analyze large volumes of data online.

📄️ Rockset

Rockset is a real-time search and analytics database built for the cloud. Rockset uses a Converged Index™ with an efficient store for vector embeddings to serve low latency, high concurrency search queries at scale. Rockset has full support for metadata filtering and handles real-time ingestion for constantly updating, streaming data.

📄️ SAP HANA Cloud Vector Engine

SAP HANA Cloud Vector Engine is a vector store fully integrated into the SAP HANA Cloud database.

📄️ ScaNN

ScaNN (Scalable Nearest Neighbors) is a method for efficient vector similarity search at scale.

📄️ SemaDB

SemaDB from SemaFind is a no fuss vector similarity database for building AI applications. The hosted SemaDB Cloud offers a no fuss developer experience to get started.

📄️ SingleStoreDB

SingleStoreDB is a robust, high-performance distributed SQL database solution designed to excel in both cloud and on-premises environments. Boasting a versatile feature set, it offers seamless deployment options while delivering unparalleled performance.

📄️ scikit-learn

scikit-learn is an open-source collection of machine learning algorithms, including some implementations of k-nearest neighbors. SKLearnVectorStore wraps this implementation and adds the possibility to persist the vector store in JSON, BSON (binary JSON), or Apache Parquet format.

📄️ SQLite-VSS

SQLite-VSS is an SQLite extension designed for vector search, emphasizing local-first operations and easy integration into applications without external servers. Leveraging the Faiss library, it offers efficient similarity search and clustering capabilities.

📄️ StarRocks

StarRocks is a High-Performance Analytical Database.

📄️ Supabase (Postgres)

Supabase is an open-source Firebase alternative. Supabase is built on top of PostgreSQL, which offers strong SQL querying capabilities and enables a simple interface with already-existing tools and frameworks.

📄️ SurrealDB

SurrealDB is an end-to-end cloud-native database designed for modern applications, including web, mobile, serverless, Jamstack, backend, and traditional applications. With SurrealDB, you can simplify your database and API infrastructure, reduce development time, and build secure, performant apps quickly and cost-effectively.

📄️ Tair

Tair is a cloud native in-memory database service developed by Alibaba Cloud.

📄️ Tencent Cloud VectorDB

Tencent Cloud VectorDB is a fully managed, self-developed, enterprise-level distributed database service designed for storing, retrieving, and analyzing multi-dimensional vector data. The database supports multiple index types and similarity calculation methods. A single index can support a vector scale of up to 1 billion and can support millions of QPS and millisecond-level query latency. Tencent Cloud Vector Database can not only provide an external knowledge base for large models to improve the accuracy of large model responses but can also be widely used in AI fields such as recommendation systems, NLP services, computer vision, and intelligent customer service.

📄️ ThirdAI NeuralDB

NeuralDB is a CPU-friendly and fine-tunable vector store developed by ThirdAI.

📄️ TiDB Vector

TiDB Cloud is a comprehensive Database-as-a-Service (DBaaS) solution that provides dedicated and serverless options. TiDB Serverless is now integrating a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly develop AI applications using TiDB Serverless without the need for a new database or additional technical stacks. Be among the first to experience it by joining the waitlist for the private beta at https://tidb.cloud/ai.

📄️ Tigris

Tigris is an open-source Serverless NoSQL Database and Search Platform designed to simplify building high-performance vector search applications.

📄️ TileDB

TileDB is a powerful engine for indexing and querying dense and sparse multi-dimensional arrays.

📄️ Timescale Vector (Postgres)

Timescale Vector is a PostgreSQL++ vector database for AI applications.

📄️ Typesense

Typesense is an open-source, in-memory search engine that you can either self-host or run on Typesense Cloud.

📄️ Upstash Vector

Upstash Vector is a serverless vector database designed for working with vector embeddings.

📄️ USearch

USearch is a smaller and faster single-file vector search engine.

📄️ Vald

Vald is a highly scalable distributed fast approximate nearest neighbor (ANN) dense vector search engine.

📄️ Intel's Visual Data Management System (VDMS)

Intel's VDMS is a storage solution for efficient access of big-"visual"-data that aims to achieve cloud scale by searching for relevant visual data via visual metadata stored as a graph and enabling machine-friendly enhancements to visual data for faster access. VDMS is licensed under MIT.

📄️ Vearch

Vearch is the vector search infrastructure for deep learning and AI applications.

📄️ Vectara

Vectara provides a Trusted Generative AI platform, allowing organizations to rapidly create a ChatGPT-like experience (an AI assistant) which is grounded in the data, documents, and knowledge that they have (technically, it is Retrieval-Augmented-Generation-as-a-service).

📄️ Vespa

Vespa is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query.

📄️ viking DB

viking DB is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.

📄️ vlite

VLite is a simple and blazing fast vector database that allows you to store and retrieve data semantically using embeddings. Made with numpy, vlite is a lightweight batteries-included database to implement RAG, similarity search, and embeddings into your projects.

📄️ Weaviate

This notebook covers how to get started with the Weaviate vector store in LangChain, using the langchain-weaviate package.

📄️ Xata

Xata is a serverless data platform, based on PostgreSQL. It provides a Python SDK for interacting with your database, and a UI for managing your data.

📄️ Yellowbrick

Yellowbrick is an elastic, massively parallel processing (MPP) SQL database that runs in the cloud and on-premises, using Kubernetes for scale, resilience, and cloud portability. Yellowbrick is designed to address the largest and most complex business-critical data warehousing use cases. The efficiency at scale that Yellowbrick provides also enables it to be used as a high-performance and scalable vector database to store and search vectors with SQL.