/ vearch Public
forked from vearch/vearch

distributed vector search for AI-native applications


mrc1119/vearch

 
 


Overview

Vearch is a cloud-native distributed vector database for efficient similarity search of embedding vectors in your AI applications.

Key features

Hybrid search

Combines vector search with scalar filtering.
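
To make the idea concrete, here is a minimal pure-Python sketch of hybrid search: documents are first filtered on a scalar field, and the survivors are ranked by cosine similarity to the query vector. It does not use Vearch or its API. The `hybrid_search` helper, the `price` field, and the sample data are all hypothetical, and a real engine would use an index rather than a linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_vec, predicate, limit=3):
    """Hypothetical helper: apply the scalar filter, then rank by vector similarity."""
    candidates = [d for d in docs if predicate(d)]
    candidates.sort(key=lambda d: cosine(d["vector"], query_vec), reverse=True)
    return [d["id"] for d in candidates[:limit]]


# Illustrative documents: one scalar field ("price") and one vector field.
docs = [
    {"id": "a", "price": 10, "vector": [1.0, 0.0]},
    {"id": "b", "price": 50, "vector": [0.9, 0.1]},
    {"id": "c", "price": 15, "vector": [0.0, 1.0]},
    {"id": "d", "price": 12, "vector": [0.7, 0.7]},
]

# "b" is close to the query but is excluded by the scalar filter.
result = hybrid_search(docs, [1.0, 0.0], lambda d: d["price"] < 20, limit=2)
print(result)  # ['a', 'd']
```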

Performance

Fast vector retrieval: search across millions of objects in milliseconds.

Scalability & Reliability

Replication and elastic scale-out.

Documentation

Quick start

Install Vearch

Install the chart from the Helm repository

$ helm repo add vearch https://vearch.github.io/vearch-helm
$ helm repo update && helm install my-release vearch/vearch

Install the chart from a local clone

$ git clone https://github.com/vearch/vearch-helm.git
$ cd vearch-helm
$ helm install my-release ./charts -f ./charts/values.yaml

Start with docker-compose

$ cd cloud
$ cp ../config/config.toml .
$ docker-compose up

Deploy with Docker

To get started quickly with the Vearch Docker image, see DeployByDocker | Docker build and deployment (Chinese-language guide).

Compile from source

To compile Vearch from source, see SourceCompileDeployment | Source compilation and deployment (Chinese-language guide).

APIs and Use Cases

LowLevelAPI

VisualSearchAPI

  • APIVisualSearch.md: Vearch can be used to build a complete visual search system that indexes billions of images. This also requires the image retrieval plugin for object detection and feature extraction. For more information, see Quickstart.md.

PythonSDKAPI

  • APIPythonSDK.md: The Vearch Python SDK lets you use Vearch locally. Install it with `pip install vearch`.

Components

Vearch Architecture

[Figure: Vearch architecture diagram]

Master: Responsible for schema management, cluster-level metadata, and resource coordination.
Router: Provides the RESTful API (`upsert`, `delete`, `search`, and `query`) and handles request routing and result merging.
PartitionServer (PS): Hosts document partitions with Raft-based replication.
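
The result-merging step can be pictured with a small sketch. It is an illustration only, not Vearch's Router code: it assumes each partition returns its hits as `(score, doc_id)` tuples already sorted by descending score, and the `merge_top_k` helper and sample data are hypothetical.

```python
import heapq
from itertools import islice


def merge_top_k(partition_results, k):
    """Merge per-partition hit lists into one global top-k list.

    Assumes every input list is sorted by descending score, so a lazy
    k-way merge only has to look at the head of each list.
    """
    merged = heapq.merge(*partition_results, key=lambda hit: -hit[0])
    return list(islice(merged, k))


# Hypothetical hits from three partitions, as (score, doc_id) tuples.
partition_results = [
    [(0.95, "doc-7"), (0.60, "doc-2")],
    [(0.90, "doc-4"), (0.85, "doc-9")],
    [(0.70, "doc-1")],
]

top = merge_top_k(partition_results, 3)
print(top)  # [(0.95, 'doc-7'), (0.9, 'doc-4'), (0.85, 'doc-9')]
```

Because the merge is lazy, it stops after `k` hits instead of sorting every result from every partition.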

Gamma is the core vector search engine, built on Faiss. It stores, indexes, and retrieves vectors and scalars.

Reference

If you use Vearch in a research paper, please cite:

@misc{li2019design,
      title={The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform}, 
      author={Jie Li and Haifeng Liu and Chuanghua Gui and Jianyu Chen and Zhenyun Ni and Ning Wang},
      year={2019},
      eprint={1908.07389},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}

Community

You can report bugs or ask questions on the repository's issues page.

For public discussion of Vearch, or for questions, you can also email vearch-maintainers@groups.io.

Our Slack: https://vearchwrokspace.slack.com

Known Users

You are welcome to register your company name in issue vearch#230 (listed in order of registration).

Users

License

Licensed under the Apache License, Version 2.0. For details, see LICENSE and NOTICE.


Languages

  • Go 44.5%
  • C++ 38.3%
  • Python 13.1%
  • Jupyter Notebook 1.2%
  • CMake 1.1%
  • Shell 0.8%
  • Other 1.0%