Needlestack

Needlestack is a distributed vector search microservice.

Features

  • gRPC server for kNN vector search

  • Shard vectors over multiple nodes

  • Replicate shards over multiple nodes

  • Retrieve vectors by ID
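To make the sharded search idea concrete, here is a minimal sketch in plain Python. It is not Needlestack's implementation: the function names (search_shard, merged_search) and the in-memory dictionaries standing in for shards are assumptions for illustration. Each shard answers the query on its own, and a merge step keeps the k closest hits overall.

```python
import heapq
import math


def search_shard(shard, query, k):
    """Brute-force kNN over one shard; returns up to k (distance, id) pairs."""
    scored = ((math.dist(query, vec), vec_id) for vec_id, vec in shard.items())
    return heapq.nsmallest(k, scored)


def merged_search(shards, query, k):
    """Query every shard independently, then keep the k closest hits overall."""
    partials = [search_shard(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for part in partials for hit in part))


# Hypothetical toy data: three shards, each mapping a vector ID to a 2-D vector.
shards = [
    {"a": (0.0, 0.0), "b": (5.0, 5.0)},
    {"c": (1.0, 0.0), "d": (9.0, 9.0)},
    {"e": (0.0, 2.0)},
]
results = merged_search(shards, query=(0.0, 0.0), k=3)
```

Because every shard returns its own top k, the merged top k matches what a search over the unsharded data would return. A real deployment would replace the brute-force loop with an index and the list of dictionaries with remote calls.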

Limitations

The current beta builds have limitations that make them difficult to use in production. These should be addressed in future builds.

Caveats

  • Vectors must be manually sharded, indexed, and serialized to disk as protobufs

  • Faiss is the only kNN library currently supported
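Since sharding is manual for now, the partitioning step might look like the sketch below. The hashing scheme and the names shard_for and partition are assumptions, not part of Needlestack; indexing each shard and serializing it to protobuf are left out because they depend on the project's own formats.

```python
import zlib


def shard_for(vector_id, num_shards):
    """Map a vector ID to a shard number with a stable hash (CRC32)."""
    return zlib.crc32(vector_id.encode("utf-8")) % num_shards


def partition(vectors, num_shards):
    """Split an {id: vector} mapping into num_shards smaller mappings."""
    shards = [{} for _ in range(num_shards)]
    for vector_id, vec in vectors.items():
        shards[shard_for(vector_id, num_shards)][vector_id] = vec
    return shards


# Hypothetical toy data: ten 2-D vectors keyed by string IDs.
vectors = {f"vec-{i}": (float(i), float(i) * 2) for i in range(10)}
shards = partition(vectors, num_shards=3)
```

A stable hash (rather than Python's built-in hash, which is randomized per process for strings) keeps the assignment reproducible across runs, so the same ID always lands in the same shard.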

Quickstart

Get started with the examples in this repo!

Start the Docker containers that run the Needlestack services. This runs examples/run_merger.py and examples/run_searcher.py in containers.

docker-compose up merger-grpc searcher-grpc1 searcher-grpc2 searcher-grpc3

Create local index data and send it to the Needlestack services. This runs examples/indexing_job.py to create dummy data, then runs examples/add_collections.py to send it to the Needlestack services.

docker-compose run --rm make-test-data

Access the gRPC endpoints at localhost:50051
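Before wiring up a gRPC client, it can help to confirm that something is listening on that port. The sketch below uses only the Python standard library and checks TCP reachability. It does not speak gRPC or call any Needlestack API, and the helper name endpoint_reachable is an assumption.

```python
import socket


def endpoint_reachable(host="localhost", port=50051, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example: call endpoint_reachable() once the containers are up.
```

A True result only shows that the port accepts connections; whether the service behind it is healthy still has to be checked with an actual gRPC request.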