Asteroid Cloud

Asteroid Cloud is a memory-friendly vector database for AI workloads like RAG and semantic search. You bring embeddings; Asteroid stores vectors and metadata, then serves k-NN search with filters.

In the pilot, you access it over HTTPS with an API key.

Use the Python client or call the HTTP API directly.

Your endpoint, API key, and vector dimension are provided when your pilot is set up. Don't have them yet? Request a pilot.

Quickstart

Install the client, connect, insert vectors, and search. Your vector dimension D is fixed when your pilot instance is provisioned.

Install the Python client (or skip it and use curl / any HTTP client):

pip install lsmvec-client

Then connect, insert a few vectors, and search:

from lsmvec_client import Client

client = Client(api_key="sk-live-...",
                base_url="https://api.lsmvec.com")
assert client.health()

# insert a few D-dimensional vectors (bring your own embeddings)
client.insert(1, [0.10, 0.20, ...], metadata={"category": "docs"})
client.insert(2, [0.11, 0.19, ...], metadata={"category": "blog"})

# search
hits = client.search([0.10, 0.20, ...], k=10)
for h in hits:
    print(h.id, h.distance)

No collection setup. One pilot instance is one index. Insert vectors and search.

Insert & upsert vectors

insert adds a vector by id with optional JSON metadata. upsert inserts the vector if the id is new and replaces the stored vector otherwise. IDs are 64-bit unsigned integers.

client.insert(1, vector, metadata={"title": "intro", "year": 2026})
client.upsert(1, new_vector)   # replace the vector for id 1
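If your application identifies documents by string keys rather than integers, one option is to derive a 64-bit id from the key. The sketch below is a local helper, not part of lsmvec_client: to_u64_id and check_u64_id are hypothetical names, and the choice of BLAKE2b is an assumption. Hash collisions are possible in principle, so keep the original key in metadata if you need to detect them.

```python
import hashlib

U64_MAX = 2**64 - 1

def to_u64_id(key: str) -> int:
    """Map an arbitrary string key to a 64-bit unsigned integer id."""
    # An 8-byte digest always fits in the unsigned 64-bit range.
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def check_u64_id(value: int) -> int:
    """Return value unchanged if it fits in an unsigned 64-bit integer."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"id must be an int, got {type(value).__name__}")
    if not 0 <= value <= U64_MAX:
        raise ValueError(f"id {value} is outside the range 0..2**64-1")
    return value

doc_id = to_u64_id("handbook/intro.md#chunk-3")
check_u64_id(doc_id)
```

The same key always maps to the same id, so re-running an import with upsert overwrites earlier vectors instead of duplicating them.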

Fetch & delete

rec = client.get(1)       # {"id": 1, "vector": [...]}
client.delete(1)

Vectors are stored with SQ8 compression. get returns a dequantized vector, so values may differ slightly from the original input.
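To see why a dequantized vector differs slightly from the input, here is a minimal sketch of 8-bit scalar quantization in general. It assumes one min/max range per vector; Asteroid's actual SQ8 scheme (for example, per-dimension ranges) is not described here and may differ. sq8_encode and sq8_decode are illustrative names, not client functions.

```python
def sq8_encode(vector):
    """Quantize floats to one byte each using the vector's own min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0   # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vector)
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Map byte codes back to approximate float values."""
    return [lo + c * scale for c in codes]

original = [0.10, 0.20, -0.35, 0.80]
codes, lo, scale = sq8_encode(original)
restored = sq8_decode(codes, lo, scale)

# Each value is off by at most half a quantization step (scale / 2).
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(restored, max_error)
```

When comparing a fetched vector against your original, use a tolerance rather than exact equality.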

Metadata & payloads

Attach JSON metadata to each vector: text, source, tags, prices, timestamps, or any fields your app needs. Read, replace, or merge it without changing the vector.

client.get_payload(1)                       # -> {...}
client.set_payload(1, {"category": "docs"}) # replace
client.merge_payload(1, {"views": 42})      # merge-patch
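The difference between replace and merge is easiest to see locally. The sketch below implements JSON Merge Patch as defined in RFC 7396. That merge_payload follows these exact semantics on the server, including null removing a key, is an assumption based only on the "merge-patch" comment above; merge_patch is a local illustration, not a client function.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) to target and return the result."""
    if not isinstance(patch, dict):
        return patch                      # a non-object patch replaces the target
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null removes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

payload = {"category": "docs", "stats": {"views": 1, "likes": 3}}
patched = merge_patch(payload, {"stats": {"views": 42}, "category": None})
print(patched)   # {'stats': {'views': 42, 'likes': 3}}
```

A replace, by contrast, would discard every field not present in the new payload.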

Search

Search returns the ids of the nearest vectors along with their distances. Increase ef_search for higher recall at higher latency.

hits = client.search(query_vector, k=10, ef_search=128)
for h in hits:
    print(h.id, h.distance)
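To choose an ef_search value, it can help to measure recall against an exact brute-force search on a small sample. The sketch below is self-contained and runs locally. It assumes Euclidean distance, which this page does not confirm as Asteroid's metric, and exact_knn and recall_at_k are hypothetical helper names.

```python
import math
import random

def exact_knn(query, vectors, k):
    """Return ids of the k nearest vectors by Euclidean distance (brute force)."""
    ranked = sorted(vectors, key=lambda vid: math.dist(query, vectors[vid]))
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)

random.seed(0)
vectors = {i: [random.random() for _ in range(8)] for i in range(200)}
query = [random.random() for _ in range(8)]

truth = exact_knn(query, vectors, k=10)
approx = truth[:9] + [-1]   # stand-in for ids returned by a search; one miss
print(recall_at_k(approx, truth))   # 0.9
```

In practice you would replace approx with the ids from client.search at a given ef_search and repeat over many queries to get an average.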

Filter by metadata

Combine vector search with metadata filters in one query. The example below combines $eq and $gte conditions under $and:

hits = client.search(
    query_vector, k=10,
    filter={"$and": [
        {"category": {"$eq": "docs"}},
        {"year": {"$gte": 2026}},
    ]},
)
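The following local evaluator shows how a filter of this shape can be read against a single metadata record. It covers only the three operators used in the example above. Treating a missing field as a non-match is an assumption rather than documented server behavior, and matches is a hypothetical helper for testing filters offline.

```python
def matches(metadata, flt):
    """Evaluate a filter dict against one metadata dict (local illustration only)."""
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter in the list must match.
            if not all(matches(metadata, sub) for sub in cond):
                return False
            continue
        if key not in metadata:
            return False                  # assumption: missing field never matches
        value = metadata[key]
        for op, operand in cond.items():
            if op == "$eq":
                ok = value == operand
            elif op == "$gte":
                ok = value >= operand
            else:
                raise ValueError(f"operator {op!r} is not covered by this sketch")
            if not ok:
                return False
    return True

flt = {"$and": [{"category": {"$eq": "docs"}}, {"year": {"$gte": 2026}}]}
print(matches({"category": "docs", "year": 2026}, flt))   # True
print(matches({"category": "blog", "year": 2027}, flt))   # False
```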

Bulk import (initial load)

There are two ways to load data, with different memory profiles: incremental insert calls and a one-time bulk_build.

Memory note. Bulk import builds the index in memory, so it uses more RAM during the build. After the build, and for every insert and query, Asteroid's memory footprint stays small and flat.

import numpy as np
# shape (N, D); 128 stands in for your instance's dimension D
vectors = np.random.rand(100_000, 128).astype(np.float32)
report = client.bulk_build(vectors, threads=4)
print(report)   # {'n': ..., 'elapsed_ms': ..., 'vectors_per_sec': ...}
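Before a bulk build, a rough size estimate of the input can help with planning. The arithmetic below covers only the raw vector data. The extra RAM used by the index during the build is not quantified on this page, so treat the result as a lower bound. raw_vector_bytes is a hypothetical helper.

```python
def raw_vector_bytes(n, dim, bytes_per_value=4):
    """Size of n vectors of dim values, ignoring index and metadata overhead."""
    return n * dim * bytes_per_value

n, dim = 100_000, 128
float32_mb = raw_vector_bytes(n, dim) / 1e6                      # float32 input array
sq8_mb = raw_vector_bytes(n, dim, bytes_per_value=1) / 1e6       # one byte per value
print(f"float32 input: {float32_mb:.1f} MB, SQ8 codes: {sq8_mb:.1f} MB")
# float32 input: 51.2 MB, SQ8 codes: 12.8 MB
```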

Build a RAG app

For RAG, you chunk files and create embeddings with your own model. Asteroid stores the vectors and metadata, then serves filtered k-NN search.

from lsmvec_client import Client
# bring your own embedding model, e.g. fastembed (bge-small-en-v1.5, 384-dim)
from fastembed import TextEmbedding

embed = TextEmbedding("BAAI/bge-small-en-v1.5")
client = Client(api_key="sk-live-...", base_url="https://api.lsmvec.com")

# 1. chunk your documents locally (your code), then embed + insert
# `chunks` is your own list of objects with .text and .source attributes
for i, chunk in enumerate(chunks):
    vec = list(embed.embed([chunk.text]))[0]
    client.insert(i, vec, metadata={"text": chunk.text, "source": chunk.source})

# 2. at query time: embed the question, retrieve with an optional filter
qvec = list(embed.embed([question]))[0]
hits = client.search(qvec, k=5, filter={"source": {"$eq": "handbook"}})

# 3. fetch the chunk text from payloads and feed it to your LLM
context = "\n".join(client.get_payload(h.id)["text"] for h in hits)
# answer = your_llm(question, context)

Bring your own embeddings. Asteroid stores and searches vectors; it does not embed text server-side.
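The RAG example above expects chunk objects with text and source attributes. One simple way to produce them is a fixed-size character window with overlap, sketched below. Chunk and chunk_text are hypothetical names, and the size and overlap defaults are arbitrary; sentence-aware or token-aware splitting may suit your documents better.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str

def chunk_text(text, source, size=500, overlap=50):
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last window is not fully covered by the previous one.
    return [
        Chunk(text=text[start:start + size], source=source)
        for start in range(0, max(len(text) - overlap, 1), step)
    ]

chunks = chunk_text("lorem ipsum " * 100, source="handbook", size=500, overlap=50)
print(len(chunks), len(chunks[0].text))   # 3 500
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk.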

Next: Asteroid Database (LSM-Vec) — embed or self-host the engine · Reference & Deployment.