Asteroid Cloud
Asteroid Cloud is a memory-friendly vector database for AI workloads like RAG and semantic search. You bring embeddings; Asteroid stores vectors and metadata, then serves k-NN search with filters.
In the pilot, you access it over HTTPS with an API key.
Use the Python client or call the HTTP API directly:
- Python client: pip install lsmvec-client
- HTTP API: a plain REST API you can call with curl or any language.
Your endpoint, API key, and vector dimension are provided when your pilot is set up. Don't have them yet? Request a pilot.
Quickstart
Install the client, connect, insert vectors, and search. Your vector
dimension D is fixed when your pilot instance is provisioned.
Install the Python client (or skip it and use curl / any
HTTP client):
pip install lsmvec-client
from lsmvec_client import Client
client = Client(api_key="sk-live-...",
base_url="https://api.lsmvec.com")
assert client.health()
# insert a few D-dimensional vectors (bring your own embeddings)
client.insert(1, [0.10, 0.20, ...], metadata={"category": "docs"})
client.insert(2, [0.11, 0.19, ...], metadata={"category": "blog"})
# search
hits = client.search([0.10, 0.20, ...], k=10)
for h in hits:
    print(h.id, h.distance)
# insert
curl -sX POST https://api.lsmvec.com/v1/vectors \
-H "Authorization: Bearer sk-live-..." \
-H "Content-Type: application/json" \
-d '{"id": 1, "vector": [0.10, 0.20, ...], "metadata": {"category": "docs"}}'
# search
curl -sX POST https://api.lsmvec.com/v1/search \
-H "Authorization: Bearer sk-live-..." \
-H "Content-Type: application/json" \
-d '{"vector": [0.10, 0.20, ...], "k": 10}'
No collection setup. One pilot instance is one index. Insert vectors and search.
Insert & upsert vectors
insert adds a vector by id with optional JSON metadata.
upsert replaces the vector for an id (insert-or-replace). IDs
are 64-bit unsigned integers.
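If your documents are keyed by strings (file paths, URLs, chunk names), you need a stable way to turn them into 64-bit unsigned integers. The sketch below is one possible approach, not part of the Asteroid client: the helper name `to_vector_id` and the example key are hypothetical, and it uses an 8-byte BLAKE2b digest from the Python standard library.

```python
import hashlib

UINT64_MAX = 2**64 - 1

def to_vector_id(key: str) -> int:
    """Map an arbitrary string key to a stable 64-bit unsigned integer.

    Hypothetical helper: hashes the key and reads the 8-byte digest
    as a big-endian unsigned integer.
    """
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

# The same key always maps to the same id, so re-ingesting a document
# can reuse the id it had before.
doc_id = to_vector_id("handbook.pdf#chunk-12")
assert 0 <= doc_id <= UINT64_MAX
```

Hashing can in principle collide, so if you need a guaranteed one-to-one mapping, keep your own key-to-id table instead.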
client.insert(1, vector, metadata={"title": "intro", "year": 2026})
client.upsert(1, new_vector) # replace the vector for id 1
# insert (POST)
curl -sX POST https://api.lsmvec.com/v1/vectors \
-H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
-d '{"id": 1, "vector": [...], "metadata": {"title": "intro", "year": 2026}}'
# upsert (PUT) — replace the vector for id 1
curl -sX PUT https://api.lsmvec.com/v1/vectors/1 \
-H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
-d '{"vector": [...]}'
Fetch & delete
rec = client.get(1) # {"id": 1, "vector": [...]}
client.delete(1)
curl -s https://api.lsmvec.com/v1/vectors/1 -H "Authorization: Bearer sk-live-..."
curl -sX DELETE https://api.lsmvec.com/v1/vectors/1 -H "Authorization: Bearer sk-live-..."
Vectors are stored with SQ8 compression. get returns a
dequantized vector, so values may differ slightly from the original input.
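To get a feel for how large "slightly" can be, here is a minimal local sketch of 8-bit scalar quantization. It assumes a simple per-vector min/max scheme chosen for illustration; Asteroid's actual SQ8 parameters are not described here and may differ. The function name `sq8_roundtrip` is hypothetical.

```python
import numpy as np

def sq8_roundtrip(vec: np.ndarray) -> np.ndarray:
    """Quantize a vector to 8-bit codes and back.

    Illustrative scheme only (per-vector min/max scaling), not
    necessarily what Asteroid does internally.
    """
    lo, hi = float(vec.min()), float(vec.max())
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for constant vectors
    codes = np.round((vec - lo) / scale).astype(np.uint8)
    return (codes.astype(np.float32) * scale + lo).astype(np.float32)

rng = np.random.default_rng(0)
original = rng.random(128, dtype=np.float32)
restored = sq8_roundtrip(original)

# One quantization step; the round-trip error is about half a step at most.
step = float(original.max() - original.min()) / 255.0
max_err = float(np.abs(original - restored).max())

# Compare with a tolerance, never with exact equality.
assert np.allclose(original, restored, atol=step)
```

In practice this means tests that read a vector back with get should compare against the input with a tolerance rather than with `==`.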
Metadata & payloads
Attach JSON metadata to each vector: text, source, tags, prices, timestamps, or any fields your app needs. Read, replace, or merge it without changing the vector.
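The difference between replacing and merging a payload is easiest to see locally. The sketch below assumes the merge operation follows JSON Merge Patch semantics (RFC 7396), where nested objects merge recursively and a null value removes a key. That is an assumption based on the "merge-patch" label, not confirmed behavior, and `json_merge_patch` is a hypothetical local helper, not a client method.

```python
def json_merge_patch(target, patch):
    """Apply an RFC 7396 style merge patch and return the new value.

    Assumed semantics: dicts merge recursively, None deletes a key,
    anything else overwrites.
    """
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)          # null removes the key
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result

payload = {"category": "docs", "stats": {"views": 1, "likes": 3}}

# Merge: only the named fields change; "likes" survives, "category" is removed.
merged = json_merge_patch(payload, {"stats": {"views": 42}, "category": None})

# Replace: the new payload is the whole payload.
replaced = {"stats": {"views": 42}}
```

If the service treats nulls or nested objects differently, adjust the helper accordingly before relying on it in tests.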
client.get_payload(1) # -> {...}
client.set_payload(1, {"category": "docs"}) # replace
client.merge_payload(1, {"views": 42}) # merge-patch
curl -s https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..."
curl -sX PUT https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..." \
-H "Content-Type: application/json" -d '{"category": "docs"}'
curl -sX PATCH https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..." \
-H "Content-Type: application/json" -d '{"views": 42}'
Search
Search returns the nearest vector ids by distance. Increase
ef_search for higher recall at higher latency.
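One way to choose ef_search is to measure recall on a sample: compute exact neighbors by brute force and compare them with the ids the service returns. The sketch below runs entirely locally and makes two assumptions that are not stated in this guide: distance is squared Euclidean, and vector ids equal row indices of the sample array. The helper names are hypothetical.

```python
import numpy as np

def exact_top_k(vectors: np.ndarray, query: np.ndarray, k: int) -> list[int]:
    """Row indices of the k nearest vectors by squared Euclidean distance."""
    dists = ((vectors - query) ** 2).sum(axis=1)
    return np.argsort(dists)[:k].tolist()

def recall_at_k(approx_ids, exact_ids) -> float:
    """Fraction of the true neighbors that the approximate result contains."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)

sample = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]], dtype=np.float32)
query = np.array([0.0, 0.0], dtype=np.float32)

truth = exact_top_k(sample, query, k=2)

# In a real check, approx_ids would come from the service, for example:
#   approx_ids = [h.id for h in client.search(query, k=2, ef_search=128)]
approx_ids = [0, 2]  # stand-in result for illustration
recall = recall_at_k(approx_ids, truth)
```

Repeating this over a few hundred queries at several ef_search values gives a recall/latency curve to pick from.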
hits = client.search(query_vector, k=10, ef_search=128)
for h in hits:
    print(h.id, h.distance)
curl -sX POST https://api.lsmvec.com/v1/search \
-H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
-d '{"vector": [...], "k": 10, "ef_search": 128}'
Filter by metadata
Combine vector search with metadata filters in one query. Supported operators:
- Comparison: $eq, $ne, $gt, $gte, $lt, $lte
- Membership: $in, $nin, $contains_any, $contains_all
- Existence: $exists (boolean)
- Logical: $and, $or
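To reason about what a filter will match before sending it, a small local evaluator can help. The sketch below is a reference model under assumptions, not the server's implementation: it assumes multiple conditions in one object are combined with AND, that $contains_any and $contains_all apply to list-valued fields, and that a missing field fails every operator except $exists. The function names are hypothetical.

```python
import operator

_COMPARE = {"$eq": operator.eq, "$ne": operator.ne, "$gt": operator.gt,
            "$gte": operator.ge, "$lt": operator.lt, "$lte": operator.le}

def matches(metadata: dict, flt: dict) -> bool:
    """Return True if metadata satisfies the filter (assumed semantics)."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        elif not _field_matches(metadata, key, cond):
            return False
    return True

def _field_matches(metadata: dict, field: str, cond: dict) -> bool:
    present = field in metadata
    value = metadata.get(field)
    for op, arg in cond.items():
        if op == "$exists":
            ok = present == bool(arg)
        elif not present:
            ok = False  # assumption: missing fields fail all other operators
        elif op in _COMPARE:
            ok = _COMPARE[op](value, arg)
        elif op == "$in":
            ok = value in arg
        elif op == "$nin":
            ok = value not in arg
        elif op == "$contains_any":
            ok = any(item in value for item in arg)
        elif op == "$contains_all":
            ok = all(item in value for item in arg)
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True

doc = {"category": "docs", "year": 2026, "tags": ["a", "b"]}
flt = {"$and": [{"category": {"$eq": "docs"}}, {"year": {"$gte": 2026}}]}
is_match = matches(doc, flt)
```

A model like this is useful in unit tests for filter-building code; verify edge cases such as missing fields against the live service, since those semantics are assumed here.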
hits = client.search(
    query_vector, k=10,
    filter={"$and": [
        {"category": {"$eq": "docs"}},
        {"year": {"$gte": 2026}},
    ]},
)
curl -sX POST https://api.lsmvec.com/v1/search \
-H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
-d '{"vector": [...], "k": 10,
"filter": {"$and": [{"category": {"$eq": "docs"}},
{"year": {"$gte": 2026}}]}}'
Bulk import (initial load)
Two ways to load data, with different memory profiles:
- Incremental (on-disk). insert/upsert one vector at a time, with small, flat memory use. Use it for streaming writes and updates to an existing instance.
- Bulk import (in-memory). For a new, empty instance, build the whole index in one pass (RNN-Descent), which is faster for a large initial load.
Memory note. Bulk import builds the index in memory, so it uses more RAM during the build. Afterward — and for every insert and query — Asteroid's footprint stays small and flat as usual.
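If you use the HTTP endpoint rather than the Python client, the request body is a raw float32 file. The sketch below shows one way to produce such a file with NumPy and derive the N and dim header values from the array shape. It assumes a row-major, little-endian layout, which this guide does not spell out, and it writes to a temporary directory purely for illustration.

```python
import os
import tempfile

import numpy as np

vectors = np.random.rand(1_000, 128).astype(np.float32)

# Assumed layout: row-major, little-endian float32, no header bytes.
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
np.ascontiguousarray(vectors, dtype="<f4").tofile(path)

# Header values for the bulk request come straight from the array shape.
n, dim = vectors.shape
headers = {"X-LSMVec-N": str(n), "X-LSMVec-Dim": str(dim)}

# Sanity check: 4 bytes per float32 value.
expected_bytes = n * dim * 4
assert os.path.getsize(path) == expected_bytes
```

Checking the file size against N x dim x 4 before uploading catches shape and dtype mistakes early.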
import numpy as np
vectors = np.random.rand(100_000, 128).astype(np.float32)
report = client.bulk_build(vectors, threads=4)
print(report) # {'n': ..., 'elapsed_ms': ..., 'vectors_per_sec': ...}
# raw float32 body; N and dim in headers
curl -sX POST https://api.lsmvec.com/v1/build/bulk \
-H "Authorization: Bearer sk-live-..." \
-H "X-LSMVec-N: 100000" -H "X-LSMVec-Dim: 128" \
-H "Content-Type: application/octet-stream" \
--data-binary @vectors.f32
Build a RAG app
For RAG, you chunk files and create embeddings with your own model. Asteroid stores the vectors and metadata, then serves filtered k-NN search.
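Chunking is left to your own code. As a starting point, here is a minimal fixed-size character chunker with overlap. The `Chunk` dataclass, the `chunk_text` function, and the default sizes are all hypothetical choices for illustration; the only requirement taken from this guide is that each chunk carries a text and a source.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str

def chunk_text(text: str, source: str, size: int = 500, overlap: int = 50) -> list[Chunk]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [Chunk(text[start:start + size], source)
            for start in range(0, len(text), step)]

# Small sizes here only to make the windows easy to see.
demo = chunk_text("abcdefghij", source="handbook", size=4, overlap=1)
demo_texts = [c.text for c in demo]
```

Character windows are the simplest option; splitting on sentences or headings usually gives better retrieval, at the cost of more code.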
from lsmvec_client import Client
# bring your own embedding model, e.g. fastembed (bge-small-en-v1.5, 384-dim)
from fastembed import TextEmbedding
embed = TextEmbedding("BAAI/bge-small-en-v1.5")
client = Client(api_key="sk-live-...", base_url="https://api.lsmvec.com")
# 1. chunk your documents locally (your code), then embed + insert
for i, chunk in enumerate(chunks):
    vec = list(embed.embed([chunk.text]))[0]
    client.insert(i, vec, metadata={"text": chunk.text, "source": chunk.source})
# 2. at query time: embed the question, retrieve with an optional filter
qvec = list(embed.embed([question]))[0]
hits = client.search(qvec, k=5, filter={"source": {"$eq": "handbook"}})
# 3. fetch the chunk text from payloads and feed it to your LLM
context = "\n".join(client.get_payload(h.id)["text"] for h in hits)
# answer = your_llm(question, context)
Bring your own embeddings. Asteroid stores and searches vectors; it does not embed text server-side.
Next: Asteroid Database (LSM-Vec) — embed or self-host the engine · Reference & Deployment.