Asteroid Cloud

Asteroid Cloud is a memory-friendly vector database for applications that need fast similarity search at lower infrastructure cost. You bring embeddings; Asteroid stores vectors and JSON metadata, then serves k-NN search with optional metadata filters.

In the pilot, you access your Asteroid instance over HTTPS with an API key. Use the Python client or call the HTTP API directly from any language.

Your endpoint and API key are provided when your pilot is set up. Don't have them yet? Request a pilot. You choose the index dimension yourself (see Create your index).

Before you start

For your pilot instance, we provide:

an HTTPS endpoint
an API key

You bring:

the index dimension and metric (you set both once with create_index)
embeddings from your model or embedding provider
sequential integer vector IDs, assigned from 0 (see Insert & upsert vectors)
optional JSON metadata for filtering and retrieval

One pilot instance is one vector index. Insert vectors, attach metadata, and search against that index over the API.

Quickstart

Install the client, connect, create your index, insert vectors, and search. The example below uses 3-dimensional toy vectors so you can copy and run it directly. Pick one path: bring your own vectors (below) or bring text (next section) — each creates an index at its own dimension, and an instance holds one index at a time.

Install the Python client (requires Python 3.9+):

pip install lsmvec-client

from lsmvec_client import Client

client = Client(api_key="sk-live-...",
                base_url="https://api.lsmvec.com")
assert client.health()

# Create the index once, at your embedding dimension (3 here for the toy vectors).
client.create_index(dim=3, metric="l2")

client.insert(1, [0.10, 0.20, 0.30], metadata={"category": "docs"})
client.insert(2, [0.12, 0.18, 0.33], metadata={"category": "blog"})

hits = client.search([0.11, 0.19, 0.31], k=2)
for h in hits:
    print(h.id, h.distance)

# create the index once
curl -sX PUT https://api.lsmvec.com/v1/index \
  -H "Authorization: Bearer sk-live-..." \
  -H "Content-Type: application/json" \
  -d '{"dim": 3, "metric": "l2"}'

# insert
curl -sX POST https://api.lsmvec.com/v1/vectors \
  -H "Authorization: Bearer sk-live-..." \
  -H "Content-Type: application/json" \
  -d '{"id": 1, "vector": [0.10, 0.20, 0.30], "metadata": {"category": "docs"}}'

# search
curl -sX POST https://api.lsmvec.com/v1/search \
  -H "Authorization: Bearer sk-live-..." \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.11, 0.19, 0.31], "k": 2}'

One dimension per index. The toy vectors above are 3-dimensional for readability. Every inserted and searched vector must match the dimension you set with create_index. To switch dimensions (e.g. for the text path below), call delete_index() first, then create_index() again — see Create your index.

Storing text instead of vectors

Storing text? Install the embedding extra. The Python client embeds text before sending vectors to Asteroid.

A separate, 384-d index. This path uses a dim=384 index (bge-small's output). Run create_index(dim=384, metric="l2") first. This is a different index from the toy-vector Quickstart above — if you already created one at another dimension in this instance, call delete_index() first.

pip install "lsmvec-client[embed]"

from lsmvec_client import Client
from lsmvec_client.ingest import LocalEmbedder, ingest_text, search_text

client   = Client(api_key="sk-live-...", base_url="https://api.lsmvec.com")
client.create_index(dim=384, metric="l2")            # once, before any write
embedder = LocalEmbedder("BAAI/bge-small-en-v1.5")   # 384-d, runs locally

ingest_text(client, "doc-1", "Refunds are processed within 30 days.", embedder, start_id=0)
hits = search_text(client, "how long do refunds take?", embedder, k=3)
print(hits[0].text)

Full flow in Embeddings.

Create your index

Set your index's dimension and distance metric once, before your first write. The dimension must match your embedding model's output (e.g. 384 for bge-small-en-v1.5, 1536 for OpenAI text-embedding-3-small). Supported range is 1–4000, which covers common embedding sizes (384, 768, 1024, 1536, 3072). Pick the metric — l2 or cosine — to match how your embeddings were trained.

client.create_index(dim=384, metric="l2")   # "l2" or "cosine"

curl -sX PUT https://api.lsmvec.com/v1/index \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"dim": 384, "metric": "l2"}'

The dimension is fixed once set. Inserting or searching before you create the index returns 409 index_not_initialized; calling create_index again returns 409 index_already_initialized.
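
To see why the metric choice matters, here is a small sketch that computes both distances in plain Python for two candidate vectors. It does not call Asteroid. The helper names `l2_distance` and `cosine_distance` are illustrative, and the exact distance values Asteroid reports (for example squared versus unsquared L2) are not specified in this guide.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: compares direction only, ignores length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0]
same_direction_far = [10.0, 0.0]   # same direction as the query, much longer
close_but_rotated = [1.0, 1.0]     # nearby point, different direction
candidates = [same_direction_far, close_but_rotated]

# The two metrics rank the same candidates differently.
l2_ranking = sorted(candidates, key=lambda v: l2_distance(query, v))
cos_ranking = sorted(candidates, key=lambda v: cosine_distance(query, v))
```

Under L2 the rotated vector ranks first; under cosine the long, same-direction vector does. If your embedding model was trained for cosine similarity and its outputs are not normalized, an l2 index can return a different ordering than you expect.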

Check the current state

client.get_index()   # {"initialized": True, "dim": 384, "metric": "l2"}

curl -s https://api.lsmvec.com/v1/index -H "Authorization: Bearer sk-live-..."
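
Because a second create_index call returns 409, startup code may want to check the state first. The sketch below uses only get_index and create_index from this guide. The helper name `ensure_index` is hypothetical, and it assumes that get_index on an uninitialized instance returns a dict whose "initialized" value is false, which this guide does not show. `FakeClient` is an offline stand-in so the logic can be tried without an instance.

```python
def ensure_index(client, dim, metric="l2"):
    """Create the index if it is missing; otherwise verify that it matches.

    Assumes client.get_index() returns a dict with an "initialized" flag,
    plus "dim" and "metric" once the index exists.
    """
    state = client.get_index()
    if not state.get("initialized"):
        client.create_index(dim=dim, metric=metric)
        return "created"
    if state.get("dim") != dim or state.get("metric") != metric:
        # delete_index() is destructive, so this helper never calls it.
        raise ValueError(
            f"index exists with dim={state.get('dim')}, "
            f"metric={state.get('metric')}; expected dim={dim}, metric={metric}"
        )
    return "exists"


class FakeClient:
    """Offline stand-in exposing the same two methods (not the real client)."""

    def __init__(self):
        self.state = {"initialized": False}

    def get_index(self):
        return dict(self.state)

    def create_index(self, dim, metric):
        self.state = {"initialized": True, "dim": dim, "metric": metric}


fake = FakeClient()
first = ensure_index(fake, dim=384, metric="l2")    # creates the index
second = ensure_index(fake, dim=384, metric="l2")   # finds it, no second create
```

With the real client you would pass your `Client` instance instead of `FakeClient`. A mismatch raises rather than re-dimensioning, since that would delete all stored vectors.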

Re-dimension (destructive)

There is no in-place dimension change. To switch dimensions, delete the index and create a new one. This permanently removes all vectors and payloads — there is no undo.

client.delete_index()                          # wipe -> uninitialized
client.create_index(dim=768, metric="cosine")  # re-create at a new dimension

curl -sX DELETE https://api.lsmvec.com/v1/index \
  -H "Authorization: Bearer sk-live-..."

Insert & upsert vectors

insert adds a vector by ID with optional JSON metadata. upsert inserts or replaces the vector for an existing ID.

Use sequential IDs. Asteroid keys vectors by sequential, dense integer IDs — assign them in order starting at 0 (0, 1, 2, …). If your records are keyed by hashes, timestamps, or other large/sparse values, keep that mapping in your application and hand Asteroid a sequential counter; large or sparse IDs are not recommended on the pilot.

client.insert(1, vector, metadata={"title": "intro", "year": 2026})
client.upsert(1, new_vector)   # insert or replace the vector for ID 1

# insert (POST)
curl -sX POST https://api.lsmvec.com/v1/vectors \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"id": 1, "vector": [0.10, 0.20, 0.30], "metadata": {"title": "intro", "year": 2026}}'

# upsert (PUT) — replace the vector for id 1
curl -sX PUT https://api.lsmvec.com/v1/vectors/1 \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"vector": [...]}'
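
If your records are keyed by hashes or timestamps, the mapping to sequential IDs can live in a small application-side helper. This is a minimal sketch: the `IdAllocator` class is hypothetical and not part of lsmvec-client, and in production the mapping would be stored in your own database rather than in memory.

```python
class IdAllocator:
    """Maps arbitrary application keys to dense integer ids: 0, 1, 2, ..."""

    def __init__(self, next_id=0, mapping=None):
        self.next_id = next_id
        self.mapping = dict(mapping or {})

    def id_for(self, key):
        """Return the existing id for key, or allocate the next one."""
        if key not in self.mapping:
            self.mapping[key] = self.next_id
            self.next_id += 1
        return self.mapping[key]


alloc = IdAllocator()
first = alloc.id_for("sha256:9f2c")            # new key, gets id 0
second = alloc.id_for("2026-01-15T10:00:00Z")  # new key, gets id 1
again = alloc.id_for("sha256:9f2c")            # known key, same id as before
```

You would then call `client.insert(alloc.id_for(key), vector, metadata=...)` and keep the original key in the metadata if you need it back in search results.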

Fetch & delete

rec = client.get(1)       # {"id": 1, "vector": [...]}
client.delete(1)

curl -s   https://api.lsmvec.com/v1/vectors/1 -H "Authorization: Bearer sk-live-..."
curl -sX DELETE https://api.lsmvec.com/v1/vectors/1 -H "Authorization: Bearer sk-live-..."

Vectors are stored with SQ8 compression. get returns a dequantized vector, so values may differ slightly from the original input.
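
The round-trip error is easy to see with a generic 8-bit scalar quantizer. The sketch below is an illustration only: this guide does not describe Asteroid's actual SQ8 scheme (for example whether ranges are per vector or per dimension), so `sq8_encode` and `sq8_decode` are hypothetical helpers that use a simple per-vector min/max range.

```python
def sq8_encode(vector):
    """Quantize floats to 8-bit codes using the vector's own min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0   # fall back to 1.0 for constant vectors
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Map 8-bit codes back to approximate floats."""
    return [lo + c * scale for c in codes]

original = [0.10, 0.20, 0.30, 0.123456]
codes, lo, scale = sq8_encode(original)
restored = sq8_decode(codes, lo, scale)

# Each value lands on the nearest of 256 levels, so the error is at most
# half a quantization step.
max_error = max(abs(a - b) for a, b in zip(original, restored))
```

When you compare a fetched vector with the one you inserted, use a tolerance instead of exact equality.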

Metadata & payloads

Attach JSON metadata to each vector: document text, source, tags, timestamps, tenant IDs, or any fields your app needs for retrieval. Read, replace, or merge metadata without changing the vector.

client.get_payload(1)                       # -> {...}
client.set_payload(1, {"category": "docs"}) # replace
client.merge_payload(1, {"views": 42})      # merge-patch

curl -s   https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..."
curl -sX PUT   https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..." \
  -H "Content-Type: application/json" -d '{"category": "docs"}'
curl -sX PATCH https://api.lsmvec.com/v1/vectors/1/payload -H "Authorization: Bearer sk-live-..." \
  -H "Content-Type: application/json" -d '{"views": 42}'

When you insert or upsert on an existing id, the metadata argument decides what happens to the stored payload: a non-empty payload replaces it, an explicit empty {} clears it, and omitting metadata leaves it unchanged. set_payload always replaces; merge_payload applies an RFC 7396 merge-patch (a null value deletes a key).
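
To preview what a merge-patch will do before you send it, the RFC 7396 algorithm fits in a few lines. This is a local sketch of the standard algorithm, not Asteroid's server code; the function name `merge_patch` and the sample payload are illustrative.

```python
def merge_patch(target, patch):
    """Apply an RFC 7396 JSON merge-patch and return the new document."""
    if not isinstance(patch, dict):
        return patch                      # non-objects replace the target outright
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null deletes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

stored = {"category": "docs", "views": 1, "author": {"name": "Ada", "team": "infra"}}
patch = {"views": 42, "category": None, "author": {"team": "search"}}
merged = merge_patch(stored, patch)
```

Nested objects are merged key by key, while lists and scalars are replaced whole. In this example `views` is updated, `category` is removed, and `author.name` survives because the patch only touches `author.team`.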

Search

Search returns the nearest vector IDs by distance. Increase ef_search when you want higher recall and can spend more time per query.

hits = client.search(query_vector, k=10, ef_search=128)
for h in hits:
    print(h.id, h.distance)

curl -sX POST https://api.lsmvec.com/v1/search \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"vector": [...], "k": 10, "ef_search": 128}'

Filter by metadata

Combine vector search with metadata filters in one query. Supported operators:

Comparison: $eq $ne $gt $gte $lt $lte
Membership: $in $nin $contains_any $contains_all
Existence: $exists (boolean)
Logical: $and $or

Use case	Filter
Match one field	`{"source": {"$eq": "handbook"}}`
Filter by time or version	`{"year": {"$gte": 2026}}`
Match one of several tags	`{"tag": {"$in": ["docs", "support"]}}`
Combine conditions	`{"$and": [{"category": {"$eq": "docs"}}, {"year": {"$gte": 2026}}]}`

hits = client.search(
    query_vector, k=10,
    filter={"$and": [
        {"category": {"$eq": "docs"}},
        {"year": {"$gte": 2026}},
    ]},
)

curl -sX POST https://api.lsmvec.com/v1/search \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"vector": [...], "k": 10,
       "filter": {"$and": [{"category": {"$eq": "docs"}},
                           {"year": {"$gte": 2026}}]}}'
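
When debugging a filter it can help to evaluate it locally against a sample payload. The matcher below is a sketch under assumed semantics, since this guide lists the operators but does not define edge cases. In particular it assumes that a missing field fails every operator except `$exists`, and that `$contains_any` and `$contains_all` apply to list-valued fields. The function names are hypothetical and the server remains the source of truth.

```python
_OPS = {
    "$eq": lambda v, x: v == x,
    "$ne": lambda v, x: v != x,
    "$gt": lambda v, x: v > x,
    "$gte": lambda v, x: v >= x,
    "$lt": lambda v, x: v < x,
    "$lte": lambda v, x: v <= x,
    "$in": lambda v, x: v in x,
    "$nin": lambda v, x: v not in x,
    "$contains_any": lambda v, x: any(item in v for item in x),
    "$contains_all": lambda v, x: all(item in v for item in x),
}

def _field_matches(payload, field, cond):
    """Check every operator given for one field."""
    for op, operand in cond.items():
        if op == "$exists":
            if (field in payload) != operand:
                return False
        elif field not in payload:
            return False   # assumption: a missing field fails other operators
        elif not _OPS[op](payload[field], operand):
            return False
    return True

def matches(payload, flt):
    """Return True if payload satisfies flt (local check, assumed semantics)."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(payload, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(payload, sub) for sub in cond):
                return False
        elif not _field_matches(payload, key, cond):
            return False
    return True

flt = {"$and": [{"category": {"$eq": "docs"}}, {"year": {"$gte": 2026}}]}
hit = matches({"category": "docs", "year": 2026}, flt)
miss = matches({"category": "blog", "year": 2026}, flt)
```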

Embeddings

The Python client embeds text before sending vectors to Asteroid. It chunks documents, embeds each chunk, and stores the vectors with payloads.

For raw documents, start here with ingest_text. Use Bulk & batch import only when you already have vectors.

pip install "lsmvec-client[embed]"

Connect

from lsmvec_client import Client
client = Client(api_key="sk-live-...",
                base_url="https://api.lsmvec.com")   # your pilot endpoint
assert client.health()
client.create_index(dim=384, metric="l2")   # once per index; bge-small is 384-d

If this instance already has an index at another dimension, call delete_index() first — see Create your index.

Choose an embedder

Local embedding model (default). Runs in the Python client with FastEmbed.

from lsmvec_client.ingest import LocalEmbedder
embedder = LocalEmbedder("BAAI/bge-small-en-v1.5")   # 384-d

Embedding provider API (alternative). Calls an OpenAI-compatible embeddings endpoint.

from lsmvec_client.ingest import ProviderEmbedder
embedder = ProviderEmbedder(
    dim=1536, api_key="...",
    model="text-embedding-3-small")

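Dimension must match. Your index dimension must match the embedder's output dimension: 384 for bge-small-en-v1.5, 1536 for text-embedding-3-small. Set create_index(dim=…) to that dimension. The Connect step above creates a 384-d index, so if you use the provider example, create the index with dim=1536 instead.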

Ingest a document

from pathlib import Path
from lsmvec_client.ingest import ingest_text

next_id, doc_chunks = 0, {}          # see the persistence note below
ids = ingest_text(client, "handbook-2026", Path("handbook.md").read_text(),
                  embedder, start_id=next_id,
                  payload={"source": "handbook", "year": 2026})
next_id += len(ids)
doc_chunks["handbook-2026"] = ids

ingest_text chunks the text, embeds each chunk, and stores its vector plus a payload {doc_id, chunk_index, text, model, …your fields}; it returns the assigned ids.

Persist your id counter. Asteroid ids are a contiguous range you allocate via start_id. Save next_id and the doc_id → ids map in your own database — if you restart ingestion from start_id=0 you will overwrite the vectors you stored before.

Ingest multiple documents

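from pathlib import Path
from lsmvec_client.ingest import ingest_text

# client, embedder, next_id, and doc_chunks carry over from the previous steps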

for path in sorted(Path("corpus").glob("*.md")):
    doc_id = path.stem
    ids = ingest_text(
        client,
        doc_id,
        path.read_text(),
        embedder,
        start_id=next_id,
        payload={"source": str(path)},
    )
    next_id += len(ids)
    doc_chunks[doc_id] = ids

Store next_id and doc_chunks in your own database before the next run.

For larger imports, keep using the same id counter. If you already have embeddings, use insert_batch.
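
One way to keep next_id and doc_chunks between runs is a small state file. The sketch below uses a JSON file as a stand-in for your own database; the helper names `save_state` and `load_state` and the file layout are assumptions for illustration, not part of lsmvec-client.

```python
import json
import os
import tempfile

def save_state(path, next_id, doc_chunks):
    """Write the id counter and the doc_id -> ids map atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"next_id": next_id, "doc_chunks": doc_chunks}, f)
    os.replace(tmp, path)   # atomic rename: readers never see a partial file

def load_state(path):
    """Return (next_id, doc_chunks); start fresh if no state file exists."""
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        state = json.load(f)
    return state["next_id"], state["doc_chunks"]
```

Call `load_state` before the ingestion loop and `save_state` after each document, so a crash mid-run never leaves the counter behind the ids already written.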

Retrieve

from lsmvec_client.ingest import search_text

hits = search_text(client, "what is the refund policy?", embedder,
                   k=5, filter={"source": {"$eq": "handbook"}})
context = "\n".join(h.text for h in hits)
# answer = your_llm(question, context)

search_text embeds the query, runs filtered k-NN, and returns hits enriched with the stored payload: h.text, h.doc_id, h.chunk_index, h.payload.

Update or delete a document

from lsmvec_client.ingest import delete_doc
delete_doc(client, doc_chunks["handbook-2026"])   # remove the doc's chunks
# re-ingest with fresh ids to update

Document updates are delete-then-ingest. Chunk counts can change, so do not assume old ids still match the new document.

Chunking

Control chunk size and overlap via max_tokens (default 512) and overlap_tokens (default 64) on ingest_text.
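
To get a feel for how max_tokens and overlap_tokens interact, here is a sliding-window sketch over a list of tokens that is already split. It is an illustration under assumptions: the client's real tokenizer and boundary rules are not documented here, and `chunk_tokens` is a hypothetical helper, not the function ingest_text uses.

```python
def chunk_tokens(tokens, max_tokens=512, overlap_tokens=64):
    """Split tokens into windows of max_tokens that share overlap_tokens."""
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be >= 0 and < max_tokens")
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break   # this window already reaches the end of the text
    return chunks

tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)           # defaults: 512-token windows, 64 shared
sizes = [len(c) for c in chunks]
```

With the defaults each window advances by 448 tokens, so a 1000-token document yields three chunks, and the last 64 tokens of one chunk reappear at the start of the next. A larger overlap preserves more context across boundaries at the cost of more vectors per document.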

Asteroid does not embed text server-side. Embedding happens in the Python client, using either a local model or your embedding provider API.

Bulk & batch import

There are two build modes. Choose based on whether the index is empty.

Bulk build (in-memory build). The first load of a new, empty index. It loads all vectors at once, so it is the fastest initial load and produces a higher-quality index. (For very large loads we'll help size the pilot.)
Incremental build. An existing index, streaming data, or ongoing updates. It writes vectors through insert or insert_batch.

Bulk build (in-memory build)

bulk_build builds a new, empty index from all vectors at once — use it for the fastest initial load.

import numpy as np
vectors = np.random.rand(100_000, 128).astype(np.float32)
report = client.bulk_build(vectors, payloads=[{"source": "handbook"} for _ in vectors], threads=4)
print(report)   # {'n':…, 'vectors_per_sec':…, 'payloads_written':…}

# raw float32 body; N and dim in headers
curl -sX POST https://api.lsmvec.com/v1/build/bulk \
  -H "Authorization: Bearer sk-live-..." \
  -H "X-LSMVec-N: 100000" -H "X-LSMVec-Dim: 128" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @vectors.f32

Initial-load only. bulk_build requires an empty index and assigns ids in input order, starting at 0. It can attach payloads, but it cannot keep existing ids.
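
The curl example posts a raw float32 file. The sketch below shows one way to produce such a file with the standard library. The byte layout is an assumption: this guide only says "raw float32 body", so little-endian, row-major, no header is a guess you should confirm for your pilot. `write_f32` is a hypothetical helper.

```python
import os
import struct
import tempfile

def write_f32(path, vectors):
    """Write rows as little-endian float32, row-major, with no header."""
    dim = len(vectors[0])
    with open(path, "wb") as f:
        for row in vectors:
            if len(row) != dim:
                raise ValueError("all vectors must have the same dimension")
            f.write(struct.pack(f"<{dim}f", *row))
    return len(vectors), dim

path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
n, dim = write_f32(path, [[0.10, 0.20, 0.30], [0.12, 0.18, 0.33]])
size = os.path.getsize(path)   # n * dim * 4 bytes
```

The returned `n` and `dim` are the values to send in the X-LSMVec-N and X-LSMVec-Dim headers. Row order matters, because bulk_build assigns ids in input order starting at 0.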

Incremental build

Incremental build adds vectors through normal writes. Use it when the index already has data, when memory is limited, or when vectors arrive over time.

insert writes one vector (see Insert & upsert vectors).
insert_batch writes many vectors in fewer HTTP requests. Use it when you already have embeddings and want to keep your own ids and payloads.

items = [(i, vec, {"source": "handbook"}) for i, vec in enumerate(vectors)]
n = client.insert_batch(items, chunk_size=1000)   # one request per chunk

curl -sX POST https://api.lsmvec.com/v1/vectors/batch \
  -H "Authorization: Bearer sk-live-..." -H "Content-Type: application/json" \
  -d '{"items":[{"id":0,"vector":[...],"metadata":{"source":"handbook"}}]}'

Choose a build mode

New, empty index → bulk_build.
Existing index, streaming data, or ongoing updates → incremental build.
Incremental build with many vectors → insert_batch instead of many single insert calls.

Import existing vectors

If you already have embeddings in another system, you can import them into Asteroid without changing your embedding model.

Export your vectors from the current system.
Use the same embedding dimension when you call create_index.
Insert records with your existing numeric IDs, or use bulk build for a fresh, empty-index load.
Store document text, source, tags, or other retrieval fields as JSON metadata.
Search with the same query embeddings your application already uses.

For pilot workloads, we can help review your current data format and choose between incremental build and bulk build.

Troubleshooting

Symptom	What to check
`401 Unauthorized`	Check that the request includes `Authorization: Bearer <api-key>`.
`400` on insert or search	Check that the vector has the dimension you set with `create_index`.
`409 Conflict` on insert or search	The index has not been created yet, so it has no dimension. Call `create_index` first. (`409` on `create_index` itself means the index is already initialized; call `delete_index` first if you need to re-dimension, which deletes all data.)
`404 Not Found`	The vector ID or payload does not exist yet.
`413 Payload Too Large`	Reduce the request size or split the load into smaller requests.
`429 Rate Limited`	Retry with backoff, or contact us if the pilot needs a higher request rate.
Bulk build fails	Bulk build is for a new, empty index. Use incremental build for existing indexes.
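
For the 429 case, retry with backoff can be sketched as a small wrapper. How the Python client surfaces a 429 is not documented here, so the wrapper takes an `is_retryable` predicate that you supply. `with_backoff` and the `RateLimited` exception are hypothetical names used for illustration.

```python
import random
import time

def with_backoff(call, is_retryable, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


class RateLimited(Exception):
    """Stand-in for however your HTTP layer reports a 429."""


attempts = []

def flaky():
    """Fails twice with RateLimited, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited()
    return "ok"

delays = []   # record the waits instead of actually sleeping
result = with_backoff(flaky, lambda e: isinstance(e, RateLimited), sleep=delays.append)
```

In real use you would wrap a call such as `lambda: client.search(query_vector, k=10)` and leave `sleep` at its default. The jitter spreads out retries from concurrent workers so they do not hit the rate limit in lockstep.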
