In-memory vector store with multi-metric similarity search.
pip install philiprehberger-embedding-store
from philiprehberger_embedding_store import VectorStore
store = VectorStore(dimensions=1536)
# Add vectors with metadata
store.add("doc1", embedding=[0.1, 0.2, ...], metadata={"title": "First doc"})
store.add("doc2", embedding=[0.3, 0.1, ...], metadata={"title": "Second doc"})
# Search by similarity
results = store.search(query_embedding=[0.15, 0.18, ...], top_k=5)
for result in results:
    print(f"{result.id}: score={result.score:.3f}, {result.metadata}")
Choose a metric per store or override per search call:
from philiprehberger_embedding_store import VectorStore
# Set default metric at store level
store = VectorStore(dimensions=128, metric="euclidean")
results = store.search(query, top_k=5)
# Override metric for a single search
results = store.search(query, top_k=5, metric="manhattan")
Supported metrics: "cosine" (default), "dot", "euclidean", "manhattan".
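For reference, the four metric names correspond to standard formulas. The sketch below computes them in plain Python. It is not the library's implementation, and the helper names are hypothetical. How the library turns Euclidean and Manhattan distances into a ranking score is not stated here, so the sketch returns raw distances for those two.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product: larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity in [-1, 1]; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean (L2) distance: smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """Manhattan (L1) distance: smaller means closer."""
    return sum(abs(x - y) for x, y in zip(a, b))


a, b = [1.0, 0.0], [0.0, 1.0]
for name, fn in [("cosine", cosine), ("dot", dot),
                 ("euclidean", euclidean), ("manhattan", manhattan)]:
    print(f"{name}: {fn(a, b):.3f}")
```

Note that cosine and dot are similarities (higher is better) while Euclidean and Manhattan are distances (lower is better), which matters when comparing scores across metrics.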
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("d1", [1.0, 0.0], {"category": "docs", "lang": "en"})
store.add("d2", [0.9, 0.1], {"category": "code", "lang": "en"})
# Filter by single field
results = store.search(query, filter=lambda m: m["category"] == "docs")
# Filter by multiple conditions
results = store.search(
    query,
    filter=lambda m: m["category"] == "docs" and m["lang"] == "en",
)
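The lambdas above index the metadata dict directly, so they raise `KeyError` for any entry that lacks the field. A small helper can build a safer predicate. `where` is a hypothetical name, not part of the library, and the sketch assumes only what the examples show: `filter` is called with each entry's metadata and returns a boolean.

```python
from typing import Any, Callable, Mapping, Optional

Metadata = Optional[Mapping[str, Any]]


def where(**conditions: Any) -> Callable[[Metadata], bool]:
    """Build a predicate that is true when every named field equals the given value.

    Entries with missing fields (or no metadata at all) simply do not match.
    """
    def predicate(metadata: Metadata) -> bool:
        metadata = metadata or {}
        return all(
            key in metadata and metadata[key] == value
            for key, value in conditions.items()
        )
    return predicate


docs_in_english = where(category="docs", lang="en")
print(docs_in_english({"category": "docs", "lang": "en"}))  # True
print(docs_in_english({"category": "code", "lang": "en"}))  # False
print(docs_in_english({"lang": "en"}))                      # False, no KeyError
```

Such a predicate could then be passed as `filter=where(category="docs", lang="en")`.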
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
# Add many vectors at once
store.add_many([
    ("id1", [0.1, 0.2], {"label": "first"}),
    ("id2", [0.3, 0.4], {"label": "second"}),
])
# Search with multiple queries at once
all_results = store.search_many(
    [query_embedding_1, query_embedding_2],
    top_k=5,
)
Use score() to compute the similarity between a stored entry and an arbitrary query vector without running a full top-k search. This is handy for re-ranking or one-off comparisons.
from philiprehberger_embedding_store import VectorStore
store = VectorStore(metric="cosine")
store.add("doc1", [1.0, 0.0, 0.0])
store.score("doc1", [1.0, 0.0, 0.0]) # 1.0
store.score("doc1", [0.0, 1.0, 0.0]) # ~0.0
store.score("doc1", [1.0, 1.0, 1.0], metric="dot") # 1.0
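As a rough illustration of the re-ranking use case, the sketch below re-orders a list of candidate IDs by calling `score(id, query)` on each. The `rerank` helper and `StubStore` class are hypothetical. The stub stands in for `VectorStore` so the example runs without the package, and the sort assumes a similarity metric where a higher score is better (as with cosine or dot).

```python
from typing import Dict, Iterable, List, Protocol, Sequence, Tuple


class Scorer(Protocol):
    """Anything exposing score(id, query), as VectorStore does in the example above."""
    def score(self, entry_id: str, query: Sequence[float]) -> float: ...


def rerank(store: Scorer, ids: Iterable[str],
           query: Sequence[float]) -> List[Tuple[str, float]]:
    """Score each candidate ID against the query and sort best-first."""
    scored = [(entry_id, store.score(entry_id, query)) for entry_id in ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


class StubStore:
    """Stand-in for VectorStore (assumption): scores with a plain dot product."""
    def __init__(self, vectors: Dict[str, List[float]]) -> None:
        self._vectors = vectors

    def score(self, entry_id: str, query: Sequence[float]) -> float:
        return sum(x * y for x, y in zip(self._vectors[entry_id], query))


stub = StubStore({"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]})
print(rerank(stub, ["a", "b", "c"], [1.0, 0.0]))
# [('a', 1.0), ('c', 0.5), ('b', 0.0)]
```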
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("doc1", [0.1, 0.2], {"title": "Example"})
# Save to disk
store.save("vectors.json")
# Load from disk
loaded = VectorStore.load("vectors.json")
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("a", [1.0, 0.0])
store.remove("a") # Remove by ID
store.clear() # Remove all entries
from philiprehberger_embedding_store import VectorStore
store = VectorStore(dimensions=3)
store.add("a", [1.0, 0.0, 0.0], {"version": 1})
# Replace the vector in place
store.update("a", vector=[0.0, 1.0, 0.0])
# Replace the metadata (wholesale)
store.update("a", metadata={"version": 2})
# Update both at once
store.update("a", vector=[0.0, 0.0, 1.0], metadata={"version": 3})
# Remove everything but keep the dimensionality (3) and metric configuration
store.clear()
assert len(store) == 0
store.add("b", [0.1, 0.2, 0.3]) # still constrained to 3 dimensions
| Function / Class | Description |
|---|---|
| `VectorStore(dimensions?, metric?)` | Create a store with optional dimensionality and metric |
| `add(id, embedding, metadata?)` | Add a vector with optional metadata |
| `add_many(items)` | Batch add multiple vectors |
| `search(query, top_k?, metric?, filter?, min_score?)` | Similarity search |
| `search_many(queries, top_k?, metric?, filter?, min_score?)` | Batch similarity search |
| `score(id, query, metric?)` | Compute similarity between a stored entry and a query vector |
| `get(id)` | Get entry by ID |
| `delete(id)` | Delete entry by ID |
| `remove(id)` | Remove entry by ID (alias for `delete`) |
| `update_metadata(id, metadata)` | Update metadata for an entry |
| `update(id, vector=None, metadata=None)` | Replace an entry's vector and/or metadata in place |
| `save(path)` | Save store to JSON file |
| `VectorStore.load(path)` | Load store from JSON file |
| `clear()` | Remove all entries (preserves dimensionality and metric) |
| `ids()` | List all stored IDs |
| `len(store)` | Number of entries |
| `id in store` | Check if ID exists |
| `store.size` | Number of entries (property) |
| `store.metric` | Current distance metric (property) |
pip install -e .
python -m pytest tests/ -v