Usage Guide
===========

Embenx is designed to be simple for prototyping yet robust enough for research-grade agentic memory. This guide covers core retrieval and serialization, advanced retrieval features, evaluation and benchmarking, and synthetic data generation.

Core Retrieval
--------------

The primary interface is the ``Collection`` class. It provides a table-like abstraction for vectors and metadata.

.. code-block:: python

   from embenx import Collection
   import numpy as np

   # 1. Initialize with a specific backend
   # Options: 'faiss-hnsw', 'scann', 'usearch', 'pgvector', 'duckdb', etc.
   col = Collection(dimension=768, indexer_type="faiss-hnsw")

   # 2. Add data
   # Vectors can be numpy arrays or lists
   vectors = np.random.rand(100, 768).astype('float32')
   metadata = [{"id": i, "text": f"Document {i}", "tag": "test"} for i in range(100)]
   col.add(vectors, metadata)

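   # 3. Basic Search
   # Returns a list of (metadata, distance) tuples
   # Example query only; replace with a real 768-dim embedding
   query_vector = np.random.rand(768).astype('float32')
   results = col.search(query_vector, top_k=5)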

   # 4. Metadata Filtering
   # Supports exact match dictionary filters across any indexed field
   results = col.search(query_vector, top_k=5, where={"tag": "test"})

   # 5. Serialization
   # Saves to a portable Parquet file containing both vectors and metadata
   col.to_parquet("my_memory.parquet")
   
   # Load back
   new_col = Collection.from_parquet("my_memory.parquet")
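
To make the semantics of steps 3 and 4 concrete, here is a minimal NumPy-only sketch of an exact (brute-force) search combined with an exact-match dictionary filter. It does not call Embenx: ``brute_force_search`` is a hypothetical helper, and the L2 metric and the ``(metadata, distance)`` result shape are assumptions based on the comments above, not a description of the library's internals.

.. code-block:: python

   import numpy as np

   def brute_force_search(vectors, metadata, query, top_k=5, where=None):
       """Exact L2 search over rows whose metadata matches every key in `where`."""
       where = where or {}
       # Keep only rows whose metadata equals the filter on every requested key
       keep = [i for i, m in enumerate(metadata)
               if all(m.get(k) == v for k, v in where.items())]
       if not keep:
           return []
       keep = np.asarray(keep)
       # Euclidean distance from the query to each surviving row
       dists = np.linalg.norm(vectors[keep] - query, axis=1)
       order = np.argsort(dists)[:top_k]
       return [(metadata[keep[i]], float(dists[i])) for i in order]

   rng = np.random.default_rng(0)
   vectors = rng.random((100, 8), dtype=np.float32)
   metadata = [{"id": i, "tag": "even" if i % 2 == 0 else "odd"} for i in range(100)]
   query = vectors[42]  # query with a stored vector so the best hit is known

   hits = brute_force_search(vectors, metadata, query, top_k=3, where={"tag": "even"})

Because the query is itself a stored vector with the tag ``"even"``, the first hit is document 42 at distance 0. An approximate index such as HNSW trades some of this exactness for speed.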

Advanced Retrieval Features
---------------------------

Matryoshka Truncation
~~~~~~~~~~~~~~~~~~~~~

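If your embeddings come from a model trained with Matryoshka Representation Learning (MRL), you can keep only their leading dimensions for up to 10x faster retrieval, usually with only a small loss in accuracy. The actual speed-up and accuracy cost depend on the model, the indexer, and your data.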

.. code-block:: python

   # Define a collection that truncates 768-dim embeddings to 128
   col = Collection(dimension=768, truncate_dim=128)
   
   # Input vectors are still expected to be 768-dim; truncation happens internally
   col.add(full_vectors, metadata)
   results = col.search(full_query_vector)
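
As a rough illustration of what truncation means, the sketch below keeps the first 128 components of each 768-dim vector and rescales the result to unit length. It uses NumPy only; ``truncate_and_normalize`` is a hypothetical helper, and the re-normalization step is an assumption (it is common practice for cosine or inner-product search, but this guide does not say whether Embenx does it). The random vectors stand in for real MRL embeddings, so only the shapes are meaningful here.

.. code-block:: python

   import numpy as np

   def truncate_and_normalize(embeddings, dim):
       """Keep the first `dim` components and rescale each row to unit length."""
       truncated = np.asarray(embeddings, dtype=np.float32)[..., :dim]
       norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
       # Guard against division by zero for all-zero rows
       return truncated / np.maximum(norms, 1e-12)

   rng = np.random.default_rng(0)
   full_vectors = rng.standard_normal((100, 768)).astype(np.float32)
   small_vectors = truncate_and_normalize(full_vectors, 128)
   print(small_vectors.shape)  # (100, 128)

Truncation only preserves ranking quality when the model was trained so that the leading dimensions carry most of the information. Applying it to ordinary embeddings can degrade results noticeably.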

Hybrid Search (Dense + Sparse)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Combine semantic vector search with keyword-based BM25 retrieval using Reciprocal Rank Fusion (RRF).

.. code-block:: python

   # Initialize with a sparse indexer
   col = Collection(dimension=768, sparse_indexer_type="bm25")
   
   # Perform hybrid search
   results = col.hybrid_search(
       query_vector=q_vec,
       query_text="fox",
       dense_weight=0.5,
       sparse_weight=0.5
   )
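
The fusion step can be sketched in a few lines of plain Python. Each document receives ``weight / (k + rank)`` from every ranked list it appears in, and the fused order sorts by the summed score. The function name ``weighted_rrf``, the constant ``k=60`` (a commonly used default), and the way the two weights enter the formula are assumptions made for illustration; the exact formula Embenx uses is not specified in this guide.

.. code-block:: python

   def weighted_rrf(dense_ranking, sparse_ranking,
                    dense_weight=0.5, sparse_weight=0.5, k=60):
       """Fuse two ranked lists of document ids with weighted Reciprocal Rank Fusion."""
       scores = {}
       for weight, ranking in ((dense_weight, dense_ranking),
                               (sparse_weight, sparse_ranking)):
           for rank, doc_id in enumerate(ranking, start=1):
               # Documents near the top of a list contribute the most
               scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
       return sorted(scores, key=scores.get, reverse=True)

   dense_ranking = ["d3", "d1", "d2"]   # best-first ids from vector search
   sparse_ranking = ["d1", "d4", "d3"]  # best-first ids from BM25
   fused = weighted_rrf(dense_ranking, sparse_ranking)
   print(fused)  # ['d1', 'd3', 'd4', 'd2']

RRF works on ranks rather than raw scores, so it does not require the dense distances and BM25 scores to be on comparable scales.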

Reranking
~~~~~~~~~

Improve precision by re-scoring top candidates with a Cross-Encoder or FlashRank.

.. code-block:: python

   from embenx.rerank import RerankHandler
   
   # Use FlashRank (CPU-optimized)
   ranker = RerankHandler(model_name="ms-marco-TinyBERT-L-2-v2", model_type="flashrank")
   
   # Search with reranking hook
   results = col.search(query_vector, top_k=5, reranker=ranker, query_text="My original question")
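
Conceptually, reranking takes the candidates returned by the first-stage search and re-orders them with a more expensive relevance score computed from the query text and each document's text. The sketch below shows that two-stage pattern with a toy token-overlap scorer standing in for a Cross-Encoder. ``overlap_score`` and ``rerank`` are hypothetical helpers, not part of Embenx, and the candidate list is made-up sample data.

.. code-block:: python

   def overlap_score(query_text, doc_text):
       """Toy relevance score: fraction of query tokens that appear in the document."""
       query_tokens = set(query_text.lower().split())
       doc_tokens = set(doc_text.lower().split())
       return len(query_tokens & doc_tokens) / max(len(query_tokens), 1)

   def rerank(query_text, candidates, score_fn, top_k=5):
       """Re-score (metadata, distance) candidates and return the best `top_k`."""
       rescored = [(meta, score_fn(query_text, meta["text"])) for meta, _ in candidates]
       # Higher score means more relevant, so sort in descending order
       rescored.sort(key=lambda pair: pair[1], reverse=True)
       return rescored[:top_k]

   # First-stage results in (metadata, distance) form, closest first
   candidates = [
       ({"id": 0, "text": "Cats sleep most of the day"}, 0.21),
       ({"id": 1, "text": "The quick brown fox jumps"}, 0.25),
       ({"id": 2, "text": "A fox is a small wild canine"}, 0.30),
   ]
   reranked = rerank("quick fox", candidates, overlap_score, top_k=2)
   print([meta["id"] for meta, _ in reranked])  # [1, 2]

The closest vector (document 0) drops out because it shares no terms with the query. A real reranker makes the same kind of correction using a learned model instead of token overlap.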

Evaluation & Benchmarking
-------------------------

Embenx makes it easy to measure the performance of different indexers on your own data.

.. code-block:: python

   # Measure Recall@10 against an exact search baseline
   metrics = col.evaluate(indexer_type="faiss-hnsw", top_k=10)
   print(f"Recall: {metrics['recall']}, Latency: {metrics['latency_ms']}ms")

   # Benchmark multiple indexers side-by-side
   col.benchmark(indexers=["faiss", "usearch", "hnswlib"])
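
For reference, Recall@k against an exact baseline is the fraction of the true k nearest neighbours that the approximate index also returns, averaged over queries. The NumPy sketch below computes it by hand. ``exact_top_k`` and ``recall_at_k`` are hypothetical helpers, the L2 metric is an assumption, and the "approximate" results are simulated by discarding two true neighbours per query rather than produced by a real index.

.. code-block:: python

   import numpy as np

   def exact_top_k(vectors, queries, k):
       """Ground truth: indices of the k nearest rows by squared L2 distance."""
       # (n_queries, n_vectors) matrix of squared distances
       d2 = ((queries[:, None, :] - vectors[None, :, :]) ** 2).sum(axis=-1)
       return np.argsort(d2, axis=1)[:, :k]

   def recall_at_k(approx_ids, exact_ids):
       """Mean fraction of the true top-k neighbours found by the approximate search."""
       hits = [len(set(a) & set(e)) for a, e in zip(approx_ids, exact_ids)]
       return sum(hits) / (len(exact_ids) * exact_ids.shape[1])

   rng = np.random.default_rng(0)
   vectors = rng.random((200, 16), dtype=np.float32)
   queries = rng.random((5, 16), dtype=np.float32)

   exact_ids = exact_top_k(vectors, queries, k=10)
   # Stand-in for an approximate index: miss the last two true neighbours per query
   approx_ids = [list(row[:8]) + [-1, -1] for row in exact_ids]
   print(recall_at_k(approx_ids, exact_ids))  # 0.8

The brute-force baseline costs time proportional to the number of queries times the number of stored vectors, so on large collections it is usually computed on a sample of queries.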

Synthetic Data Generation
-------------------------

Embenx allows you to generate high-quality synthetic query-document pairs from your collections using LLMs. This is useful for creating fine-tuning datasets or evaluation benchmarks.

.. code-block:: python

   # 1. Generate queries using LiteLLM (v1.83.0+)
   # Supports GPT-4, Claude, Gemini, etc.
   results = col.generate_synthetic_queries(
       text_key="text",
       n_queries_per_doc=2,
       num_docs=100,
       model="gpt-4o-mini"
   )

   # 2. Use a local LLM (Ollama)
   # Requires running: ollama run llama3
   results = col.generate_synthetic_queries(
       model="ollama/llama3",
       api_base="http://localhost:11434",
       output_path="training_data.parquet"
   )

   # 3. Export to JSONL or CSV
   col.generate_synthetic_queries(
       n_queries_per_doc=1,
       output_path="eval_bench.jsonl"
   )
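
Once exported, a JSONL file can be read back with the standard library alone, one JSON object per line. The sketch below writes and re-reads a small file to show the round trip. The record fields (``query``, ``doc_id``, ``text``) are illustrative assumptions; inspect a file produced by ``generate_synthetic_queries`` to see the actual schema.

.. code-block:: python

   import json
   import os
   import tempfile

   # Hypothetical records; the field names are assumptions, not the Embenx schema
   pairs = [
       {"query": "What does document 0 describe?", "doc_id": 0, "text": "Document 0"},
       {"query": "Which document mentions 1?", "doc_id": 1, "text": "Document 1"},
   ]

   path = os.path.join(tempfile.mkdtemp(), "eval_bench.jsonl")

   # Write one JSON object per line
   with open(path, "w", encoding="utf-8") as f:
       for pair in pairs:
           f.write(json.dumps(pair) + "\n")

   # Read the file back, skipping any blank lines
   with open(path, encoding="utf-8") as f:
       loaded = [json.loads(line) for line in f if line.strip()]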