Indexing Engine
SymbolicAI supports multiple indexing engines for vector search and RAG (Retrieval-Augmented Generation) operations. This document covers both the default naive vector engine and the production-ready Qdrant engine.
Naive Vector Engine (Default)
By default, text indexing and retrieval are performed with the local naive vector engine using the Interface abstraction:

```python
from symai.interfaces import Interface

db = Interface('naive_vectordb', index_name="my_index")
db("Hello world", operation="add")
result = db("Hello", operation="search", top_k=1)
print(result.value)  # most relevant match
```

You can also add or search multiple documents at once, and perform save/load/purge operations:
```python
docs = ["Alpha document", "Beta entry", "Gamma text"]
db = Interface('naive_vectordb', index_name="my_index")
db(docs, operation="add")
db("save", operation="config")
# Load or purge as needed
```

Qdrant RAG Engine
The Qdrant engine provides a production-ready vector database for scalable RAG applications. It supports both local and cloud deployments, advanced document chunking, and comprehensive collection management.
Setup
Option 1: Local Qdrant Server (via symserver)
Start Qdrant using the symserver CLI (Docker by default).
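The exact `symserver` invocation is not reproduced here. If you prefer to start Qdrant manually, the standard Docker command from the Qdrant documentation is:

```shell
# Run the official Qdrant image, exposing the default REST (6333) and gRPC (6334) ports
# and persisting data to a local directory
docker run -p 6333:6333 -p 6334:6334 \
    -v "$(pwd)/qdrant_storage:/qdrant/storage" \
    qdrant/qdrant
```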
Option 2: Cloud Qdrant
Configure your cloud Qdrant instance:
Basic Usage
The Qdrant engine is used directly via the `QdrantIndexEngine` class:
Local Search with citations
If you need citation-formatted results compatible with `parallel.search`, use the `local_search` interface. It embeds the query locally, queries Qdrant, and returns a `SearchResult` (with `value` and `citations`) instead of raw `ScoredPoint` objects.
Local search accepts the same args as passed to Qdrant directly: `collection_name`/`index_name`, `limit`/`top_k`/`index_top_k`, `score_threshold`, `query_filter` (dict or Qdrant `Filter`), and any extra Qdrant search kwargs. Citation fields are derived from Qdrant payloads:

- The excerpt uses `payload["text"]` (or `content`).
- The URL is resolved from `payload["source"]`/`url`/`file_path`/`path` and is always returned as an absolute `file://` URI (relative inputs resolve against the current working directory).
- The title is the stem of that path (PDF pages append `#p{page}` when provided).

Each matching chunk yields its own citation; multiple citations can point to the same file. If you want a stable source header for each chunk, store a `source_id` or `chunk_id` in the payload (otherwise the Qdrant point id is used).
Example:
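The original example is not reproduced here; the following is a hypothetical sketch, assuming `local_search` follows the same `Interface` calling convention as `naive_vectordb` above and that the argument names mirror those listed earlier:

```python
from symai.interfaces import Interface

# Hypothetical usage sketch -- exact parameter names may differ
search = Interface('local_search')
result = search("What is alpha?", index_name="my_index", top_k=3)

print(result.value)  # citation-formatted answer text
for citation in result.citations:
    # One citation per matching chunk; URLs are absolute file:// URIs
    print(citation)
```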
Collection Management
Create and manage collections programmatically:
Document Chunking and RAG
The Qdrant engine includes built-in document chunking for RAG workflows:
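As a plain-Python illustration of the general idea of fixed-size chunking with overlap (not the engine's actual implementation, which uses the `chonkie` library):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for RAG indexing."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
    return chunks

doc = ("word " * 100).strip()  # 499-character toy document
pieces = chunk_text(doc, chunk_size=200, overlap=50)
print(len(pieces), len(pieces[0]))  # 3 200
```

Overlap preserves context across chunk boundaries so that a sentence split between two chunks remains retrievable from either.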
Point Operations
For fine-grained control over individual vectors:
Configuration Options
The Qdrant engine supports extensive configuration:
Environment Variables
Configure Qdrant via environment variables:
Embedding Model & API Key Behavior
If `EMBEDDING_ENGINE_API_KEY` is empty (`""`, the default), SymbolicAI will use a local, lightweight embedding engine based on SentenceTransformers. You can specify any supported model name via `EMBEDDING_ENGINE_MODEL` (e.g. `"all-mpnet-base-v2"`).

If you DO provide an `EMBEDDING_ENGINE_API_KEY`, then the respective remote embedding engine will be used (e.g. OpenAI). The model is selected according to the `EMBEDDING_ENGINE_MODEL` key where applicable.
This allows you to easily experiment locally for free, and switch to more powerful cloud backends when ready.
Installation
Install Qdrant support using the package extra (recommended):
This installs all required dependencies:
- `qdrant-client` - Qdrant Python client
- `chonkie[all]` - Document chunking library
- `tokenizers` - Tokenization support
Alternatively, install dependencies individually:
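Based on the dependency list above, the individual installation would be:

```shell
# Install the three Qdrant-related dependencies directly
pip install qdrant-client "chonkie[all]" tokenizers
```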
See Also
- See `tests/engines/index/test_qdrant_engine.py` for comprehensive usage examples
- Qdrant documentation: https://qdrant.tech/documentation/