Indexing Engine
SymbolicAI supports multiple indexing engines for vector search and RAG (Retrieval-Augmented Generation) operations. This document covers both the default naive vector engine and the production-ready Qdrant engine.
Naive Vector Engine (Default)
By default, text indexing and retrieval are performed with the local naive vector engine using the Interface abstraction:
```python
from symai.interfaces import Interface

db = Interface('naive_vectordb', index_name="my_index")
db("Hello world", operation="add")
result = db("Hello", operation="search", top_k=1)
print(result.value)  # most relevant match
```

You can also add or search multiple documents at once, and perform save/load/purge operations:
```python
docs = ["Alpha document", "Beta entry", "Gamma text"]
db = Interface('naive_vectordb', index_name="my_index")
db(docs, operation="add")
db("save", operation="config")
# Load or purge as needed
```

Qdrant RAG Engine
The Qdrant engine provides a production-ready vector database for scalable RAG applications. It supports both local and cloud deployments, advanced document chunking, and comprehensive collection management.
Setup
Option 1: Local Qdrant Server (via symserver)
Start Qdrant using the symserver CLI (Docker by default).
Performance & security flags
symserver forwards these directly to Qdrant as QDRANT__* environment variables:
| Flag | Qdrant env var | Description |
| --- | --- | --- |
| `--max-workers N` | `QDRANT__SERVICE__MAX_WORKERS` | Parallel HTTP request workers (0 = auto) |
| `--max-search-threads N` | `QDRANT__STORAGE__PERFORMANCE__MAX_SEARCH_THREADS` | Search threads per request (0 = auto) |
| `--api-key KEY` | `QDRANT__SERVICE__API_KEY` | Require bearer key on every request |
| `--read-only-api-key KEY` | `QDRANT__SERVICE__READ_ONLY_API_KEY` | Read-only bearer key |
| `--log-level LEVEL` | `QDRANT__LOG_LEVEL` | TRACE / DEBUG / INFO / WARN / ERROR |
| `--disable-telemetry` | `QDRANT__TELEMETRY_DISABLED` | Opt out of usage telemetry |
| `--snapshots-path PATH` | `QDRANT__STORAGE__SNAPSHOTS_PATH` | Directory for Qdrant snapshots |
| `--enable-tls` | `QDRANT__SERVICE__ENABLE_TLS` | Enable HTTPS on REST and gRPC |
| `--tls-cert PATH` | `QDRANT__TLS__CERT` | TLS certificate file (Docker: directory is auto-mounted) |
| `--tls-key PATH` | `QDRANT__TLS__KEY` | TLS private key file |
| `--tls-ca-cert PATH` | `QDRANT__TLS__CA_CERT` | TLS CA certificate file |
| `--set KEY=VALUE` | `QDRANT__KEY=VALUE` | Generic passthrough for any `QDRANT__*` var |
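For illustration, a start command might combine these flags as follows. The exact subcommand/entrypoint is an assumption here (check `symserver --help`); the flags are the documented ones from the table above:

```shell
# The "qdrant" subcommand name is an assumption; flags are documented above.
symserver qdrant \
  --api-key my-secret-key \
  --log-level INFO \
  --disable-telemetry \
  --snapshots-path ./qdrant-snapshots \
  --set SERVICE__GRPC_PORT=6334  # passthrough -> QDRANT__SERVICE__GRPC_PORT=6334
```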
Preferred: Local Qdrant + RAG API layer (via symserver)
SymbolicAI can also run a Qdrant RAG API (FastAPI served by uvicorn) alongside Qdrant. This is the preferred setup when you want:
- remote clients (curl / JS / other services) to use your index over HTTP
- a stable endpoint surface (`/search`, `/chunk-upsert`, `/collections`, etc.)
- performance scaling via multiple uvicorn workers (set `--rag-workers` / `RAG_API_WORKERS`)
Prerequisites (install the needed extras):
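The original command is not shown on this page; a plausible install, assuming a `qdrant` extra plus the FastAPI/uvicorn server dependencies, would be:

```shell
# Extra name is an assumption; check the project's pyproject for the exact extra.
pip install "symbolicai[qdrant]" fastapi uvicorn
```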
Start Qdrant + the RAG API in one command:
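A minimal sketch, assuming the same hypothetical entrypoint as above; the `--rag-*` flags are the documented ones:

```shell
# Hypothetical entrypoint name; --rag-* flags are documented below.
symserver qdrant --rag-port 8080 --rag-workers 2
```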
Common flags (RAG API):
- `--rag-host 0.0.0.0` (default: `RAG_API_HOST` or `0.0.0.0`)
- `--rag-port 8080` (default: `RAG_API_PORT` or `8080`)
- `--rag-workers 4` (default: `RAG_API_WORKERS` or `1`)
- `--rag-token secret` (default: `RAG_API_TOKEN` or empty = no auth)
- `--rag-reload` (dev only; or set `UVICORN_RELOAD=1`)
Example with explicit ports + token:
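A sketch using the documented `--rag-*` options (the entrypoint name is an assumption):

```shell
# Hypothetical entrypoint name; flags are the documented --rag-* options.
symserver qdrant \
  --rag-host 0.0.0.0 \
  --rag-port 8080 \
  --rag-workers 4 \
  --rag-token secret
```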
Calling the RAG API (HTTP)
Once running, you can interact with Qdrant over HTTP through the RAG API. If `--rag-token` / `RAG_API_TOKEN` is set, send it as a bearer token:
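A minimal interaction sketch. `/healthz`, `/search`, and the bearer-token header follow the description here; the JSON field names in the `/search` request body are assumptions about the schema:

```shell
# Health check (no body required)
curl -H "Authorization: Bearer secret" http://localhost:8080/healthz

# Search; the JSON field names are assumptions about the request schema
curl -X POST http://localhost:8080/search \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "top_k": 3}'
```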
The RAG API is implemented by `symai.server.qdrant_rag_api` and mirrors the extensity-rag endpoint surface: `/healthz`, `/search`, `/chunk-upsert`, `/points` (upsert/delete), `/retrieve`, and `/collections` (list/create/inspect/delete).
Option 2: Cloud Qdrant
Configure your cloud Qdrant instance:
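Typically this means pointing the engine at your cluster URL and API key. The parameter names below mirror `qdrant-client`'s `QdrantClient(url=..., api_key=...)` and are an assumption for the engine wrapper, as is the import path:

```python
# Hypothetical sketch; import path and constructor kwargs are assumptions.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(
    url="https://your-cluster-id.cloud.qdrant.io:6333",
    api_key="YOUR_QDRANT_CLOUD_API_KEY",
    collection_name="docs",
)
```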
Basic Usage
The Qdrant engine is used directly via the `QdrantIndexEngine` class:
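A minimal usage sketch. The import path, constructor, and `add` method are assumptions; `search` returning raw `ScoredPoint` objects is described later in this document:

```python
# Sketch only: module path and method signatures are assumptions.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(url="http://localhost:6333", collection_name="docs")
engine.add(["Qdrant is a vector database.",
            "RAG combines retrieval with generation."])
hits = engine.search("vector database", top_k=1)  # raw ScoredPoint objects
for hit in hits:
    print(hit.score, hit.payload.get("text"))
```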
Local Search with citations
If you need citation-formatted results compatible with `parallel.search`, use the `local_search` interface. It embeds the query locally, queries Qdrant, and returns a `SearchResult` (with `value` and `citations`) instead of raw `ScoredPoint` objects:
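A sketch of the `local_search` interface, assuming the same `Interface` pattern used for `naive_vectordb` above; the argument names come from the documented argument list:

```python
from symai.interfaces import Interface

# Sketch: the call shape is inferred from the documented arguments.
search = Interface('local_search')
result = search("What is retrieval-augmented generation?",
                collection_name="docs", top_k=3, score_threshold=0.2)
print(result.value)        # formatted excerpts
for citation in result.citations:
    print(citation)        # file:// URIs with optional #L / #page fragments
```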
Local search accepts the same arguments as those passed to Qdrant directly: `collection_name`/`index_name`, `limit`/`top_k`/`index_top_k`, `score_threshold`, `query_filter` (dict or Qdrant `Filter`), and any extra Qdrant search kwargs. For convenience, `metadata` and `where` are accepted as aliases for `query_filter` when you want to filter by payload metadata.
For convenience, SymbolicAI also supports a dict-based filter shorthand for both `local_search` and `QdrantIndexEngine.search`:
- scalar values (e.g. `"category": "AI"`) are treated as equality matches
- list/tuple/set values (e.g. `"tags": ["rag", "paper"]`) are treated as any-of matches, which is useful for tag membership
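This shorthand maps onto Qdrant's JSON filter format (`match.value` for equality, `match.any` for any-of). A self-contained sketch of that mapping — an illustration of the semantics, not SymbolicAI's actual implementation:

```python
def to_qdrant_filter(shorthand: dict) -> dict:
    """Convert the dict shorthand into a Qdrant-style filter payload."""
    must = []
    for key, value in shorthand.items():
        if isinstance(value, (list, tuple, set)):
            # any-of match: a point matches if its payload value is any of these
            must.append({"key": key, "match": {"any": list(value)}})
        else:
            # scalar: exact equality match
            must.append({"key": key, "match": {"value": value}})
    return {"must": must}

print(to_qdrant_filter({"category": "AI", "tags": ["rag", "paper"]}))
```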
Citation fields are derived from Qdrant payloads:

- the excerpt uses `payload["text"]` (or `content`)
- the URL is resolved from `payload["source"]` / `url` / `file_path` / `path` and is always returned as an absolute `file://` URI (relative inputs resolve against the current working directory)
- the title is the stem of that path (PDF titles append `#p{page}` when a `page` field is present in the payload)

Each matching chunk yields its own citation; multiple citations can point to the same file.
If you want a stable source header for each chunk, store a `source_id` or `chunk_id` in the payload (otherwise the Qdrant point id is used).
Chunk provenance: `chunk_and_upsert()` stores optional chunk-location metadata in each chunk payload:

- `payload["chunk_start_line"]` / `payload["chunk_end_line"]` (plus optional char offsets)
- `payload["chunk_start_page"]` / `payload["chunk_end_page"]` when page breaks can be detected in the extracted text (PDFs only)
For `file://` citations, the engine appends a fragment when provenance exists:

- non-PDFs: `#L10` or `#L10-L42`
- PDFs: `#page=N` (preferred over line fragments)
Note: PDF page provenance is only available when the PDF extraction output preserves page boundaries (commonly as form-feed `\f`).
Example:
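A self-contained illustration of the fragment rules described above — this mimics the documented behavior, it is not the engine's actual code:

```python
from pathlib import Path

def citation_uri(path: str, start_line=None, end_line=None, page=None) -> str:
    """Build a file:// citation URI with a provenance fragment."""
    uri = Path(path).resolve().as_uri()       # always an absolute file:// URI
    if path.lower().endswith(".pdf"):
        # PDFs: page fragment preferred over line fragments
        return f"{uri}#page={page}" if page is not None else uri
    if start_line is not None:
        if end_line is None or end_line == start_line:
            return f"{uri}#L{start_line}"           # e.g. #L10
        return f"{uri}#L{start_line}-L{end_line}"   # e.g. #L10-L42
    return uri

print(citation_uri("notes/guide.md", start_line=10, end_line=42))
print(citation_uri("papers/rag.pdf", page=3))
```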
Collection Management
Create and manage collections programmatically:
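A sketch under the assumption that the engine exposes collection helpers mirroring the RAG API's `/collections` surface (list/create/inspect/delete); the import path and every method name here are assumptions:

```python
# Hypothetical sketch; import path and method names are assumptions.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(url="http://localhost:6333")
engine.create_collection("docs", vector_size=768, distance="Cosine")
print(engine.list_collections())
info = engine.inspect_collection("docs")  # vector params, point count, etc.
engine.delete_collection("docs")
```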
Document Chunking and RAG
The Qdrant engine includes built-in document chunking for RAG workflows.
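A sketch of `chunk_and_upsert` (the method name and the provenance payload fields are documented on this page; the import path and constructor are assumptions):

```python
# Sketch; constructor is an assumption, chunk_and_upsert is documented.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(url="http://localhost:6333", collection_name="docs")
# Chunks the document, embeds all chunks in one batched call, and upserts them
# with provenance metadata (source, chunk_start_line/..., chunk_start_page/...).
engine.chunk_and_upsert("docs/architecture.pdf")
```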
Performance: `chunk_and_upsert` embeds all chunks in a single batched API call, regardless of how many chunks the document produces. This yields a 10–100× speedup over per-chunk embedding: benchmark results show ~8× end-to-end on a 16-chunk document and ~28–116× on pure embedding throughput (8 texts), depending on network conditions.
Point Operations
For fine-grained control over individual vectors:
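The RAG API exposes `/points` (upsert/delete) and `/retrieve`; a sketch of corresponding engine calls, with import path and method names assumed to mirror those endpoints (and `qdrant-client`'s `upsert`/`retrieve`/`delete`):

```python
# Hypothetical sketch; method names mirror the /points and /retrieve endpoints.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(url="http://localhost:6333", collection_name="docs")
engine.upsert(points=[{"id": 1, "vector": [0.1] * 768,
                       "payload": {"text": "Hello", "source": "hello.txt"}}])
points = engine.retrieve(ids=[1])   # fetch points by id
engine.delete(ids=[1])              # remove points by id
```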
Existence / Counting (Documents, Tags)
Qdrant stores chunks as points; document-level operations are implemented by storing a stable identifier in each chunk payload (by default, `chunk_and_upsert()` uses `payload["source"]`).
Notes:
- `engine.count(...)` counts points (chunks). It's fast and uses Qdrant's native count API.
- `engine.count_documents_for_tag(...)` counts unique documents (unique `payload["source"]` values) and may scan points (via `scroll()`), so it can be slower on very large collections.
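For example (the method names appear above; the constructor and exact signatures are assumptions):

```python
# Sketch; import path, constructor, and signatures are assumptions.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(url="http://localhost:6333", collection_name="docs")
n_chunks = engine.count()                       # fast: native Qdrant count API
n_docs = engine.count_documents_for_tag("rag")  # unique payload["source"] values
```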
Configuration Options
The Qdrant engine supports extensive configuration:
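As an illustration only, a constructor sketch; every keyword below is an assumption about the kind of options typically exposed (connection, collection, and retrieval settings):

```python
# All kwargs are assumptions, shown to illustrate the configuration surface.
from symai.backend.engines.index.engine_qdrant import QdrantIndexEngine

engine = QdrantIndexEngine(
    url="http://localhost:6333",  # local server or cloud cluster URL
    api_key=None,                 # bearer key if the server requires one
    collection_name="docs",       # default collection for operations
    top_k=5,                      # default number of results
    score_threshold=0.0,          # minimum similarity score
)
```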
Environment Variables
Configure Qdrant via environment variables:
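The `QDRANT__*` passthrough form is documented for `symserver` above; the client-side variable names below are assumptions:

```shell
# Server-side (documented symserver passthrough form):
export QDRANT__SERVICE__API_KEY=my-secret-key
export QDRANT__TELEMETRY_DISABLED=true

# Client-side connection settings (variable names are assumptions):
export QDRANT_URL=http://localhost:6333
export QDRANT_API_KEY=my-secret-key
```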
Embedding Model & API Key Behavior
- If `EMBEDDING_ENGINE_API_KEY` is empty (`""`, the default), SymbolicAI will use a local, lightweight embedding engine based on SentenceTransformers. You can specify any supported model name via `EMBEDDING_ENGINE_MODEL` (e.g. `"all-mpnet-base-v2"`).
- If you do provide an `EMBEDDING_ENGINE_API_KEY`, the respective remote embedding engine will be used (e.g. OpenAI). The model is selected according to the `EMBEDDING_ENGINE_MODEL` key where applicable.
This allows you to easily experiment locally for free, and switch to more powerful cloud backends when ready.
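The two keys above can be combined as follows (shown here as environment variables, assuming they can be set via the environment; the specific model names are illustrative):

```shell
# Local, free embeddings (default): leave the API key empty
export EMBEDDING_ENGINE_API_KEY=""
export EMBEDDING_ENGINE_MODEL="all-mpnet-base-v2"

# Remote embeddings (e.g. OpenAI): provide a key
export EMBEDDING_ENGINE_API_KEY="sk-..."
export EMBEDDING_ENGINE_MODEL="text-embedding-3-small"
```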
Installation
Install Qdrant support using the package extra (recommended):
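The exact extra name is not shown on this page; a plausible command, assuming a `qdrant` extra, is:

```shell
# Extra name is an assumption; check the project's pyproject for the exact extra.
pip install "symbolicai[qdrant]"
```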
This installs all required dependencies:
- `qdrant-client` - Qdrant Python client
- `chonkie[all]` - Document chunking library
- `tokenizers` - Tokenization support
Alternatively, install dependencies individually:
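Based on the dependency list above:

```shell
pip install qdrant-client "chonkie[all]" tokenizers
```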
See Also
- See `tests/engines/index/test_qdrant_engine.py` for comprehensive usage examples
- Qdrant documentation: https://qdrant.tech/documentation/