File Engine

The file engine reads documents and converts them to text. Plain text files (.txt, .md, .py, .json, etc.) are read directly via native Python I/O. Rich formats (PDF, DOCX, PPTX, XLSX, HTML, EPUB, etc.) are converted to Markdown via markitdownarrow-up-right.

Reading a Single File

Use Symbol.open() to read any supported file into a Symbol:

from symai import Symbol

# Plain text files use the standard backend (default)
text = Symbol('./README.md').open()
print(text.value)  # file contents as a string

# Or pass the path as an argument
text = Symbol().open('./README.md')

Backends

The engine has three backends:

  • auto (default) -- picks the best backend per file type: standard for plain text, structured data, and images; markitdown for rich formats (PDF, DOCX, etc.); tries both for unknown extensions

  • standard -- reads plain text and structured data via native Python I/O, and images via cv2. Raises an error for rich formats

  • markitdown -- converts any supported format to Markdown via markitdown

With the default auto backend, everything just works:

Structured Data Parsing

The standard backend automatically parses structured formats into native Python objects instead of returning raw strings:

Images

The standard backend reads images as RGB numpy arrays via cv2:

Structured Parsing with as_box

By default, structured formats return plain Python objects (dict, list). Pass as_box=True to get a python-boxarrow-up-right object with dot-access instead:

Supported extensions for as_box: .json, .yaml, .yml, .toml, .csv, .tsv.

Batch Reading with FileReader

FileReader is an Expression component for reading multiple files at once. It returns a Symbol whose value is a list of strings (one per file):

Parallel Reading

For large batches, pass workers=N to read files across multiple processes:

Each worker process initializes its own markitdown converter, so there is no shared state to worry about.

Discovering Files

FileReader.get_files() recursively lists supported files in a directory:

Supported Formats

Category
Extensions
Default return type

Plain text (built-in)

.txt, .md, .py, .log, .xml

str

Structured data (built-in)

.json, .yaml, .yml, .toml

dict

Tabular data (built-in)

.csv, .tsv

list[dict]

Images (built-in)

.jpg, .jpeg, .png

numpy.ndarray (RGB)

Rich formats (markitdown)

.pdf, .docx, .pptx, .xlsx, .xls, .html, .htm, .epub, .ipynb, .zip

str (markdown)

Audio (markitdown)

.mp3, .wav, .m4a, .mp4

str (markdown)

LLM-Powered Features

Image files and PowerPoint slides can include LLM-generated descriptions. This routes through SymAI's neurosymbolic engine using your configured NEUROSYMBOLIC_ENGINE_MODEL and API key -- any vision- capable backend works (OpenAI GPT-4o, Anthropic Claude, Google Gemini, etc.). If the configured engine doesn't support vision, these converters still extract metadata (EXIF) and text content without LLM descriptions.

Image Captioning

Opening an image with backend='markitdown' sends it through the configured vision model and returns a Markdown description:

Pass caption_prompt to override the default vision prompt with a custom one:

This works for any file type routed through markitdown (images, PDFs, PPTX, URLs, etc.).

This works with .jpg, .jpeg, and .png files. The vision adapter (_SymaiVisionClient) bridges markitdown's OpenAI-style API with SymAI's engine pipeline, so captioning works regardless of which provider is configured (OpenAI, Anthropic, Google, etc.).

For batch captioning across many images:

Last updated