Local Engine
Local Neuro-Symbolic Engine
You can use a locally hosted instance for the Neuro-Symbolic Engine. We build on top of two backends:

- huggingface/transformers, served through a custom FastAPI server
- llama.cpp, via either its Python bindings or the native C++ server
For instance, let's suppose you want to set up the Neuro-Symbolic Engine with the gpt-oss-120b model. Download the GGUF shards you need (e.g. the Q4_1 variant).
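For example, assuming the quantized shards are hosted in a GGUF repository on the Hugging Face Hub (the repository name below is only illustrative), they can be fetched with the Hugging Face CLI:

```bash
# Fetch the Q4_1 GGUF shards. The repository name is illustrative --
# replace it with the repository that actually hosts the quantization you want.
huggingface-cli download ggml-org/gpt-oss-120b-GGUF \
    --include "*Q4_1*.gguf" \
    --local-dir ./models/gpt-oss-120b
```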
With symai, first set the NEUROSYMBOLIC_ENGINE_MODEL to llamacpp:
```json
{
    "NEUROSYMBOLIC_ENGINE_API_KEY": "",
    "NEUROSYMBOLIC_ENGINE_MODEL": "llamacpp",
    ...
}
```

You can then run the server in two ways:
Using Python bindings:
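A minimal sketch using the llama-cpp-python server module; the model path is illustrative, and if your symai version wraps this through symserver, the equivalent flags can be checked with the help command shown further below:

```bash
# Serve the downloaded GGUF model over HTTP with the llama-cpp-python bindings.
# Replace --model with the path to the first shard of your downloaded GGUF files.
python -m llama_cpp.server \
    --model ./models/gpt-oss-120b/gpt-oss-120b-Q4_1.gguf \
    --n_gpu_layers -1 \
    --port 8000
```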
Using C++ server directly:
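Equivalently, with llama.cpp's native server binary (again, the model path is illustrative):

```bash
# Serve the model with llama.cpp's built-in HTTP server.
# -ngl offloads layers to the GPU; adjust it to your hardware.
llama-server \
    -m ./models/gpt-oss-120b/gpt-oss-120b-Q4_1.gguf \
    -ngl 99 \
    --port 8080
```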
To see all available options, run:
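The command below assumes symserver exposes the standard help flag:

```bash
symserver --help
```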
The Neuro-Symbolic Engine now supports tool execution and structured JSON responses out of the box. For concrete examples, review the tests in tests/engines/neurosymbolic/test_nesy_engine.py::test_tool_usage and tests/contract/test_contract.py.
HuggingFace backend
Let's suppose we want to use dolphin-2.9.3-mistral-7B-32k from HuggingFace. First, download the model with the HuggingFace CLI:
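Assuming the model is published under the cognitivecomputations organization on the Hub (verify the exact repository id before downloading):

```bash
# Download the full model weights from the Hugging Face Hub.
# The repository id is an assumption -- confirm it on the Hub.
huggingface-cli download cognitivecomputations/dolphin-2.9.3-mistral-7B-32k \
    --local-dir ./models/dolphin-2.9.3-mistral-7B-32k
```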
For the HuggingFace server, you have to set the NEUROSYMBOLIC_ENGINE_MODEL to huggingface:
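Mirroring the llama.cpp configuration above:

```json
{
    "NEUROSYMBOLIC_ENGINE_API_KEY": "",
    "NEUROSYMBOLIC_ENGINE_MODEL": "huggingface",
    ...
}
```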
Then, run symserver with the following options:
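The flag names below are assumptions rather than documented options; treat the invocation as a sketch and verify the exact flags against the help output shown next:

```bash
# Hypothetical invocation -- flag names are assumptions; verify with `symserver --help`.
symserver --model ./models/dolphin-2.9.3-mistral-7B-32k --device cuda
```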
To see all the available options we support for HuggingFace, run:
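Again assuming the standard help flag, whose output should include the HuggingFace-specific options:

```bash
symserver --help
```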
Now you are set to use the local engine.
Local Embedding Engine
You can also use local embedding models through the llama.cpp backend. First, set the EMBEDDING_ENGINE_MODEL to llamacpp:
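In your symai configuration this mirrors the Neuro-Symbolic Engine setting above (only the model key is shown; keep your other entries as they are):

```json
{
    "EMBEDDING_ENGINE_MODEL": "llamacpp",
    ...
}
```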
For instance, to use the Nomic embed text model, first download it:
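Assuming a GGUF build of the Nomic embedding model on the Hub (repository id and file pattern below are assumptions; check the Hub for the exact names):

```bash
# Download a GGUF build of the Nomic embedding model.
huggingface-cli download nomic-ai/nomic-embed-text-v1.5-GGUF \
    --include "*f16.gguf" \
    --local-dir ./models/nomic-embed-text
```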
Then start the server with embedding-specific parameters using either:
Python bindings:
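A sketch with the llama-cpp-python server, where the embedding flag switches the server into embedding mode (the model path is illustrative and the boolean flag syntax may vary slightly between versions):

```bash
# Serve the embedding model with the Python bindings; --embedding enables the
# embeddings endpoint. Replace the path with the GGUF file you downloaded.
python -m llama_cpp.server \
    --model ./models/nomic-embed-text/nomic-embed-text-v1.5.f16.gguf \
    --embedding true \
    --port 8001
```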
C++ server:
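Or with llama.cpp's native server (model path illustrative):

```bash
# Serve embeddings with llama.cpp's built-in server; --embedding enables the
# embeddings endpoint (also spelled --embeddings in newer builds).
llama-server \
    -m ./models/nomic-embed-text/nomic-embed-text-v1.5.f16.gguf \
    --embedding \
    --port 8081
```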
The server supports batch processing for embeddings. Here's how to use it with symai:
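A minimal sketch of the symai side, assuming the Symbol.embed() interface and that a list-valued Symbol is embedded as one batched request:

```python
from symai import Symbol

# Single text -> one embedding vector, served by the local llama.cpp backend.
single = Symbol("The quick brown fox").embed()

# Batch: assumes a list-valued Symbol is sent as one batched request,
# returning one embedding vector per input text.
batch = Symbol([
    "The quick brown fox",
    "jumps over the lazy dog",
]).embed()
```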