# Local Engine

## Local Neuro-Symbolic Engine
You can use a locally hosted instance for the Neuro-Symbolic Engine. We build on top of:

- huggingface/transformers through a custom FastAPI server
- llama.cpp

### llama.cpp backend
For instance, let's suppose you want to set up the Neuro-Symbolic Engine with the gpt-oss-120b model. Download the GGUF shards you need (e.g. the Q4_1 variant).
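If you prefer scripting the download, here is a minimal sketch using `huggingface_hub`. The repo id below is an assumption for illustration; point it at whichever GGUF repository actually hosts the shards you want:

```python
from huggingface_hub import snapshot_download

# Hypothetical repo id -- substitute the GGUF repository you actually use.
# The pattern keeps only the Q4_1 shards mentioned above.
snapshot_download(
    repo_id="ggml-org/gpt-oss-120b-GGUF",
    allow_patterns=["*Q4_1*"],
    local_dir="./gpt-oss-120b",
)
```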
With symai, first set the `NEUROSYMBOLIC_ENGINE_MODEL` to `llamacpp`:
```json
{
    "NEUROSYMBOLIC_ENGINE_API_KEY": "",
    "NEUROSYMBOLIC_ENGINE_MODEL": "llamacpp",
    ...
}
```
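If you would rather not edit the file by hand, a minimal sketch that patches the configuration programmatically, assuming the default location `~/.symai/symai.config.json` (adjust the path if your installation differs):

```python
import json
from pathlib import Path

# Assumption: symai keeps its configuration in ~/.symai/symai.config.json.
config_path = Path.home() / ".symai" / "symai.config.json"
config = json.loads(config_path.read_text())
config["NEUROSYMBOLIC_ENGINE_API_KEY"] = ""
config["NEUROSYMBOLIC_ENGINE_MODEL"] = "llamacpp"
config_path.write_text(json.dumps(config, indent=4))
```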
You can then run the server in two ways:

Using Python bindings:
```bash
symserver --env python --model ./llama-pro-8b-instruct.Q4_K_M.gguf --n_gpu_layers -1 --chat_format llama-3 --port 8000 --host localhost
```

Using the C++ server directly:
```bash
symserver --env cpp --cpp-server-path /path/to/llama.cpp/llama-server -ngl -1 -m gpt-oss-120b/Q4_1/gpt-oss-120b-Q4_1-00001-of-00002.gguf -fa 'on' -b 8092 -ub 1024 --port 8000 --host localhost -c 0 -n 4096 -t 14 --jinja
```

To see all available options, run:
```bash
symserver --env python --help # for Python bindings
symserver --env cpp --cpp-server-path /path/to/llama.cpp/llama-server --help # for the C++ server
```

The Neuro-Symbolic Engine now supports tool execution and structured JSON responses out of the box. For concrete examples, review the tests in `tests/engines/neurosymbolic/test_nesy_engine.py::test_tool_usage` and `tests/contract/test_contract.py`.
### HuggingFace backend
Let's suppose we want to use dolphin-2.9.3-mistral-7B-32k from HuggingFace. First, download the model with the HuggingFace CLI:
```bash
huggingface-cli download cognitivecomputations/dolphin-2.9.3-mistral-7B-32k --local-dir ./dolphin-2.9.3-mistral-7B-32k
```

For the HuggingFace server, you have to set the `NEUROSYMBOLIC_ENGINE_MODEL` to `huggingface`:
```json
{
    "NEUROSYMBOLIC_ENGINE_API_KEY": "",
    "NEUROSYMBOLIC_ENGINE_MODEL": "huggingface",
    ...
}
```

Then, run `symserver` with the following options:
```bash
symserver --model ./dolphin-2.9.3-mistral-7B-32k --attn_implementation flash_attention_2
```

To see all the available options we support for HuggingFace, run:
```bash
symserver --help
```

Now you are set to use the local engine.
```python
from symai import Symbol

# do some symbolic computation with the local engine
sym = Symbol('Kitties are cute!').compose()
print(sym)
# :Output:
# Kittens are known for their adorable nature and fluffy appearance, making them a favorite addition to many homes across the world. They possess a strong bond with
# their owners, providing companionship and comfort that can ease stress and anxiety. With their playful personalities, they are often seen as a symbol of happiness
# and joy, and their unique characteristics such as purring, kneading, and head butts bring warmth to our hearts. Cats also have a natural instinct to groom, which
# helps them maintain their clean and soft fur. Not only do they bring comfort and love to their owners, but they also have some practical benefits, such as reducing
# allergens, deterring pests, and even reducing stress in their surroundings. Overall, it is no surprise that pets have a long history of providing both emotional
# and physical comfort and happiness to their owners, making them a much-loved member of families around the world.
```
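Other Symbol operations are routed through the same local server. A minimal sketch using the standard Symbol API:

```python
from symai import Symbol

# The query is answered by the locally served model configured above.
sym = Symbol('Kitties are cute!')
print(sym.query('Name three reasons people like cats.'))
```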
## Local Embedding Engine

You can also use local embedding models through the llama.cpp backend. First, set the `EMBEDDING_ENGINE_MODEL` to `llamacpp`:
```json
{
    "EMBEDDING_ENGINE_API_KEY": "",
    "EMBEDDING_ENGINE_MODEL": "llamacpp",
    ...
}
```

For instance, to use the Nomic embed text model, first download it:
```bash
huggingface-cli download nomic-ai/nomic-embed-text-v1.5-GGUF nomic-embed-text-v1.5.Q8_0.gguf --local-dir .
```

Then start the server with embedding-specific parameters using either:
Python bindings:
```bash
symserver --env python --model nomic-embed-text-v1.5.Q8_0.gguf --embedding True --n_ctx 2048 --rope_scaling_type 2 --rope_freq_scale 0.75 --n_batch 32 --port 8000 --host localhost
```

C++ server:
```bash
symserver --env cpp --cpp-server-path /path/to/llama.cpp/llama-server -ngl -1 -m nomic-embed-text-v1.5.Q8_0.gguf --embedding -b 8092 -ub 1024 --port 8000 --host localhost -t 14 --mlock --no-mmap
```

The server supports batch processing for embeddings. Here's how to use it with symai:
```python
from symai import Symbol

# Single text embedding
some_text = "Hello, world!"
embedding = Symbol(some_text).embed()  # returns a list (1 x dim)

# Batch processing
some_batch_of_texts = ["Hello, world!"] * 32
embeddings = Symbol(some_batch_of_texts).embed()  # returns a list (32 x 1 x dim)
```
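Once the embeddings come back, downstream use is plain Python. For example, a minimal cosine-similarity sketch over the `(1 x dim)` shape described above (numpy is the only extra assumption):

```python
import numpy as np
from symai import Symbol

# Flatten the (1 x dim) results and compare them with cosine similarity.
a = np.asarray(Symbol("Hello, world!").embed()).ravel()
b = np.asarray(Symbol("Goodbye, world!").embed()).ravel()
score = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {score:.4f}")
```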