Configuration Guide

Fabricatio provides a flexible, multi-source configuration system with clear priority order. This guide covers all configuration options and their interactions.

Quick Start Tutorial

This section walks you through setting up Fabricatio for the first time.

Step 1: Install Fabricatio

# Install with full capabilities
pip install fabricatio[full]

# Or with uv
uv add fabricatio[full]

Step 2: Create a Configuration File

Create a fabricatio.toml in your project root:

[debug]
log_level = "INFO"

[llm]
send_to = "base"
max_completion_tokens = 16000
stream = false
temperature = 1.0
top_p = 1.0
timeout = 120

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-your-api-key", name = "openai", base_url = "https://api.openai.com/v1/" }
]

completion_deployments = [
    { id = "openai/gpt-4o-mini", group = 'base', tpm = 100_000, rpm = 1000 }
]

cache_database_path = ".fabricatio.cache.db"

Step 3: Verify Your Setup

Create a simple test script:

from fabricatio import Role, Action, WorkFlow, Event, Task

class TestAction(Action):
    output_key: str = "test_output"

    async def _execute(self, **_) -> str:
        return "Fabricatio is working!"

role = Role.with_bio().subscribe(
    Event.quick_instantiate("test"),
    WorkFlow(name="test", steps=(TestAction,))
).dispatch()

result = Task(name="verify").delegate_blocking("test")
print(result)  # Should print: "Fabricatio is working!"

Step 4: Configure Your First Agent

Here’s a complete example using configuration:

from fabricatio import Role, Action, WorkFlow, Event, Task
from fabricatio.capabilities import UseLLM

class LLMGreetAction(Action, UseLLM):
    output_key: str = "greeting"

    async def _execute(self, name: str, **_) -> str:
        response = await self.aask(f"Say hello to {name} in one sentence")
        return response

# Create role with custom LLM configuration
role = Role.new(
    {},
    config={
        "llm": {"temperature": 0.7},
        "debug": {"log_level": "DEBUG"}
    }
).subscribe(
    Event.quick_instantiate("greet"),
    WorkFlow(name="greet", steps=(LLMGreetAction,))
).dispatch()

# Run the task
result = Task(name="hello").delegate_blocking("greet", name="World")
print(result)

Configuration Sources & Priority

Fabricatio loads configuration from multiple sources in the following priority order (highest to lowest):

        flowchart TD
    A["1. Call Arguments<br/>(programmatic API)"]
    B["2. .env file in current directory"]
    C["3. Environment Variables"]
    D["4. ./fabricatio.toml"]
    E["5. ./pyproject.toml<br/>[tool.fabricatio]"]
    F["6. &lt;ROAMING&gt;/fabricatio/fabricatio.toml"]
    G["7. Built-in Defaults"]
    A --> B --> C --> D --> E --> F --> G
    

This means you can override any configuration at runtime using environment variables or programmatically.

Configuration File Formats

fabricatio.toml

The primary configuration file format:

[debug]
log_level = "DEBUG"          # DEBUG, INFO, WARNING, ERROR

[llm]
send_to = "base"             # Default routing group
max_completion_tokens = 32000
stream = false
temperature = 1.0
top_p = 0.35
timeout = 120                # seconds

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-...", name = "mm", base_url = "https://api.example.com/v1/" }
]

completion_deployments = [
    { id = "mm/gpt-4o-mini", group = 'base', tpm = 100_000, rpm = 1000 }
]

cache_database_path = "path/to/.cache.db"

[embedding]
send_to = "embeddings"
ndim = 1536
no_cache = false

[reranker]
send_to = "reranker"
no_cache = false

pyproject.toml

Configuration via [tool.fabricatio] table:

[tool.fabricatio.debug]
log_level = "DEBUG"

[tool.fabricatio.llm]
send_to = "base"
max_completion_tokens = 32000
stream = false
temperature = 1.0
top_p = 0.35

[tool.fabricatio.routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-...", name = "mm", base_url = "https://api.example.com/v1/" }
]

completion_deployments = [
    { id = "mm/gpt-4o-mini", group = 'base', tpm = 100_000, rpm = 1000 }
]

[tool.fabricatio.embedding]
send_to = "embeddings"
ndim = 1536
no_cache = false

[tool.fabricatio.reranker]
send_to = "reranker"
no_cache = false

Environment Variables / .env

Prefix all config keys with FABRICATIO_ and use double underscores for nesting:

FABRICATIO_DEBUG__LOG_LEVEL=DEBUG
FABRICATIO_LLM__SEND_TO=base
FABRICATIO_LLM__MAX_COMPLETION_TOKENS=32000
FABRICATIO_LLM__STREAM=false
FABRICATIO_LLM__TEMPERATURE=1.0
FABRICATIO_LLM__TOP_P=0.35
FABRICATIO_EMBEDDING__SEND_TO=embeddings
FABRICATIO_EMBEDDING__NDIM=1536
FABRICATIO_EMBEDDING__NO_CACHE=false
FABRICATIO_RERANKER__SEND_TO=reranker
FABRICATIO_RERANKER__NO_CACHE=false

Configuration Sections

[debug]

Note

The sphinxcontrib-mermaid package that renders these diagrams is seeking new maintainers. Consider contributing if you’re interested.

        %%{init: {'themeVariables': {'fontFamily': 'monospace'}}}%%
erDiagram
    "[debug]" {
        string log_level "Logging level" "INFO"
    }
    

[llm]

        %%{init: {'themeVariables': {'fontFamily': 'monospace'}}}%%
erDiagram
    "[llm]" {
        string send_to "Default routing group" "base"
        string max_completion_tokens "Max tokens in response" "16000"
        string stream "Enable streaming responses" "false"
        string temperature "Sampling temperature" "1.0"
        string top_p "Nucleus sampling threshold" "1.0"
        string timeout "Request timeout (seconds)" "120"
    }
    

[embedding]

        %%{init: {'themeVariables': {'fontFamily': 'monospace'}}}%%
erDiagram
    "[embedding]" {
        string send_to "Default routing group" ""
        string ndim "Output vector dimensionality" ""
        string no_cache "Disable caching" "false"
    }
    

[reranker]

        %%{init: {'themeVariables': {'fontFamily': 'monospace'}}}%%
erDiagram
    "[reranker]" {
        string send_to "Default routing group" ""
        string no_cache "Disable caching" "false"
    }
    

[routing]

Provider Configuration

Providers define LLM endpoints:

{
    "ptype": "OpenAICompatible",  # Provider type
    "key": "sk-...",               # API key
    "name": "mm",                  # Short name for routing
    "base_url": "https://api.example.com/v1/"  # Endpoint base
}

Supported provider types:

  • OpenAICompatible - OpenAI API compatible endpoints

  • Anthropic - Anthropic API

  • AzureOpenAI - Microsoft Azure OpenAI

  • GoogleAI - Google AI (Gemini)

  • Local - Local model endpoints

Deployment Configuration

Deployments define available models:

{
    "id": "mm/gpt-4o-mini",        # Full model ID (provider/name)
    "group": "base",               # Routing group
    "tpm": 100_000,                # Tokens per minute limit
    "rpm": 1000                    # Requests per minute limit
}

There are three deployment lists in the routing configuration, each for a different model type:

  • completion_deployments - For LLM text generation (chat/completion) models

  • embedding_deployments - For text embedding models

  • reranker_deployments - For reranking/relevance-scoring models

All three use the same deployment schema shown above.

Cache Configuration

[routing]
cache_database_path = "path/to/.cache.db"  # SQLite cache location

Router Features

The thryd crate provides:

  • TPM/RPM limiting: Per-deployment rate limiting

  • Response caching: SQLite-based request caching

  • Concurrent routing: Thread-safe provider selection

Programmatic Configuration

Role-level Configuration

Pass configuration directly to Roles:

from fabricatio import Role

role = Role.new(
    {},
    name="my_agent",
    config={
        "llm": {"temperature": 0.7},
        "debug": {"log_level": "DEBUG"}
    }
)

Action-level Configuration

Override configuration per action call:

from fabricatio.capabilities import UseLLM

class MyAction(Action, UseLLM):
    async def _execute(self, task_input, **kwargs):
        # Use custom temperature for this call
        response = await self.aask(
            "Explain quantum computing",
            temperature=0.3  # Override default
        )
        return response

aask() vs aask_structured()

aask() - Simple text responses:

response: str = await self.aask("What is Python?")

aask_structured() - Typed responses with Pydantic:

from pydantic import BaseModel

class CodeReview(BaseModel):
    issues: list[str]
    score: int
    suggestions: list[str]

result: CodeReview = await self.aask_structured(
    "Review this code",
    response_format=CodeReview
)

Real-World Configuration Examples

Example 1: Single Provider with OpenAI

[debug]
log_level = "INFO"

[llm]
send_to = "openai"
max_completion_tokens = 16000
temperature = 0.7

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-proj-xxxx", name = "openai", base_url = "https://api.openai.com/v1/" }
]

completion_deployments = [
    { id = "openai/gpt-4o", group = 'openai', tpm = 100_000, rpm = 500 },
    { id = "openai/gpt-4o-mini", group = 'openai', tpm = 200_000, rpm = 2000 }
]

Example 2: Multi-Provider with Fallback

[llm]
send_to = "primary"

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-primary-xxx", name = "primary", base_url = "https://api.openai.com/v1/" },
    { ptype = "OpenAICompatible", key = "sk-fallback-xxx", name = "fallback", base_url = "https://api.deepseek.com/v1/" }
]

completion_deployments = [
    { id = "primary/gpt-4o", group = 'primary', tpm = 100_000, rpm = 500 },
    { id = "primary/gpt-4o-mini", group = 'primary', tpm = 200_000, rpm = 2000 },
    { id = "fallback/deepseek-chat", group = 'fallback', tpm = 100_000, rpm = 1000 }
]

Usage:

# Uses primary group (default)
response = await self.aask("Complex task")

# Explicitly use fallback
response = await self.aask("Cost-sensitive task", send_to="fallback")

Example 3: Anthropic with Claude

[llm]
send_to = "claude"
max_completion_tokens = 32000
temperature = 1.0

[routing]
providers = [
    { ptype = "Anthropic", key = "sk-ant-api03-xxx", name = "claude" }
]

completion_deployments = [
    { id = "claude/claude-3-5-sonnet-latest", group = 'claude', tpm = 100_000, rpm = 1000 }
]

Example 4: Azure OpenAI

[routing]
providers = [
    {
        ptype = "AzureOpenAI",
        key = "your-azure-key",
        name = "azure",
        base_url = "https://your-resource.openai.azure.com/",
        api_version = "2024-02-01"
    }
]

completion_deployments = [
    { id = "azure/gpt-4o", group = 'azure', tpm = 100_000, rpm = 500 }
]

Example 5: Local Model Setup

[llm]
send_to = "local"
timeout = 300  # Longer timeout for local models

[routing]
providers = [
    { ptype = "Local", key = "not-needed", name = "ollama", base_url = "http://localhost:11434/v1/" }
]

completion_deployments = [
    { id = "ollama/llama3", group = 'local', tpm = 999_999_999, rpm = 999_999_999 }
]

Advanced: Multiple Provider Setup

Load Balancing Across Providers

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-primary", name = "openai", base_url = "https://api.openai.com/v1/" },
    { ptype = "OpenAICompatible", key = "sk-secondary", name = "azure", base_url = "https://example.azure.com/v1/" }
]

completion_deployments = [
    { id = "openai/gpt-4o", group = 'premium', tpm = 100_000, rpm = 500 },
    { id = "azure/gpt-4o", group = 'premium', tpm = 200_000, rpm = 1000 }
]

Using Different Groups:

# Send to premium group
response = await self.aask("Complex task", send_to="premium")

# Send to base group (default)
response = await self.aask("Simple task", send_to="base")

Environment-Specific Configs

Development (.env.local)

FABRICATIO_DEBUG__LOG_LEVEL=DEBUG
FABRICATIO_LLM__SEND_TO=local
FABRICATIO_ROUTING__CACHE_DATABASE_PATH=.cache.local.db

Staging (fabricatio.staging.toml)

Create fabricatio.staging.toml:

[debug]
log_level = "INFO"

[llm]
send_to = "staging"
timeout = 90

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-staging-xxx", name = "staging", base_url = "https://api.staging.example.com/v1/" }
]

completion_deployments = [
    { id = "staging/gpt-4o-mini", group = 'staging', tpm = 50_000, rpm = 500 }
]

cache_database_path = "/var/cache/fabricatio/.cache.staging.db"

Production (fabricatio.toml)

[debug]
log_level = "WARNING"

[llm]
send_to = "production"
timeout = 60

[routing]
cache_database_path = "/var/cache/fabricatio/.cache.db"

Loading Environment-Specific Config

Fabricatio automatically loads fabricatio.toml from the current directory. To use environment-specific configurations:

# Development
cp fabricatio.dev.toml fabricatio.toml

# Production
cp fabricatio.prod.toml fabricatio.toml

Or use environment variables:

export FABRICATIO_ROUTING__CACHE_DATABASE_PATH="/var/cache/fabricatio/.cache.db"
export FABRICATIO_DEBUG__LOG_LEVEL="WARNING"

Template Discovery Configuration

Fabricatio searches for templates in multiple locations:

        flowchart TD
    A["1. ./templates/\n(project working directory)"]
    B["2. &lt;ROAMING&gt;/fabricatio/templates/"]
    C["3. Built-in templates"]
    A --> B --> C
    

Download templates:

# Manual download
curl -L https://github.com/Whth/fabricatio/releases/download/v0.19.1/templates.tar.gz | tar -xz

# Using bundled CLI
tdown download --verbose -o ./

Caching Configuration

Fabricatio uses SQLite for request caching. Configure cache behavior:

[routing]
cache_database_path = "path/to/.cache.db"  # Custom cache location

Cache Environment Variables:

FABRICATIO_ROUTING__CACHE_DATABASE_PATH=".fabricatio.cache.db"

Cache is automatically enabled when cache_database_path is configured. To disable caching, simply omit this configuration or set it to an empty path.

Troubleshooting

Config not being loaded?

  • Check file location matches expected paths

  • Verify [tool.fabricatio] section in pyproject.toml (not [tool.fabricatio.routing])

  • Enable debug logging: FABRICATIO_DEBUG__LOG_LEVEL=DEBUG

  • Confirm the file is valid TOML syntax

Environment variables not working?

  • Use double underscores: FABRICATIO_LLM__TEMPERATURE (not single)

  • Verify .env file is in the current working directory

  • Check for trailing whitespace in .env file

  • Ensure no quotes around values (unless required)

Provider authentication failures?

  • Verify API key is correct and has no leading/trailing spaces

  • Check base_url includes trailing slash (/v1/)

  • Ensure rate limits (TPM/RPM) are set appropriately

  • For Azure, verify api_version is correct

Rate limit errors (429)?

  • Check TPM/RPM limits in your deployment configuration

  • Reduce request frequency or increase limits

  • Enable caching to reduce API calls

  • Consider adding fallback providers

Request timeout errors?

  • Increase timeout value in [llm] section

  • Check network connectivity to API endpoint

  • For local models, increase timeout to 300+ seconds

Structured output parsing errors?

  • Ensure response_format is a Pydantic BaseModel

  • Check that your model supports structured output

  • Verify temperature is set appropriately (lower values help)

Cache database errors?

  • Ensure the cache directory exists and is writable

  • Check disk space availability

  • Delete the cache file to reset if corrupted

Debug logging not showing?

  • Set log_level = "TRACE" for most verbose output

  • Check that debug config is in the correct section

  • Verify no other configuration is overriding it

Model not found errors?

  • Verify deployment ID matches provider naming (provider/model-name)

  • Check that the model is available in your account/region

  • For local models, ensure the server is running

Common Configuration Patterns

Pattern 1: Development with Local Caching

[debug]
log_level = "DEBUG"

[llm]
send_to = "local"
stream = false

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-dev", name = "dev", base_url = "https://api.openai.com/v1/" }
]

completion_deployments = [
    { id = "dev/gpt-4o-mini", group = 'local', tpm = 999_999_999, rpm = 999_999_999 }
]

cache_database_path = ".dev.cache.db"

Pattern 2: Production with Multiple Tiers

[debug]
log_level = "WARNING"

[llm]
send_to = "premium"
timeout = 60

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-prod", name = "openai", base_url = "https://api.openai.com/v1/" },
    { ptype = "OpenAICompatible", key = "sk-backup", name = "backup", base_url = "https://api.backup.com/v1/" }
]

completion_deployments = [
    { id = "openai/gpt-4o", group = 'premium', tpm = 100_000, rpm = 500 },
    { id = "openai/gpt-4o-mini", group = 'standard', tpm = 200_000, rpm = 2000 },
    { id = "backup/gpt-4o-mini", group = 'backup', tpm = 50_000, rpm = 500 }
]

cache_database_path = "/var/cache/fabricatio/.cache.db"

Pattern 3: Cost-Optimized Setup

[llm]
send_to = "budget"
max_completion_tokens = 8000

[routing]
providers = [
    { ptype = "OpenAICompatible", key = "sk-budget", name = "deepseek", base_url = "https://api.deepseek.com/v1/" }
]

completion_deployments = [
    { id = "deepseek/deepseek-chat", group = 'budget', tpm = 100_000, rpm = 1000 }
]

cache_database_path = ".budget.cache.db"

Migration Guide

Migrating from v0.x to v1.x

The configuration format has been updated:

Old format:

[fabricatio]
log_level = "INFO"

New format:

[debug]
log_level = "INFO"

Key changes:

  • [fabricatio] section renamed to specific sections ([debug], [llm], [routing])

  • Provider configuration uses inline TOML tables

  • Environment variables now use double underscores for nesting

For more information, see the full migration guide.