# AI LLM (lexigram-ai-llm)
LLM client layer for the Lexigram Framework — OpenAI, Anthropic, Ollama, Cohere, Groq, Mistral
## Overview

LLM client layer for the Lexigram Framework. Provides typed, async-first clients for 18 providers, multi-provider routing, thinking/reasoning control, structured extraction, streaming, embeddings, and model management, all wired through the DI container via LLMModule. Zero-config usage starts with sensible defaults.
## Install

```bash
uv add lexigram-ai-llm

# Optional extras
uv add "lexigram-ai-llm[openai,anthropic,ollama]"
```

## Quick Start
```python
from lexigram import Application
from lexigram.di.module import Module, module
from lexigram.ai.llm import LLMModule
from lexigram.ai.llm.config import ClientConfig


@module(imports=[
    LLMModule.configure(
        ClientConfig(provider="anthropic", model="claude-sonnet-4-6")
    )
])
class AppModule(Module):
    pass


app = Application(modules=[AppModule])

if __name__ == "__main__":
    app.run()
```
## Configuration

Zero-config usage: call `LLMModule.configure()` with no arguments to use defaults.
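A minimal sketch of the zero-config path; the defaults noted in the comment come from the config reference below:

```python
from lexigram.ai.llm import LLMModule

# No arguments: falls back to the documented defaults
# (provider="openai", model="gpt-4-turbo", temperature=0.7, timeout=60.0).
llm_module = LLMModule.configure()
```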
### Option 1 — YAML file

```yaml
ai_llm:
  provider: "anthropic"
  model: "claude-sonnet-4-6"
  api_key: "${LEX_AI_LLM__API_KEY}"
  temperature: 0.7
  max_tokens: null
```

### Option 2 — Profiles + Environment Variables (recommended)
```bash
# Environment variables for each field
export LEX_AI_LLM__PROVIDER=anthropic
```
### Option 3 — Python

```python
from lexigram.ai.llm.config import ClientConfig
from lexigram.ai.llm import LLMModule

config = ClientConfig(
    provider="anthropic",
    model="claude-sonnet-4-6",
)
LLMModule.configure(config)
```

### Config reference
Section titled “Config reference”| Field | Default | Env var | Description |
|---|---|---|---|
enabled | True | LEX_AI_LLM__ENABLED | Enable the LLM subsystem |
provider | openai | LEX_AI_LLM__PROVIDER | LLM provider |
model | gpt-4-turbo | LEX_AI_LLM__MODEL | Model name |
api_key | None | LEX_AI_LLM__API_KEY | Provider API key |
api_base | None | LEX_AI_LLM__API_BASE | Custom endpoint (Azure, local, proxy) |
temperature | 0.7 | LEX_AI_LLM__TEMPERATURE | Sampling temperature (0.0–2.0) |
max_tokens | None | LEX_AI_LLM__MAX_TOKENS | Response token limit |
timeout | 60.0 | LEX_AI_LLM__TIMEOUT | Request timeout in seconds |
enable_cache | False | LEX_AI_LLM__ENABLE_CACHE | Cache responses |
cache_ttl | 3600 | LEX_AI_LLM__CACHE_TTL | Cache TTL in seconds |
thinking | None | — | Reasoning/thinking control configuration |
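The same fields can also be set programmatically. A sketch under the assumption that every field in the table above is accepted as a `ClientConfig` keyword argument (values are illustrative):

```python
from lexigram.ai.llm.config import ClientConfig

# Assumes each config-reference field maps one-to-one to a ClientConfig kwarg.
config = ClientConfig(
    provider="anthropic",
    model="claude-sonnet-4-6",
    api_key="...",          # usually supplied via LEX_AI_LLM__API_KEY instead
    temperature=0.2,
    max_tokens=1024,
    timeout=30.0,
    enable_cache=True,      # cache responses
    cache_ttl=600,          # seconds
)
```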
## Module Factory Methods

| Method | Description |
|---|---|
| `LLMModule.configure(config)` | Single-provider client |
| `LLMModule.configure(routing=LLMConfig())` | Multi-provider routing cascade |
| `LLMModule.stub()` | No-op client for tests |
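A sketch of the routing cascade; the `providers` field and the `ProviderConfig` keyword arguments are assumptions here, so check `src/lexigram/ai/llm/routing/config.py` for the real signatures and the available strategies:

```python
from lexigram.ai.llm import LLMModule
from lexigram.ai.llm.routing.config import LLMConfig, ProviderConfig

# Assumed field names; the actual LLMConfig/ProviderConfig definitions live in
# src/lexigram/ai/llm/routing/config.py.
routing = LLMConfig(
    providers=[
        ProviderConfig(provider="anthropic", model="claude-sonnet-4-6"),
        ProviderConfig(provider="openai", model="gpt-4-turbo"),
    ],
)

LLMModule.configure(routing=routing)
```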
## Key Features

- 18 providers: OpenAI, Anthropic, Google Gemini, Azure, Ollama, Groq, Mistral, Cohere, and more
- Multi-provider routing: Sequential, cost-optimized, and latency-optimized strategies
- Thinking/reasoning control: Extended thinking with token budget and suppression (see the hedged sketch after this list)
- Structured extraction: JSON schema and Pydantic model extraction
- Streaming: Async streaming response support
- Embeddings: Text embedding client backed by the same provider
- Caching: Response-level caching with configurable TTL
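A heavily hedged sketch of thinking control; the import path and the `ThinkingConfig` fields below are assumptions based on the feature description and the `src/lexigram/ai/llm/thinking/` package:

```python
from lexigram.ai.llm.config import ClientConfig
from lexigram.ai.llm.thinking import ThinkingConfig  # assumed import path

# Assumed fields: the docs mention a token budget and suppression, so the
# names below are illustrative only.
config = ClientConfig(
    provider="anthropic",
    model="claude-sonnet-4-6",
    thinking=ThinkingConfig(enabled=True, budget_tokens=4096),
)
```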
## Testing

```python
from lexigram import Application
from lexigram.ai.llm import LLMModule

async with Application.boot(modules=[LLMModule.stub()]) as app:
    # your test code
    ...
```

## Key Source Files
| File | What it contains |
|---|---|
| `src/lexigram/ai/llm/module.py` | `LLMModule.configure()` and `LLMModule.stub()` |
| `src/lexigram/ai/llm/config.py` | `ClientConfig` |
| `src/lexigram/ai/llm/routing/config.py` | `LLMConfig`, `ProviderConfig` for routing |
| `src/lexigram/ai/llm/di/provider.py` | `LLMProvider` — registers and boots the client |
| `src/lexigram/ai/llm/clients/` | Provider implementations |
| `src/lexigram/ai/llm/thinking/` | `ThinkingConfig` handling and suppression |
| `src/lexigram/ai/llm/exceptions.py` | Full exception hierarchy |