

Alpha (0.1.x) — MIT licensed. Public API may change before 1.0.

lexigram-ai-llm provides built-in clients for the following providers:

| Provider | Client class | Auth method | Extra | Streaming | Thinking | Tools | Structured output |
| --- | --- | --- | --- | --- | --- | --- | --- |
| OpenAI | OpenAIClient | api_key | openai | Yes | Via reasoning_effort | Yes | JSON mode / function calling |
| Anthropic | AnthropicClient | api_key | anthropic | Yes | Extended thinking | Yes | Tool use |
| Gemini | GeminiClient | api_key | google-genai | Yes | Native thinking | Yes | JSON mode |
| Ollama | OllamaClient | None | ollama | Yes | Via model | Yes | JSON mode |
| Groq | GroqClient | api_key | groq | Yes | Via model | Yes | JSON mode |
| Mistral | MistralClient | api_key | mistralai | Yes | | Yes | JSON mode |
| Cohere | CohereClient | api_key | cohere | Yes | | Yes | JSON mode |
| OpenRouter | OpenRouterClient | api_key | openai | Yes | Via model routing | Yes | JSON mode |
| Azure OpenAI | AzureOpenAIClient | api_key + endpoint | openai | Yes | reasoning_effort | Yes | JSON mode |
| AWS Bedrock | BedrockClient | AWS credentials | boto3 | Yes | Via model | Yes | Tool use |
| Vertex AI | VertexAIClient | GCP credentials | google-cloud-aiplatform | Yes | Via model | Yes | JSON mode |
| Cloudflare Workers AI | CloudflareWorkersAIClient | api_key + account ID | requests | Yes | | | |
| OpenAI-Compatible | OpenAICompatibleClient | api_key (optional) | openai | Yes | Via model | Yes | JSON mode |

API-key providers accept SecretStr via ClientConfig:

from lexigram.ai.llm.config import ClientConfig
from lexigram.validation import SecretStr

config = ClientConfig(
    provider="openai",
    model="gpt-4o",
    api_key=SecretStr("sk-..."),
)

For providers using SDK credential chains (AWS Bedrock, Vertex AI), set credentials via standard environment variables (AWS_PROFILE, GOOGLE_APPLICATION_CREDENTIALS).
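
Hardcoding a key as in the snippet above is fine for a demo, but in practice the key usually comes from the environment. The following is a minimal sketch of that idea using only the standard library. `MaskedSecret` and `load_api_key` are hypothetical names, not part of lexigram; the `get_secret_value()` accessor is borrowed from pydantic's `SecretStr`, and whether `lexigram.validation.SecretStr` exposes the same method is an assumption.

```python
import os


class MaskedSecret:
    """Hypothetical stand-in for a SecretStr-style wrapper (not lexigram's class)."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The raw value is only exposed through an explicit call.
        return self._value

    def __repr__(self) -> str:
        # Keep the raw value out of repr() and therefore out of most logs.
        return "MaskedSecret('**********')"


def load_api_key(env_var: str) -> MaskedSecret:
    """Read an API key from the environment and fail fast if it is missing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return MaskedSecret(value)
```

With a wrapper like this, `load_api_key("OPENAI_API_KEY")` would yield an object that can be passed around and logged without leaking the key.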

All providers return an AsyncStream[StreamChunk, LLMError] from stream_chat():

from lexigram.ai.llm.types import StreamChunk

async for chunk in client.stream_chat(messages):
    print(chunk.content, end="")
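
When the full response text is needed rather than incremental printing, the chunks can be accumulated. The sketch below shows the pattern against a fake stream so that it runs without any provider. `FakeChunk` and `fake_stream` are hypothetical stand-ins for `StreamChunk` and `client.stream_chat(messages)`, and only the `content` attribute shown above is assumed.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class FakeChunk:
    """Hypothetical stand-in for StreamChunk; only `content` is modelled."""
    content: str


async def fake_stream() -> AsyncIterator[FakeChunk]:
    # Stand-in for client.stream_chat(messages).
    for piece in ("Hel", "lo, ", "world"):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield FakeChunk(content=piece)


async def collect_text(stream: AsyncIterator[FakeChunk]) -> str:
    """Drain the stream and return the concatenated content."""
    parts: list[str] = []
    async for chunk in stream:
        if chunk.content:  # skip empty deltas
            parts.append(chunk.content)
    return "".join(parts)
```

Calling `asyncio.run(collect_text(fake_stream()))` drains the fake stream; with a real client, the same `collect_text` loop would be pointed at the stream returned by `stream_chat()`.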

Configure via ThinkingConfig in ClientConfig:

from lexigram.ai.llm.config import ClientConfig
from lexigram.contracts.ai.thinking import ThinkingConfig

config = ClientConfig(
    provider="anthropic",
    model="claude-3-7-sonnet-20250219",
    thinking=ThinkingConfig(budget_tokens=5000),
)

Supported providers: Anthropic (extended thinking), Gemini (native thinking), OpenAI (reasoning_effort).

Pass tool definitions to complete() and read the tool calls from the result:

from lexigram.ai.llm.types import ToolCall

result = await client.complete(messages, tools=[my_tool_definition])
if result.is_ok():
    for call in result.unwrap().tool_calls:
        print(f"Calling {call.function.name}")
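
Once tool calls are returned, the application still has to run them. The sketch below shows one common way to dispatch a call to a local Python function by name. `FakeToolCall`, `FunctionCall`, `TOOLS`, and `get_weather` are hypothetical; only the `call.function.name` access comes from the snippet above, and the JSON-encoded `arguments` field is an assumption modelled on the OpenAI-style tool-call shape.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FunctionCall:
    name: str
    arguments: str  # JSON-encoded arguments (assumed encoding)


@dataclass
class FakeToolCall:
    """Hypothetical stand-in for ToolCall, mirroring call.function.name."""
    function: FunctionCall


def get_weather(city: str) -> str:
    # Illustrative tool implementation.
    return f"Sunny in {city}"


# Map tool names, as advertised to the model, to local callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(call: FakeToolCall) -> Any:
    """Look up the requested tool and invoke it with the decoded arguments."""
    handler = TOOLS.get(call.function.name)
    if handler is None:
        raise KeyError(f"unknown tool: {call.function.name}")
    kwargs = json.loads(call.function.arguments or "{}")
    return handler(**kwargs)
```

Rejecting unknown names explicitly keeps a hallucinated tool name from silently doing nothing.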
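
Parse a response into a schema with StructuredOutputParser:

from lexigram.ai.llm.structured.extractor import JSONExtractor
from lexigram.ai.llm.structured.parser import StructuredOutputParser

# MyModel is the schema class to parse into (defined elsewhere).
parser = StructuredOutputParser(model="gpt-4o")
result = await parser.parse(messages, output_schema=MyModel)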
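
To make the idea of structured output concrete, here is a minimal standard-library sketch of what such parsing typically involves: locating a JSON object in free-form model text and validating it against a schema. It is not how `JSONExtractor` or `StructuredOutputParser` are implemented; `Person`, `extract_json`, and `parse_person` are hypothetical names, and `Person` merely plays the role of `MyModel`.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    """Illustrative output schema."""
    name: str
    age: int


def extract_json(text: str) -> dict:
    """Decode the first valid JSON object found in free-form model output."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next candidate.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


def parse_person(text: str) -> Person:
    """Extract JSON and validate it against the Person schema."""
    data = extract_json(text)
    person = Person(**data)  # TypeError on missing or unexpected keys
    if not isinstance(person.name, str) or not isinstance(person.age, int):
        raise ValueError("field types do not match the schema")
    return person
```

For example, `parse_person('Sure! {"name": "Ada", "age": 36} Hope that helps.')` returns a `Person` even though the model wrapped the JSON in prose.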

LLMRoutingProvider (from lexigram.ai.llm.di.routing_provider) enables multi-provider dispatch via ProviderRegistry:

from lexigram.ai.llm.registry.core import ProviderRegistry
from lexigram.ai.llm.selection.core import ModelSelector
# SelectionCriteria and ModelCapabilities must be imported as well;
# their module path is not shown here.

registry = ProviderRegistry()
selector = ModelSelector(registry=registry)

# Route by criteria
model = selector.select(
    query,
    criteria=SelectionCriteria(min_capabilities={ModelCapabilities.STREAMING}),
)

The ProviderRegistry maps provider names to client instances. Routes can be cost-optimized, quality-optimized, or balanced via the selector strategies.
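
The difference between the three strategies can be illustrated with a small self-contained sketch. It is not lexigram's `ModelSelector`: `Capability`, `ModelEntry`, `CATALOG`, and `select` are hypothetical, the prices and quality scores are made up, and "balanced" is assumed here to mean the best quality-to-cost ratio.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Capability(Enum):
    STREAMING = auto()
    TOOLS = auto()
    THINKING = auto()


@dataclass(frozen=True)
class ModelEntry:
    name: str
    cost_per_1k: float       # illustrative price, must be > 0
    quality: float           # illustrative score in the range 0..1
    capabilities: frozenset


CATALOG = [
    ModelEntry("small", 0.2, 0.4, frozenset({Capability.STREAMING, Capability.TOOLS})),
    ModelEntry("mid", 0.3, 0.8, frozenset({Capability.STREAMING})),
    ModelEntry("large", 2.0, 0.9, frozenset(Capability)),
]


def select(models, required, strategy="balanced"):
    """Filter by required capabilities, then rank by the chosen strategy."""
    candidates = [m for m in models if required <= m.capabilities]
    if not candidates:
        raise LookupError("no model satisfies the required capabilities")
    if strategy == "cost":
        return min(candidates, key=lambda m: m.cost_per_1k)
    if strategy == "quality":
        return max(candidates, key=lambda m: m.quality)
    if strategy == "balanced":
        return max(candidates, key=lambda m: m.quality / m.cost_per_1k)
    raise ValueError(f"unknown strategy: {strategy}")
```

Capability filtering runs before ranking, so a cheap model that lacks a required feature is never chosen over a more expensive one that has it.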

Register a client in the ProviderRegistry:

from lexigram.ai.llm.registry.core import ProviderRegistry
from lexigram.contracts.ai import LLMClientProtocol

class MyCustomClient(LLMClientProtocol):
    async def complete(self, messages, **kwargs):
        ...

    def stream_chat(self, messages):
        ...

registry = ProviderRegistry()
await registry.register_provider("my_custom", MyCustomClient(), models=["my-model-v1"])

For full DI integration, write a provider subclass that follows the same pattern, or configure LLMProvider with a custom factory.
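
The registration flow can be prototyped without lexigram to see what a name-to-client registry has to track. The sketch below uses a structural `typing.Protocol` in place of `LLMClientProtocol`. `ChatClient`, `EchoClient`, and `SimpleRegistry` are hypothetical, and the real `ProviderRegistry` may validate clients and store models differently.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class ChatClient(Protocol):
    """Hypothetical stand-in for LLMClientProtocol (one method modelled)."""

    async def complete(self, messages: list[dict], **kwargs) -> str: ...


class EchoClient:
    """Toy client that echoes the last message; matches ChatClient structurally."""

    async def complete(self, messages: list[dict], **kwargs) -> str:
        return messages[-1]["content"]


class SimpleRegistry:
    def __init__(self) -> None:
        self._clients: dict[str, ChatClient] = {}
        self._models: dict[str, str] = {}  # model name -> provider name

    def register(self, name: str, client: ChatClient, models: list[str]) -> None:
        # runtime_checkable only verifies that the method exists, not its signature.
        if not isinstance(client, ChatClient):
            raise TypeError("client does not implement complete()")
        if name in self._clients:
            raise ValueError(f"provider already registered: {name}")
        self._clients[name] = client
        for model in models:
            self._models[model] = name

    def client_for_model(self, model: str) -> ChatClient:
        return self._clients[self._models[model]]
```

After `registry.register("echo", EchoClient(), ["echo-v1"])`, a lookup by model name returns the client that serves it, which is the kind of mapping a selector needs in order to route requests.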

  • OpenAI-Compatible: Set api_base to the custom endpoint URL. Works with any OpenAI-compatible API (vLLM, LiteLLM, etc.).
  • Anthropic: Extended thinking requires model claude-3-7-sonnet-20250219 or later. budget_tokens sets the token budget for thinking.
  • Gemini: Requires the google-genai library. Authenticate with an API key or Application Default Credentials (ADC).
  • AWS Bedrock: Requires boto3 and configured AWS credentials. Cross-region inference is supported.
  • Ollama: No api_key is needed when running locally. Set api_base for remote instances.

Related components:

  • LLMProvider: DI registration for a single provider
  • LLMRoutingProvider: multi-provider routing setup
  • ProviderRegistry: register and query provider clients
  • ModelSelector: cost/quality/balanced model selection
  • ClientConfig: provider, model, auth, and feature config