
Guide

| Package | Required | Purpose |
| --- | --- | --- |
| lexigram | Yes | Core framework |
| lexigram-contracts | Yes | Protocol definitions |
| lexigram-ai-llm | Optional | LLM tracing |
| lexigram-ai-rag | Optional | RAG tracing |
| lexigram-ai-agents | Optional | Agent tracing |
| lexigram-vector | Optional | Vector store tracing |

LLM calls and vector searches are opaque — they cross process boundaries to external APIs and databases. When a response is slow, costs spike, or errors rise, you need visibility into what happened. Without instrumentation, debugging AI pipelines is guesswork.

lexigram-ai-observability solves this by wrapping your AI clients with distributed tracing, metrics collection, and health monitoring — automatically.

The package has three pillars:

| Pillar | Class | What It Tracks |
| --- | --- | --- |
| Tracing | AITracer | Per-call spans for LLM completions, vector operations, RAG stages, and embeddings |
| Metrics | AIMetrics | Counters, histograms, and gauges for tokens, costs, latency, request volume, and cache hit rates |
| Health | AIHealthMonitor | Registered health checks for LLM endpoints, vector stores, and embedding services |

These three work together via ObservabilityProvider, which auto-wires them around any LLMClientProtocol or VectorStoreProtocol registered in the container.

AITracer creates OpenTelemetry-compatible spans for every AI operation:

from lexigram.ai.observability import AITracer

# Manual usage shown for illustration; ObservableLLMClient creates these spans automatically.
# `tracer` is an AITracer instance, and the snippet runs inside an async function.
with tracer.trace_llm_call("openai", "gpt-4o") as span:
    response = await client.complete(messages)
    span.set_attribute("llm.tokens.total", response.usage.total_tokens)

Available span helpers:

| Method | Creates Span Named |
| --- | --- |
| trace_llm_call(provider, model) | llm.{provider}.{model} |
| trace_vector_operation(op, provider, collection) | vector.{op}.{provider} |
| trace_embedding_operation(model) | embedding.{model} |
| trace_rag_stage(stage, pipeline) | rag.{stage} |
| trace_rag_query(query) | rag.query |
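
The naming scheme in the table can be reproduced with plain string formatting, which is handy when writing dashboard queries or span filters. The sketch below is illustrative only: the function names are hypothetical and are not part of the package, and the observation that `collection` and `pipeline` are left out of the span name is read off the table, not confirmed from the implementation.

```python
def llm_span_name(provider: str, model: str) -> str:
    """Span name pattern for LLM calls, as listed in the table."""
    return f"llm.{provider}.{model}"


def vector_span_name(op: str, provider: str) -> str:
    """Span name pattern for vector operations; the collection is not part of the name."""
    return f"vector.{op}.{provider}"


def embedding_span_name(model: str) -> str:
    """Span name pattern for embedding operations."""
    return f"embedding.{model}"


def rag_stage_span_name(stage: str) -> str:
    """Span name pattern for RAG stages; the pipeline is not part of the name."""
    return f"rag.{stage}"


names = [
    llm_span_name("openai", "gpt-4o"),
    vector_span_name("search", "pgvector"),
    embedding_span_name("text-embedding-3-small"),
    rag_stage_span_name("retrieve"),
]
print(names)
```

The operation, stage, and embedding model values used here are placeholders; substitute whatever your pipeline actually passes to the span helpers.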

AIMetrics registers pre-defined instruments through MetricsCollectorProtocol. All metric names use the intelligence_ prefix:

from lexigram.ai.observability import AIMetrics

# `metrics` is an AIMetrics instance.
# Count a successful LLM call
metrics.llm_requests_total.increment(
    labels={"provider": "openai", "model": "gpt-4o", "status": "success"}
)
# Record latency (in seconds)
metrics.llm_duration_seconds.observe(
    value=0.8,
    labels={"provider": "openai", "model": "gpt-4o"},
)

The killer feature: ObservabilityProvider.boot() detects existing LLMClientProtocol and VectorStoreProtocol registrations in the container and replaces them with ObservableLLMClient / ObservableVectorStore proxies. The proxy delegates every call to the original while recording spans and metrics.

# After boot, this call is automatically traced and metered:
result = await llm_client.complete(messages)
# No code changes needed — the wrapping was transparent.
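
To make the proxy idea concrete, here is a minimal, self-contained sketch of a delegating wrapper. It is not the implementation of ObservableLLMClient: `FakeLLMClient` and `RecordingProxy` are hypothetical names, and an in-memory list stands in for spans and metrics. It only shows the general pattern of forwarding each call to the original object while recording its duration and outcome.

```python
import asyncio
import inspect
import time
from typing import Any


class FakeLLMClient:
    """Stand-in for a real client; hypothetical, for illustration only."""

    async def complete(self, messages: list[str]) -> str:
        await asyncio.sleep(0)  # pretend to call a remote API
        return f"echo: {messages[-1]}"


class RecordingProxy:
    """Forwards attribute access to the wrapped object and times async calls."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner
        self.calls: list[tuple[str, float, str]] = []  # (method, seconds, status)

    def __getattr__(self, name: str) -> Any:
        # Only reached for names not defined on the proxy itself.
        target = getattr(self._inner, name)
        if not inspect.iscoroutinefunction(target):
            return target  # plain attributes and sync methods pass through

        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            status = "success"
            try:
                return await target(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                self.calls.append((name, time.perf_counter() - start, status))

        return wrapper


client = RecordingProxy(FakeLLMClient())
result = asyncio.run(client.complete(["hello"]))
print(result, client.calls[0][0], client.calls[0][2])
```

Because the proxy exposes the same methods as the object it wraps, calling code does not need to know whether it holds the original client or the wrapped one.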

AIHealthMonitor manages health checks for AI infrastructure:

from lexigram.ai.observability import AIHealthMonitor
monitor = AIHealthMonitor()
monitor.add_llm_check("openai", check_openai_connectivity)
monitor.add_vector_check("pgvector", check_pgvector_connectivity)
all_healthy = await monitor.is_ready()
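
The check functions passed to the monitor are not shown above. The sketch below assumes a check is an async callable that takes no arguments and returns a bool; that signature is an assumption, not something the package documentation here confirms. `run_checks` is a hypothetical helper that illustrates one reasonable way to aggregate checks: run them concurrently and treat a timeout or exception as unhealthy.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed shape of a health check; not confirmed by the package docs.
HealthCheck = Callable[[], Awaitable[bool]]


async def check_openai_connectivity() -> bool:
    """Hypothetical check: a real one would call the provider's API."""
    await asyncio.sleep(0)
    return True


async def check_pgvector_connectivity() -> bool:
    """Hypothetical check that simulates an unreachable database."""
    raise ConnectionError("pgvector unreachable")


async def run_checks(checks: dict[str, HealthCheck], timeout: float = 2.0) -> dict[str, bool]:
    """Run all checks concurrently; a timeout or exception counts as unhealthy."""

    async def guarded(check: HealthCheck) -> bool:
        try:
            return bool(await asyncio.wait_for(check(), timeout))
        except Exception:
            return False

    results = await asyncio.gather(*(guarded(c) for c in checks.values()))
    return dict(zip(checks.keys(), results))


statuses = asyncio.run(run_checks({
    "openai": check_openai_connectivity,
    "pgvector": check_pgvector_connectivity,
}))
all_healthy = all(statuses.values())
print(statuses, all_healthy)
```

Bounding each check with a timeout keeps a hung dependency from stalling the readiness probe itself.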

from lexigram import Application, LexigramConfig
from lexigram.ai.observability import ObservabilityModule
from lexigram.ai.llm import LLMModule
from lexigram.ai.observability.config import ObservabilityConfig

config = LexigramConfig.from_yaml({
    "ai_observability": {
        "enabled": True,
        "tracing_enabled": True,
        "metrics_enabled": True,
        "health_checks_enabled": True,
    }
})

app = Application(name="observable-app", config=config)
app.add_module(LLMModule.configure())
app.add_module(ObservabilityModule.configure())
await app.start()  # call from within an async function
# Your AI calls are now instrumented

For custom functions that aren’t auto-wrapped, use the decorator API:

from lexigram.ai.observability import trace_llm, track_llm_call

# `tracer` and `metrics` are existing AITracer and AIMetrics instances.
@trace_llm(provider="openai", model="gpt-4o", tracer=tracer)
@track_llm_call(provider="openai", model="gpt-4o", metrics=metrics)
async def my_completion(messages):
    return await client.complete(messages)
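
When two decorators are stacked like this, Python applies the one closest to the function first, so the top decorator ends up as the outermost wrapper. The sketch below uses a hypothetical `tag` decorator (not part of the package) to show the resulting call order; it makes no claim about what trace_llm or track_llm_call do internally.

```python
import asyncio
import functools

events: list[str] = []  # records the order in which wrappers run


def tag(label: str):
    """Hypothetical decorator factory that logs entry and exit around an async call."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            events.append(f"{label}:start")
            try:
                return await func(*args, **kwargs)
            finally:
                events.append(f"{label}:end")

        return wrapper

    return decorator


@tag("trace")    # outermost: sits where trace_llm sits in the example above
@tag("metrics")  # innermost: sits where track_llm_call sits
async def my_completion(messages):
    events.append("call")
    return f"echo: {messages[-1]}"


answer = asyncio.run(my_completion(["hi"]))
print(events)
# ['trace:start', 'metrics:start', 'call', 'metrics:end', 'trace:end']
```

With this ordering, the outer wrapper's timing includes whatever work the inner wrapper does, which is worth keeping in mind when comparing span durations with metric latencies.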
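
Individual components can be switched off through ObservabilityConfig:

from lexigram.ai.observability.config import ObservabilityConfig

config = ObservabilityConfig(
    enabled=True,
    metrics_enabled=False,  # disable metrics, keep tracing
    tracing_enabled=True,
)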

Labels flow through AIMetrics to MetricsCollectorProtocol, so add them wherever you need a more granular breakdown:

metrics.llm_requests_total.increment(labels={
    "provider": "openai",
    "model": "gpt-4o",
    "status": "success",
    "deployment": "prod-eu-west-1",
})
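
To see why extra labels give finer breakdowns, consider a toy labelled counter in which every distinct label combination becomes its own series. `LabeledCounter` is a hypothetical in-memory class written for this illustration; it is not AIMetrics or MetricsCollectorProtocol, and the second deployment name is a placeholder.

```python
from collections import defaultdict


class LabeledCounter:
    """Hypothetical in-memory counter with one series per label combination."""

    def __init__(self) -> None:
        self._values: dict[tuple, float] = defaultdict(float)

    def increment(self, labels: dict[str, str], amount: float = 1.0) -> None:
        # Sort the items so that label order does not create a new series.
        self._values[tuple(sorted(labels.items()))] += amount

    def value(self, **match: str) -> float:
        """Sum every series whose labels include all of the given pairs."""
        return sum(
            v for key, v in self._values.items()
            if match.items() <= dict(key).items()
        )

    def series_count(self) -> int:
        return len(self._values)


counter = LabeledCounter()
base = {"provider": "openai", "model": "gpt-4o"}
counter.increment({**base, "status": "success", "deployment": "prod-eu-west-1"})
counter.increment({**base, "status": "success", "deployment": "prod-us-east-1"})
counter.increment({**base, "status": "error", "deployment": "prod-eu-west-1"})

print(counter.value(deployment="prod-eu-west-1"))  # 2.0
print(counter.value(status="success"))             # 2.0
print(counter.value())                             # 3.0
print(counter.series_count())                      # 3
```

Each new label value multiplies the number of series the backend has to store, so labels with unbounded values (user IDs, raw prompts) are usually best left out.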

The package emits lifecycle hooks you can subscribe to:

from lexigram.ai.observability.hooks import LLMCallTracedHook
# Hook payload fired after each traced LLM call

  • Enable tracing and metrics in production; the overhead is minimal because MetricsCollectorProtocol is designed for hot paths.
  • Register health checks for every external AI dependency (LLM provider, vector store).
  • Set tracing_enabled=false during local development if you don’t need spans.
  • Attach meaningful span attributes (tokens.total, cost, error.type) via the wrapper or decorator callbacks.
  • Set up OpenTelemetry export with the [opentelemetry] extras to send traces to Jaeger, Zipkin, or a cloud tracing backend.