Architecture

Internal design of the lexigram-ai-observability package.


lexigram-ai-observability provides distributed tracing, metrics collection, and health monitoring for AI operations (LLM calls, vector search, embeddings, RAG, document ingestion). It depends only on lexigram and lexigram-contracts — it discovers LLMClientProtocol and VectorStoreProtocol via the container and wraps them transparently.

flowchart BT
    subgraph App[Application]
        LLM[LLMClientProtocol]
        VEC[VectorStoreProtocol]
    end
    subgraph OBS[lexigram-ai-observability]
        subgraph Wrap[Wrappers]
            OLLM[ObservableLLMClient]
            OVEC[ObservableVectorStore]
        end
        TR[AITracer]
        MT[AIMetrics]
        HM[AIHealthMonitor]
        CB[CallbackManagerImpl]
        DEC[trace_llm · track_llm_call<br/>trace_vector · track_vector_operation]
    end
    subgraph MON[lexigram-monitor]
        T[TracerProtocol]
        MC[MetricsCollectorProtocol]
        HCR[HealthCheckRegistryProtocol]
    end

    LLM --> OLLM
    VEC --> OVEC
    OLLM --> TR
    OLLM --> MT
    OVEC --> TR
    OVEC --> MT
    DEC --> TR
    DEC --> MT
    TR --> T
    MT --> MC
    HM --> HCR
    CB --> TR

The arrow direction points toward the dependency. Wrappers depend on AITracer and AIMetrics. AITracer depends on TracerProtocol. Application code observes LLM/Vector operations through decorators or automatic proxy wrapping.


src/lexigram/ai/observability/
├── __init__.py                  # Lazy-loaded public API
├── config.py                    # ObservabilityConfig dataclass
├── constants.py                 # ENV_PREFIX, metric/span name constants
├── decorators.py                # Re-exports of trace_llm, track_llm_call, etc.
├── exceptions.py                # ObservabilityError, TracingError, MetricsError
├── hooks.py                     # AIObservabilityStartedHook, LLMCallTracedHook
├── protocols.py                 # Re-exports of AITracerProtocol, etc. from contracts
├── types.py                     # HealthCheckFunc, MetricLabels
├── module.py                    # ObservabilityModule
├── di/
│   └── provider.py              # ObservabilityProvider (register, boot, shutdown)
├── tracing/
│   ├── core.py                  # AITracer — span context manager API
│   └── decorators.py            # @trace_llm, @trace_vector, @trace_rag
├── metrics/
│   ├── core.py                  # AIMetrics — all 30+ instrument definitions
│   └── decorators.py            # @track_llm_call, @track_vector_operation, @track_embedding
├── health/
│   └── monitor.py               # AIHealthMonitor — per-component check registry
├── wrappers/
│   ├── observable_llm.py        # ObservableLLMClient — LLM proxy
│   └── observable_vector.py     # ObservableVectorStore — vector proxy
└── callbacks/
    └── manager.py               # CallbackManagerImpl — event fan-out

AITracer wraps a TracerProtocol (from lexigram-monitor) and exposes domain-specific span methods. Each span carries standardised attributes for LLM provider, model, token counts, latency, and cost.

| Method | Span Name | Key Attributes |
| --- | --- | --- |
| trace_llm_call() | llm.{provider}.{model} | llm.provider, llm.model, operation.type |
| trace_vector_operation() | vector.{operation}.{provider} | vector.operation, vector.provider, vector.collection |
| trace_embedding_operation() | embedding.{model} | embedding.model, embedding.batch_size |
| trace_rag_stage() | rag.{stage} | rag.stage, rag.pipeline |
| trace_rag_query() | rag.query | rag.query, rag.pipeline |

sequenceDiagram
    participant Caller as Application Code
    participant T as AITracer
    participant TracerP as TracerProtocol (Monitor)
    participant Span as Span
    participant Export as Export Backend

    Caller->>T: trace_llm_call("openai", "gpt-4")
    T->>TracerP: start_span(name, attributes)
    TracerP-->>Caller: Span context manager
    Caller->>Span: __enter__ → set_attribute("status", "success")
    Caller->>Span: __exit__ → end()
    Span->>Span: record attributes & events
    Span-->>TracerP: span end
    TracerP->>Export: export span
    Export-->>Export: Console / OTLP / Datadog
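
The span flow above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: `RecordingSpan`, `RecordingTracer`, and `SketchAITracer` are hypothetical stand-ins for SpanProtocol, TracerProtocol, and AITracer, and the `operation` parameter and the `status` attribute values are assumptions. Only the span name pattern and the three attribute keys come from the table above.

```python
from contextlib import contextmanager
from typing import Any, Iterator


class RecordingSpan:
    """In-memory span used only for this sketch (stand-in for SpanProtocol)."""

    def __init__(self, name: str, attributes: dict[str, Any]) -> None:
        self.name = name
        self.attributes = dict(attributes)
        self.ended = False

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value

    def end(self) -> None:
        self.ended = True


class RecordingTracer:
    """Hypothetical TracerProtocol implementation that keeps spans in a list."""

    def __init__(self) -> None:
        self.spans: list[RecordingSpan] = []

    def start_span(self, name: str, attributes: dict[str, Any]) -> RecordingSpan:
        span = RecordingSpan(name, attributes)
        self.spans.append(span)
        return span


class SketchAITracer:
    """Illustrative domain tracer; the real AITracer may differ."""

    def __init__(self, tracer: RecordingTracer) -> None:
        self._tracer = tracer

    @contextmanager
    def trace_llm_call(
        self, provider: str, model: str, operation: str = "completion"
    ) -> Iterator[RecordingSpan]:
        # Span name and attribute keys follow the table above.
        span = self._tracer.start_span(
            f"llm.{provider}.{model}",
            {"llm.provider": provider, "llm.model": model, "operation.type": operation},
        )
        try:
            yield span
        except Exception:
            span.set_attribute("status", "error")
            raise
        finally:
            span.end()  # the span always ends, even if the body raised


backend = RecordingTracer()
tracer = SketchAITracer(backend)
with tracer.trace_llm_call("openai", "gpt-4") as span:
    span.set_attribute("status", "success")
```

Ending the span in `finally` mirrors the `__exit__ → end()` step in the diagram: the backend sees a finished span whether or not the wrapped call succeeded.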

AIMetrics registers all instruments against MetricsCollectorProtocol (from lexigram-monitor). The instruments (more than 30 in total) are grouped by domain:

| Domain | Instruments | Types |
| --- | --- | --- |
| LLM | llm_requests_total, llm_tokens_total, llm_duration_seconds, llm_cost_dollars, llm_active_requests | Counter, Histogram, Gauge |
| Vector | vector_operations_total, vector_duration_seconds, vector_documents_total, vector_collection_size | Counter, Histogram, Gauge |
| Embedding | embedding_operations_total, embedding_duration_seconds, embedding_batch_size, embedding_cache_hits/misses, embedding_cache_size | Counter, Histogram, Gauge |
| RAG | rag_queries_total, rag_duration_seconds, rag_documents_retrieved, rag_active_queries | Counter, Histogram, Gauge |
| Ingestion | document_ingestion_jobs_submitted/completed/failed, document_chunks_created, ingestion_workers_active | Counter, Gauge |
| Batch | batch_embedding_jobs_submitted/completed/failed, texts_processed, workers_active | Counter, Gauge |
| Maintenance | workers_active, tasks_completed/failed, task_duration_seconds | Counter, Histogram, Gauge |
| DLQ | items_total/added/retried/archived/deleted, workers_active, notifications_sent | Counter, Gauge |
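
To illustrate how a decorator such as @track_llm_call could feed two of the LLM instruments, here is a rough sketch. `InMemoryMetrics` and `track_llm_call_sketch` are hypothetical names, the label keys (`provider`, `model`, `status`) are assumptions, and the real decorator may well be async-aware. Only the instrument names `llm_requests_total` and `llm_duration_seconds` come from the table above.

```python
import functools
import time
from collections import defaultdict
from typing import Any, Callable


class InMemoryMetrics:
    """Hypothetical collector standing in for AIMetrics in this sketch."""

    def __init__(self) -> None:
        self.counters: dict[tuple, float] = defaultdict(float)
        self.observations: dict[str, list[float]] = defaultdict(list)

    def increment(self, name: str, labels: dict[str, str], amount: float = 1.0) -> None:
        # Sort labels so the same label set always maps to the same key.
        self.counters[(name, tuple(sorted(labels.items())))] += amount

    def observe(self, name: str, value: float) -> None:
        self.observations[name].append(value)


def track_llm_call_sketch(provider: str, model: str, metrics: InMemoryMetrics) -> Callable:
    """Count each call and record its duration, whether it succeeds or fails."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            status = "success"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                labels = {"provider": provider, "model": model, "status": status}
                metrics.increment("llm_requests_total", labels)
                metrics.observe("llm_duration_seconds", time.perf_counter() - start)

        return wrapper

    return decorator


metrics = InMemoryMetrics()


@track_llm_call_sketch("openai", "gpt-4", metrics)
def complete(prompt: str) -> str:
    return prompt.upper()  # placeholder for a real LLM call


result = complete("hello")
```

Recording in `finally` keeps the request counter and the duration histogram consistent: every call is counted exactly once, with the outcome carried as a label rather than as a separate instrument.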

Tracing and metrics are exported through lexigram-monitor, which provides an Exporter abstraction chain. Backends are injected at the monitor layer — lexigram-ai-observability never couples to a specific exporter.

| Backend | Tracing | Metrics | Configuration |
| --- | --- | --- | --- |
| Console | Yes | Yes | LEX_MONITOR__EXPORTER=console |
| OpenTelemetry (OTLP) | Yes | Yes | LEX_MONITOR__EXPORTER=otlp |
| Datadog (via OTLP) | Yes | Yes | Datadog OTLP endpoint config |
| Prometheus | No | Yes | LEX_MONITOR__EXPORTER=prometheus |
| Custom | Yes | Yes | Implement TracerProtocol / MetricsCollectorProtocol |

ObservabilityProvider (di/provider.py) handles three phases:

sequenceDiagram
    participant App as Application
    participant P as ObservabilityProvider
    participant C as Container

    Note over App,C: register() phase
    App->>P: register(container)
    P->>P: Check enabled flag
    P->>C: singleton(ObservabilityConfig)
    P->>C: singleton(AITracer)
    P->>C: singleton(AITracerProtocol → AITracer)
    P->>C: singleton(AIMetrics)
    P->>C: singleton(AIMetricsProtocol → AIMetrics)
    P->>C: singleton(AIHealthMonitor)
    P->>C: singleton(AIHealthMonitorProtocol → AIHealthMonitor)

    Note over App,C: boot() phase
    App->>P: boot(container)
    P->>C: resolve(AITracer)
    P->>C: resolve(AIMetrics)
    P->>C: resolve(AIAuditStoreProtocol) [optional]

    P->>C: resolve(LLMClientProtocol)
    P->>P: Wrap in ObservableLLMClient
    P->>C: singleton(LLMClientProtocol → ObservableLLMClient)

    P->>C: resolve(VectorStoreProtocol)
    P->>P: Wrap in ObservableVectorStore
    P->>C: singleton(VectorStoreProtocol → ObservableVectorStore)

    Note over App,C: shutdown() phase
    App->>P: shutdown()
    P->>P: No-op (handled by monitor layer)

Key design rule: During boot(), the provider re-registers the wrapped protocol instances under the same protocol key. Any code that already resolved LLMClientProtocol before boot keeps the raw client; code that resolves after boot gets the wrapped version. This is intentional — services that boot after ObservabilityProvider automatically receive instrumented proxies.
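
The before/after-boot behaviour can be reproduced with a toy container. Everything here is a hypothetical stand-in (`ToyContainer`, `LLMClientKey`, `RawClient`, `ObservableClient`, and the free function `boot`); the real container API is only known from the diagram above to expose `singleton` and `resolve`.

```python
from typing import Any


class ToyContainer:
    """Minimal stand-in for the DI container: one instance per key."""

    def __init__(self) -> None:
        self._singletons: dict[Any, Any] = {}

    def singleton(self, key: Any, instance: Any) -> None:
        self._singletons[key] = instance  # re-registering replaces the old instance

    def resolve(self, key: Any) -> Any:
        return self._singletons[key]


class LLMClientKey:
    """Placeholder for LLMClientProtocol, used only as a registration key."""


class RawClient:
    def complete(self, prompt: str) -> str:
        return f"echo:{prompt}"


class ObservableClient:
    """Proxy that counts calls before delegating to the raw client."""

    def __init__(self, inner: RawClient) -> None:
        self.inner = inner
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return self.inner.complete(prompt)


def boot(container: ToyContainer) -> None:
    # Resolve the raw client, wrap it, re-register under the same key.
    raw = container.resolve(LLMClientKey)
    container.singleton(LLMClientKey, ObservableClient(raw))


container = ToyContainer()
container.singleton(LLMClientKey, RawClient())

early = container.resolve(LLMClientKey)  # resolved before boot: raw client
boot(container)
late = container.resolve(LLMClientKey)   # resolved after boot: wrapped client
```

Calls made through `early` bypass the proxy entirely, which is why provider ordering matters: anything that must be instrumented should resolve its client after the observability provider has booted.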


| Protocol | Source | Consumed By | Role |
| --- | --- | --- | --- |
| AITracerProtocol | lexigram.contracts.observability.ai | AITracer | AI tracing API |
| AIMetricsProtocol | lexigram.contracts.observability.ai | AIMetrics | AI metrics API |
| AIHealthMonitorProtocol | lexigram.contracts.observability.ai | AIHealthMonitor | AI health check API |
| ObservabilityProtocol | lexigram.contracts.observability.ai | — (composite) | Combined observability |
| TracerProtocol | lexigram.contracts.observability.tracing | AITracer (injected) | Span creation & context |
| SpanProtocol | lexigram.contracts.observability.tracing | AITracer (returned) | Span attribute/event API |
| MetricsCollectorProtocol | lexigram.contracts.observability.metrics | AIMetrics (injected) | Instrument creation |
| LLMClientProtocol | lexigram.contracts.ai | ObservableLLMClient (wrapped) | Proxied LLM calls |
| VectorStoreProtocol | lexigram.contracts.data.vector.protocols | ObservableVectorStore (wrapped) | Proxied vector ops |
| AIAuditStoreProtocol | lexigram.contracts.ai.governance | ObservableLLMClient (optional) | Audit event emission |
| CallbackHandlerProtocol | lexigram.contracts.ai.callbacks | AITracer, CallbackManagerImpl | Observe callbacks |

| Point | Mechanism | Example |
| --- | --- | --- |
| Custom trace backend | Implement TracerProtocol, register in container | OpenTelemetry SDK, Jaeger |
| Custom metrics backend | Implement MetricsCollectorProtocol, register in container | Prometheus, StatsD, Datadog |
| Custom span processor | Provide a span-processor to TracerProtocol during monitor setup | Attribute redaction, sampling |
| Custom metric collector | Provide a MetricsBackendProtocol to MetricsCollectorProtocol | CloudWatch, InfluxDB |
| Health check registrar | Call AIHealthMonitor.add_llm_check() / add_vector_check() | Custom provider ping |
| Decorator-based tracing | @trace_llm(provider, model, tracer) | Application orchestration code |
| Decorator-based metrics | @track_llm_call(provider, model, metrics) | Batch processing pipelines |
| Lifecycle hooks | Subscribe to LLMCallTracedHook / HealthCheckRunHook | Alerting, compliance logging |
| Callback handlers | Implement CallbackHandlerProtocol, register via CallbackManagerImpl | Custom event processing |
| New wrappable protocol | Create wrappers/observable_*.py, register in ObservabilityProvider.boot() | Hypothetical EmbeddingClientProtocol |

Adding a new wrappable protocol takes three steps:

  1. Create an Observable*Client wrapper in wrappers/ that delegates to the raw protocol with tracing/metrics injection.
  2. In ObservabilityProvider.boot(), resolve the protocol from the container and wrap it.
  3. Re-register the wrapped instance under the same protocol key via container.singleton().
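
As a rough sketch of step 1, the wrapper below instruments one method and forwards everything else untouched. `EmbeddingClientProtocol` is the hypothetical protocol named in the table, and `FakeEmbeddingClient`, the `embed` signature, the `model_name` attribute, and the `events` list (a stand-in for injected tracer and metrics objects) are all assumptions made for illustration. Steps 2 and 3 are omitted.

```python
import time
from typing import Any, Protocol


class EmbeddingClientProtocol(Protocol):
    """Hypothetical protocol; the source only names it as an example."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class FakeEmbeddingClient:
    """Toy raw client: one-dimensional 'embeddings' based on text length."""

    model_name = "fake-embedder"

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(text))] for text in texts]


class ObservableEmbeddingClient:
    """Delegates to the raw client and records one event per call."""

    def __init__(self, inner: EmbeddingClientProtocol, events: list[dict[str, Any]]) -> None:
        self._inner = inner
        self._events = events  # stand-in for tracer/metrics injection

    def embed(self, texts: list[str]) -> list[list[float]]:
        start = time.perf_counter()
        status = "success"
        try:
            return self._inner.embed(texts)
        except Exception:
            status = "error"
            raise
        finally:
            self._events.append(
                {
                    "span": f"embedding.{getattr(self._inner, 'model_name', 'unknown')}",
                    "embedding.batch_size": len(texts),
                    "status": status,
                    "duration_seconds": time.perf_counter() - start,
                }
            )

    def __getattr__(self, name: str) -> Any:
        # Only invoked when normal lookup fails, so non-instrumented
        # attributes and methods pass straight through to the raw client.
        return getattr(self._inner, name)


events: list[dict[str, Any]] = []
client = ObservableEmbeddingClient(FakeEmbeddingClient(), events)
vectors = client.embed(["a", "bcd"])
```

Forwarding unknown attributes through `__getattr__` keeps the proxy transparent: callers that depend on methods the wrapper does not instrument continue to work after the wrapped instance replaces the raw one.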

lexigram/ai/observability/module.py
@module()
class ObservabilityModule(Module):
    @classmethod
    def configure(cls, config: ObservabilityConfig | dict | None = None) -> DynamicModule:
        # The config argument is elided in the source.
        provider = ObservabilityProvider(config=...)
        return DynamicModule(
            module=cls,
            providers=[provider],
            # Only the protocol is exported, not the AITracer implementation.
            exports=[AITracerProtocol],
        )

The module exports AITracerProtocol so that dependent services can inject the tracer without importing the implementation package.


AIError (contracts)
└── ObservabilityError       # LEX_ERR_OBS_001 — base for this package
    ├── TracingError         # LEX_ERR_OBS_004 — span/trace operations
    ├── MetricsError         # LEX_ERR_OBS_003 — metric recording operations
    └── HealthCheckError     # LEX_ERR_OBS_002 — health check failures

TracingError, MetricsError, and HealthCheckError are leaf-level; none of them has subclasses. Callers either catch ObservabilityError to handle any observability failure, or let it propagate as an infrastructure error. In both cases the system should degrade gracefully when monitoring is unavailable.
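
One way to read "degrade gracefully" is that a failed span or metric write should never discard the result of the AI operation itself. The sketch below assumes that interpretation. The exception classes are local stand-ins that mirror the hierarchy above, and `run_with_best_effort_tracing` and `broken_recorder` are hypothetical helpers, not part of the package.

```python
import logging

logger = logging.getLogger("observability_sketch")


class AIError(Exception):
    """Stand-in for the contracts-level AIError base."""


class ObservabilityError(AIError):
    """Base for this package (LEX_ERR_OBS_001 in the source)."""


class TracingError(ObservabilityError):
    """Span/trace failures (LEX_ERR_OBS_004 in the source)."""


def run_with_best_effort_tracing(operation, record_span):
    """Run the operation, then try to record it; tolerate observability failures."""
    result = operation()
    try:
        record_span(result)
    except ObservabilityError as exc:
        # Catching the base class covers tracing, metrics, and health errors alike.
        logger.warning("observability degraded: %s", exc)
    return result


def broken_recorder(_result):
    raise TracingError("exporter unreachable")


answer = run_with_best_effort_tracing(lambda: "42 tokens", broken_recorder)
```

Errors outside the ObservabilityError hierarchy are deliberately not caught here, so genuine bugs in the recording path still surface.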


Defined in constants.py:

| Symbol | Value | Description |
| --- | --- | --- |
| ENV_PREFIX | LEX_AI_OBSERVABILITY__ | Env var prefix for config overrides |
| METRIC_PREFIX_LLM | lexigram.ai.llm | Metric namespace for LLM ops |
| METRIC_PREFIX_VECTOR | lexigram.ai.vector | Metric namespace for vector ops |
| SPAN_LLM_CALL | llm.call | Default LLM span name |
| SPAN_VECTOR_QUERY | vector.query | Default vector span name |
| SPAN_RAG_PIPELINE | rag.pipeline | Default RAG span name |
| DEFAULT_CHECK_INTERVAL | 30 | Default health check interval (s) |
| DEFAULT_CHECK_TIMEOUT | 5.0 | Default health check timeout (s) |

ObservabilityConfig is loaded from the ai_observability: key in application.yaml with LEX_AI_OBSERVABILITY__* env var overrides:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | True | Master on/off switch |
| metrics_enabled | bool | True | Enable metrics collection |
| tracing_enabled | bool | True | Enable distributed tracing |
| health_checks_enabled | bool | True | Enable AI component health checks |

In production, disabling tracing or metrics emits a ConfigIssue warning with remediation suggestions.
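
The layering of YAML values and environment overrides might look roughly like the sketch below. The four fields, their defaults, and the prefix come from the tables above. The rest is assumed: that an override variable is the prefix plus the upper-cased field name, the accepted truthy strings, and the `SketchObservabilityConfig` and `load_config` names. The framework's actual loader is not shown in the source.

```python
from dataclasses import dataclass, fields
from typing import Mapping

ENV_PREFIX = "LEX_AI_OBSERVABILITY__"


@dataclass(frozen=True)
class SketchObservabilityConfig:
    """Mirrors the four documented fields and their defaults."""

    enabled: bool = True
    metrics_enabled: bool = True
    tracing_enabled: bool = True
    health_checks_enabled: bool = True


def _parse_bool(raw: str) -> bool:
    # Assumed truthy spellings; the real loader may accept a different set.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(
    yaml_section: Mapping[str, bool], env: Mapping[str, str]
) -> SketchObservabilityConfig:
    """Start from the YAML section, then let env vars override field by field."""
    values = dict(yaml_section)
    for field in fields(SketchObservabilityConfig):
        env_key = ENV_PREFIX + field.name.upper()  # assumed naming scheme
        if env_key in env:
            values[field.name] = _parse_bool(env[env_key])
    return SketchObservabilityConfig(**values)


config = load_config(
    {"metrics_enabled": False},
    {"LEX_AI_OBSERVABILITY__TRACING_ENABLED": "false"},
)
```

Passing the environment in as a mapping, rather than reading `os.environ` directly, keeps the sketch easy to test; a caller would pass `os.environ` in practice.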