Troubleshooting
WorkerError: Worker not starting
Error: Log shows worker_resolve_failed for a worker type.
Cause: The worker’s constructor dependencies are not registered in the DI container.
Fix: Ensure that VectorStoreProtocol, TaskQueueProtocol, and EmbeddingProvider implementations are registered before the worker provider boots.
    container.singleton(VectorStoreProtocol, PgVectorStore(config=...))
    container.singleton(TaskQueueProtocol, RedisTaskQueue(url=...))

RuntimeError: Failed to enqueue embedding job
Error: RuntimeError: Failed to enqueue embedding job: ...
Cause: TaskQueueProtocol.enqueue() returned Err(...).
Fix: Check that the task queue backend (Redis, RabbitMQ, Postgres) is running and reachable.
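A quick way to rule out a network problem is to probe the backend's port directly. The sketch below uses only the Python standard library; the helper name `backend_reachable` and the host/port values are illustrative assumptions, not part of the workers package. A successful TCP connection shows only that something is listening, not that the service is healthy.

```python
import socket


def backend_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for troubleshooting; not part of the library.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Default ports: Redis 6379, RabbitMQ (AMQP) 5672, Postgres 5432.
    # "localhost" is an assumption; substitute your backend's host.
    for name, port in [("redis", 6379), ("rabbitmq", 5672), ("postgres", 5432)]:
        print(name, backend_reachable("localhost", port))
```

If the probe succeeds but enqueue still fails, the cause is more likely credentials, a wrong database or vhost, or a backend-side error, which the Err value should describe.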
    result = await queue.enqueue(job_data)
    if result.is_err():
        logger.error("enqueue_failed", error=str(result.unwrap_err()))

MaintenanceTask never runs
Cause: Either enable_maintenance is set to false in the config, or the task was registered without a schedule.
Fix: Verify config and ensure each task has interval_seconds or schedule_cron.
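Both causes can be checked in one pass before the workers boot. The following is a minimal sketch that assumes the config and the task definitions are available as plain dictionaries; that shape and the function name `maintenance_problems` are hypothetical and chosen only for illustration.

```python
from typing import Any, List, Mapping, Sequence


def maintenance_problems(
    config: Mapping[str, Any],
    tasks: Sequence[Mapping[str, Any]],
) -> List[str]:
    """Return human-readable reasons why maintenance tasks would not run.

    Hypothetical pre-flight check; the dict-based shapes are assumptions.
    """
    problems: List[str] = []
    # Flag only an explicit False; the library's default is not assumed here.
    if config.get("enable_maintenance") is False:
        problems.append("enable_maintenance is false")
    for task in tasks:
        has_interval = task.get("interval_seconds") is not None
        has_cron = bool(task.get("schedule_cron"))
        if not (has_interval or has_cron):
            problems.append(f"task {task['name']!r} has no schedule")
    return problems
```

An empty list means neither cause applies to the inputs that were checked.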
    maint.register_task(
        name="cleanup",
        task_type=MaintenanceTaskType.CACHE_CLEANUP,
        handler=cleanup_fn,
        interval_seconds=3600,  # Required unless schedule_cron is set
    )

DeadLetterQueue stops processing
Error: No retries are happening for items in the DLQ.
Cause 1: dlq_check_interval is too high (default 60s). The retry check runs in a background loop at this interval, so items wait until the next pass.
Cause 2: Items have failure_category == PERMANENT; these are never retried.
Cause 3: retry_count >= max_retries (default 5).
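Causes 2 and 3 are per-item conditions, and it can help to see how they might combine into a single retry decision. The model below is an illustrative sketch, not the library's DLQItem: the class names `SketchDLQItem` and `FailureCategory`, the `TRANSIENT` member, and the `skip_reason` helper are all assumptions. Only the PERMANENT category and the default of 5 retries come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    TRANSIENT = "transient"  # hypothetical member, for illustration only
    PERMANENT = "permanent"


@dataclass
class SketchDLQItem:
    """Hypothetical stand-in for a dead-letter item."""

    failure_category: FailureCategory
    retry_count: int = 0
    max_retries: int = 5  # default stated in Cause 3

    def skip_reason(self) -> Optional[str]:
        """Return why the item would be skipped, or None if it is retryable."""
        if self.failure_category is FailureCategory.PERMANENT:
            return "permanent failure"
        if self.retry_count >= self.max_retries:
            return f"retries exhausted ({self.retry_count}/{self.max_retries})"
        return None

    def can_retry(self) -> bool:
        # Retryable only when neither skip condition applies.
        return self.skip_reason() is None
```

If an item is retryable under both conditions and still is not processed, the check interval from Cause 1 is the remaining suspect.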
Fix: Check DLQItem.can_retry() to understand why an item is skipped.
    item = dlq._items.get("job-id")
    if item:
        print(f"Can retry: {item.can_retry()}, retries: {item.retry_count}/{item.max_retries}")

BatchEmbeddingProgress shows 0% forever
Cause: The BatchEmbeddingWorker could not resolve a TaskWorkerProtocol from the container, so _handle_batch_embed never executes.
Fix: Register a TaskWorkerProtocol implementation before booting the workers provider.
    container.singleton(TaskWorkerProtocol, RedisTaskWorker)

Document ingestion fails silently
Cause: The DocumentParser (default UniversalDocumentParser) cannot parse the file format.
Fix: Pass a custom parser that handles the file format.
    from pathlib import Path

    from lexigram.ai.workers.document_ingestion.parser import DocumentParser

    class PDFParser(DocumentParser):
        async def parse(self, path: Path) -> list[ChunkProtocol]:
            ...

    worker = DocumentIngestionWorker(
        vector_store=store,
        queue=queue,
        document_parser=PDFParser(),
    )

Debug Tips
- Set LOG_LEVEL=DEBUG to see worker lifecycle events.
- Check worker.health_check() for running state.
- Use worker.get_stats() to inspect progress and history.
- Run WorkersModule.stub() in tests to avoid background task interference.