Full-Text Search
lexigram-search provides async full-text search behind a single protocol. Application code depends on SearchEngineProtocol; the backend (Meilisearch, Elasticsearch, Typesense, Postgres, MySQL, SQLite, MongoDB, or in-memory) is chosen in configuration. The same indexing and query code runs on a zero-dependency SQLite FTS5 backend in development and on a managed Elasticsearch cluster in production.
For the full configuration reference and backend matrix, see the lexigram-search package docs.
1. The Contract
All backends implement SearchEngineProtocol from lexigram-contracts. The protocol covers indexing, bulk indexing, query execution, deletion, and health:
    from typing import Any, Protocol, runtime_checkable

    from lexigram.contracts.core import HealthCheckResult
    from lexigram.contracts.data import QueryResult


    @runtime_checkable
    class SearchEngineProtocol(Protocol):
        async def index_document(
            self,
            document_id: str,
            document: dict[str, Any],
            index_name: str | None = None,
        ) -> None: ...

        async def index_many(
            self,
            documents: list[tuple[str, dict[str, Any]]],
            index_name: str | None = None,
        ) -> None: ...

        async def search(
            self,
            query: str,
            filters: dict[str, Any] | None = None,
            sort: list[dict[str, str]] | None = None,
            limit: int | None = None,
            offset: int | None = None,
        ) -> QueryResult: ...

        async def delete_document(
            self,
            document_id: str,
            index_name: str | None = None,
        ) -> None: ...

        async def health_check(self, timeout: float = 5.0) -> HealthCheckResult: ...

Services depend on the protocol, never on a concrete backend:
graph LR
Svc[Your Service] -- depends on --> P[SearchEngineProtocol]
P -- resolved to --> Meili[MeiliSearchBackend]
P -- or --> ES[ElasticsearchBackend]
P -- or --> TS[TypesenseBackend]
P -- or --> SQL[SQLite / Postgres / MySQL FTS]
2. Configuration
Register SearchModule and configure the search: block. SearchModule.configure() requires an explicit SearchConfig; use SearchModule.stub() for tests.
    from lexigram import Application
    from lexigram.di.module import Module, module
    from lexigram.search import SearchModule, SearchConfig


    @module(imports=[SearchModule.configure(SearchConfig())])
    class AppModule(Module):
        pass


    app = Application(modules=[AppModule])

    search:
      enabled: true
      backend_type: meilisearch   # meilisearch | elasticsearch | typesense
                                  # | postgres | mysql | sqlite | mongodb | memory
      timeout: 30.0
      meilisearch:
        url: "${MEILI_URL:http://localhost:7700}"
        api_key: "${MEILI_API_KEY}"
        searchable_attributes: [name, description, tags]
        filterable_attributes: [category, in_stock]
        sortable_attributes: [created_at, price]
      query:
        strategy: fuzzy   # fuzzy | exact | semantic | hybrid
        default_limit: 10
        max_limit: 100
        enable_faceting: true
      operations:
        bulk_chunk_size: 500

For local development the sqlite backend uses SQLite FTS5 with no external service:
    search:
      backend_type: sqlite
      sqlite:
        db_path: ":memory:"   # or a file path
        tokenizer: "porter unicode61"
        auto_create_tables: true

3. Indexing Documents
Inject SearchEngineProtocol and index documents as plain dicts. The document ID is separate from the document payload:
    from lexigram.contracts.search import SearchEngineProtocol
    from my_app.domain.models import Product


    class ProductIndexer:
        def __init__(self, search: SearchEngineProtocol) -> None:
            self._search = search

        async def index(self, product: Product) -> None:
            await self._search.index_document(
                document_id=product.id,
                document={
                    "id": product.id,
                    "name": product.name,
                    "description": product.description,
                    "tags": product.tags,
                    "category": product.category,
                    "price": product.price,
                    "in_stock": product.stock > 0,
                },
                index_name="products",
            )

        async def remove(self, product_id: str) -> None:
            await self._search.delete_document(product_id, index_name="products")

For bulk loads (reindex jobs, importers), call index_many once per chunk instead of looping over index_document. Backends route this to their native bulk API and respect operations.bulk_chunk_size:
    batch: list[tuple[str, dict[str, Any]]] = [
        (p.id, {"id": p.id, "name": p.name, "tags": p.tags})
        for p in products
    ]
    await self._search.index_many(batch, index_name="products")

4. Querying
Pass a free-text query plus optional filters, sort, and pagination. The result exposes hits, total count, and (when supported) facets:
    class ProductSearch:
        def __init__(self, search: SearchEngineProtocol) -> None:
            self._search = search

        async def find(self, term: str, category: str | None, page: int) -> dict:
            result = await self._search.search(
                query=term,
                filters={"in_stock": True, "category": category} if category else {"in_stock": True},
                sort=[{"price": "asc"}],
                limit=20,
                offset=(page - 1) * 20,
            )
            return {
                "hits": [r.data for r in result.results],
                "total": result.total,
                "took_ms": result.took_ms,
            }

SearchResponse (from lexigram.search.types) carries:
- results: list[SearchResult], each with id, score, data, and optional highlights
- total: int, the total number of matching documents (not just the returned page)
- page, per_page, query, took_ms
- facets: dict[str, Any] | None, populated when faceting is enabled and supported by the backend
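Since total counts every match rather than the returned page, the limit/offset arithmetic and the page count can be derived from it. The helpers below are a minimal sketch; page_window and total_pages are hypothetical names, and the clamp to 100 mirrors query.max_limit from the sample configuration rather than any documented library behaviour.

```python
import math


def page_window(page: int, per_page: int = 20, max_limit: int = 100) -> tuple[int, int]:
    """Translate a 1-based page number into (limit, offset) for search()."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Clamp client-side; whether the backend also clamps is not assumed here.
    limit = max(1, min(per_page, max_limit))
    offset = (page - 1) * limit
    return limit, offset


def total_pages(total: int, per_page: int) -> int:
    """Number of pages needed to show `total` matching documents."""
    if per_page < 1:
        raise ValueError("per_page must be positive")
    return math.ceil(total / per_page) if total > 0 else 0
```

For example, page_window(3) yields the same limit=20, offset=40 that the find method above computes inline for page 3.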
5. Facets
When query.enable_faceting is true and the backend supports facets (Meilisearch, Elasticsearch, Typesense), SearchResponse.facets is populated with bucket counts per declared filterable attribute:
    result = await self._search.search(query="laptop", limit=20)
    # result.facets → {"category": {"electronics": 42, "office": 7}, "in_stock": {"true": 38, "false": 11}}

Each backend has its own rules for which fields are facetable. For Meilisearch, list them under meilisearch.filterable_attributes. The SQL backends (sqlite, postgres, mysql) do not currently emit facets.
6. Multiple Backends
Declare backends: to register more than one search engine. The primary is bound to the unnamed SearchEngineProtocol; each entry is also bound under Named(entry.name):
    search:
      backends:
        - name: catalog
          primary: true
          backend_type: meilisearch
          meilisearch:
            url: "${MEILI_URL}"
        - name: audit
          backend_type: postgres
          database: audit_db

    from typing import Annotated

    from lexigram.contracts.search import SearchEngineProtocol
    from lexigram.di.markers import Named


    class AuditTrail:
        def __init__(
            self,
            catalog: SearchEngineProtocol,  # primary
            audit: Annotated[SearchEngineProtocol, Named("audit")],
        ) -> None: ...

The postgres and mysql backends resolve a named DatabaseProviderProtocol from the container at boot; see Database & Persistence for declaring those.
7. Testing
For unit tests, SearchModule.stub() wires an in-memory (null) backend that satisfies SearchEngineProtocol with no external service:
    from lexigram import Application
    from lexigram.search import SearchModule
    from lexigram.contracts.search import SearchEngineProtocol


    async def test_indexes_and_finds_product() -> None:
        async with Application.boot(modules=[SearchModule.stub()]) as app:
            search = await app.container.resolve(SearchEngineProtocol)
            await search.index_document("p1", {"name": "Laptop"}, index_name="products")
            result = await search.search(query="Laptop")
            assert result.total >= 0  # null backend is a no-op stub

For integration tests against real search semantics, prefer SearchModule.configure(SearchConfig(backend_type=BackendType.SQLITE)). SQLite FTS5 has no external dependency and exercises real tokenization and ranking.
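What "real tokenization" means for the porter unicode61 tokenizer can be seen by querying FTS5 directly through the standard-library sqlite3 module, independent of lexigram. This sketch assumes the interpreter's SQLite build includes FTS5 (most do); the table, rows, and fts5_search helper are made up for illustration.

```python
import sqlite3


def fts5_search(term: str) -> list[str]:
    """Index three sample rows in an in-memory FTS5 table and search them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE VIRTUAL TABLE products "
            "USING fts5(name, description, tokenize='porter unicode61')"
        )
        conn.executemany(
            "INSERT INTO products (name, description) VALUES (?, ?)",
            [
                ("Laptop stand", "Aluminium riser"),
                ("Gaming laptops", "High refresh rate"),
                ("Mechanical keyboard", "Tenkeyless layout"),
            ],
        )
        # ORDER BY rank sorts by FTS5's built-in relevance score (bm25 by default).
        rows = conn.execute(
            "SELECT name FROM products WHERE products MATCH ? ORDER BY rank",
            (term,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


# Porter stemming reduces "laptops" and "Laptop" to the same token.
matches = fts5_search("laptops")
```

A no-op stub cannot exercise this stemming behaviour, which is the reason to run integration tests against the sqlite backend.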
Next Steps
- Database & Persistence: the source of truth for reindex jobs and the home of the SQL-backed search options
- Dependency Injection: binding SearchEngineProtocol to a backend
- lexigram-search package: full backend matrix, analytics, suggestion engine, and federated search