Architecture
Design Goals
- One workflow model for local and distributed execution.
- Simple Python-first API (phrase, @lang.verb, lang.say).
- Optional persistence and observability without forcing heavy infrastructure in unit tests.
- Predictable transport semantics for non-JSON objects (Corpus, Reference, Pydantic models, bytes, Path).
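
The transport goal above can be illustrated with a tagged-envelope encoding. What follows is a minimal sketch using only the standard library. The names `TAG`, `encode_value`, and `decode_value` are hypothetical and chosen for illustration; they are not lingo's actual wire format or its serialize_value API, and Corpus, Reference, and Pydantic models are left out.

```python
import base64
import json
from pathlib import Path
from typing import Any

# Hypothetical tag key; lingo's real envelope format may differ.
TAG = "__type__"


def encode_value(value: Any) -> Any:
    """Turn a value into a JSON-safe structure, tagging non-JSON types."""
    if isinstance(value, bytes):
        return {TAG: "bytes", "data": base64.b64encode(value).decode("ascii")}
    if isinstance(value, Path):
        return {TAG: "path", "data": value.as_posix()}
    if isinstance(value, dict):
        return {str(k): encode_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_value(v) for v in value]
    return value  # str, int, float, bool, None pass through unchanged


def decode_value(value: Any) -> Any:
    """Reverse encode_value by inspecting the tag on each mapping."""
    if isinstance(value, dict):
        if value.get(TAG) == "bytes":
            return base64.b64decode(value["data"])
        if value.get(TAG) == "path":
            return Path(value["data"])
        return {k: decode_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_value(v) for v in value]
    return value


payload = {"blob": b"\x00\xff", "where": Path("data/input.txt"), "n": 3}
wire = json.dumps(encode_value(payload))  # plain JSON text, safe to send
restored = decode_value(json.loads(wire))
```

Keeping the tagging rules in one pair of functions is what makes the round trip predictable. Note that this sketch turns tuples into lists and would misread a user dict that happens to contain the tag key, so a real protocol needs an escape rule for that case.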
Layered Model
- Definition Layer: lingo.phrase. Defines lazy task graphs.
- Runtime Layer: lingo.celery.language.CeleryLanguage. Dispatches graphs in local/distributed modes and tracks retries.
- Transport Layer: lingo.celery.channel and lingo.celery.protocol. Encodes envelope metadata and serializes values safely across process boundaries.
- Persistence Layer: lingo.storage.mongo.MongoJournal. Stores status, events, retry history, DAG metadata, and worker capabilities.
- Artifact Layer: lingo.bucket.archive.MinioArchive. Handles larger payload/object references and signed URLs.
- Operator Layer: lingo.cli and lingo.moderator. CLI workflows and test orchestration helpers.
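
One way to read these layer boundaries is as narrow interfaces that the runtime depends on, which is what would let persistence stay optional in unit tests. The sketch below illustrates that idea only. `JournalLike`, `InMemoryJournal`, and `TinyRuntime` are hypothetical names and do not reflect the real MongoJournal or CeleryLanguage signatures.

```python
from typing import Any, Callable, Dict, List, Optional, Protocol


class JournalLike(Protocol):
    """Hypothetical persistence seam: anything that can record a status."""

    def record(self, job_id: str, status: str) -> None: ...


class InMemoryJournal:
    """Test stand-in that keeps status history in a dict, not a database."""

    def __init__(self) -> None:
        self.history: Dict[str, List[str]] = {}

    def record(self, job_id: str, status: str) -> None:
        self.history.setdefault(job_id, []).append(status)


class TinyRuntime:
    """Hypothetical runtime layer; the journal is optional by design."""

    def __init__(self, journal: Optional[JournalLike] = None) -> None:
        self.journal = journal

    def run(self, job_id: str, task: Callable[[], Any]) -> Any:
        self._note(job_id, "started")
        try:
            result = task()
        except Exception:
            self._note(job_id, "failed")
            raise
        self._note(job_id, "succeeded")
        return result

    def _note(self, job_id: str, status: str) -> None:
        # Skip persistence entirely when no journal was supplied.
        if self.journal is not None:
            self.journal.record(job_id, status)


journal = InMemoryJournal()
with_journal = TinyRuntime(journal).run("job-1", lambda: 2 + 2)
without_journal = TinyRuntime().run("job-2", lambda: "ok")
```

Because the runtime only needs an object with a `record` method, a test can pass the in-memory stand-in or nothing at all, and the same task code runs in both cases.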
Dispatch Lifecycle (High Level)
- Build the phrase DAG using phrase(...).then(...).
- lang.say(...) assigns a job id, persists initial state, and chooses the local or distributed route.
- The payload is serialized using serialize_value/serialize_phrase.
- The worker executes tasks and emits timeline events.
- The journal records status transitions and derived progress.
- Callers query status/result/progress via the Journal APIs.
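
The lifecycle above can be sketched end to end in a single process. This is a toy model under stated assumptions: `Phrase`, `say`, `status`, `progress`, the status strings, and the event tuples are hypothetical stand-ins and not lingo's real API. The distributed route, serialization, and retries are omitted.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Phrase:
    """Hypothetical lazy chain: records steps, runs nothing until dispatched."""

    steps: List[Callable[[Any], Any]] = field(default_factory=list)

    def then(self, fn: Callable[[Any], Any]) -> "Phrase":
        # Return a new Phrase so earlier chains stay reusable.
        return Phrase(self.steps + [fn])


# Hypothetical in-memory journal: job id -> ordered (event, detail) entries.
JOURNAL: Dict[str, List[Tuple[str, Any]]] = {}


def say(phrase: Phrase, value: Any) -> str:
    """Assign a job id, persist initial state, then execute locally."""
    job_id = uuid.uuid4().hex
    events = JOURNAL.setdefault(job_id, [])
    events.append(("pending", None))
    events.append(("running", None))
    try:
        for index, step in enumerate(phrase.steps):
            value = step(value)
            events.append(("step_done", index))
    except Exception as exc:
        events.append(("failed", repr(exc)))
        return job_id
    events.append(("succeeded", value))
    return job_id


def status(job_id: str) -> str:
    """Latest status is the last non-step event in the timeline."""
    return [name for name, _ in JOURNAL[job_id] if name != "step_done"][-1]


def progress(job_id: str, total_steps: int) -> float:
    """Derived progress: completed steps divided by total steps."""
    done = sum(1 for name, _ in JOURNAL[job_id] if name == "step_done")
    return done / total_steps if total_steps else 1.0


pipeline = Phrase().then(lambda x: x + 1).then(lambda x: x * 10)
job = say(pipeline, 4)
```

Here `status(job)` and `progress(job, 2)` are both derived from the recorded timeline rather than stored separately, which mirrors the idea of the journal recording transitions and callers querying them afterwards.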
Why This Matters for AI Assistants
- The boundaries between phrase definition, runtime dispatch, and persistence are explicit.
- Event names and status transitions are stable anchors for generated tests.
- Serialization protocol is centralized, reducing accidental wire incompatibilities.