Architecture

Design Goals

  • One workflow model for local and distributed execution.
  • Simple Python-first API (phrase, @lang.verb, lang.say).
  • Optional persistence and observability without forcing heavy infrastructure in unit tests.
  • Predictable transport semantics for non-JSON objects (Corpus, Reference, Pydantic models, bytes, Path).
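As an illustration of the last goal, here is a minimal sketch of one way to give bytes and Path values a predictable JSON-safe form. It is not lingo's actual wire protocol: the names encode_value, decode_value, and the "__type__" tag are hypothetical, and Corpus, Reference, and Pydantic models are left out.

```python
import base64
import json
from pathlib import Path

TYPE_KEY = "__type__"  # hypothetical tag name, not taken from lingo


def encode_value(value):
    """Recursively convert a value into a JSON-safe structure."""
    if isinstance(value, bytes):
        return {TYPE_KEY: "bytes", "data": base64.b64encode(value).decode("ascii")}
    if isinstance(value, Path):
        return {TYPE_KEY: "path", "data": value.as_posix()}
    if isinstance(value, dict):
        return {str(k): encode_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_value(v) for v in value]
    return value  # assumed to be JSON-native already (str, int, float, bool, None)


def decode_value(value):
    """Reverse encode_value for the tagged types it knows about."""
    if isinstance(value, dict):
        tag = value.get(TYPE_KEY)
        if tag == "bytes":
            return base64.b64decode(value["data"])
        if tag == "path":
            return Path(value["data"])
        return {k: decode_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_value(v) for v in value]
    return value


payload = {"blob": b"\x00\x01", "where": Path("data/in.txt"), "n": 3}
wire = json.dumps(encode_value(payload))  # plain JSON text, safe to send
restored = decode_value(json.loads(wire))
```

In this sketch tuples come back as lists, which is the kind of edge case a centralized protocol has to settle once rather than per call site.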

Layered Model

  1. Definition Layer (lingo.phrase): defines lazy task graphs.
  2. Runtime Layer (lingo.celery.language.CeleryLanguage): dispatches graphs in local/distributed modes and tracks retries.
  3. Transport Layer (lingo.celery.channel and lingo.celery.protocol): encodes envelope metadata and serializes values safely across process boundaries.
  4. Persistence Layer (lingo.storage.mongo.MongoJournal): stores status, events, retry history, DAG metadata, and worker capabilities.
  5. Artifact Layer (lingo.bucket.archive.MinioArchive): handles larger payload/object references and signed URLs.
  6. Operator Layer (lingo.cli and lingo.moderator): provides CLI workflows and test orchestration helpers.
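To make the idea of a lazy task graph in the Definition Layer concrete, the sketch below shows a linear chain that records steps without running them. LazyChain and its methods are hypothetical stand-ins written for this page; they are not the lingo.phrase API, and a real graph may branch rather than stay linear.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class LazyChain:
    """Hypothetical lazy chain: building it never calls the steps."""

    steps: Tuple[Callable[[Any], Any], ...] = ()

    def then(self, fn: Callable[[Any], Any]) -> "LazyChain":
        # Return a new chain so partially built chains can be reused.
        return LazyChain(self.steps + (fn,))

    def run(self, value: Any) -> Any:
        # Execution happens only here, in definition order.
        for fn in self.steps:
            value = fn(value)
        return value


calls = []  # records which steps actually executed


def tokenize(text):
    calls.append("tokenize")
    return text.split()


def count(tokens):
    calls.append("count")
    return len(tokens)


chain = LazyChain().then(tokenize).then(count)  # nothing has run yet
```

Keeping definition separate from execution is what lets a runtime layer decide later where and how the steps run.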

Dispatch Lifecycle (High Level)

  1. Build the phrase DAG using phrase(...).then(...).
  2. lang.say(...) assigns a job id, persists the initial state, and chooses the local or distributed route.
  3. The payload is serialized using serialize_value/serialize_phrase.
  4. Worker executes tasks and emits timeline events.
  5. Journal records status transitions and derived progress.
  6. Callers query status/result/progress via Journal APIs.
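The lifecycle above can be sketched with an in-memory toy. Everything here is an assumption made for illustration: InMemoryJournal, say, the event names, and the status strings are hypothetical and do not reflect the real MongoJournal or CeleryLanguage interfaces. Serialization is skipped, and both routes run in-process; the sketch only records the routing decision.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryJournal:
    """Hypothetical journal keeping status and events in dictionaries."""

    status: dict = field(default_factory=dict)
    events: dict = field(default_factory=dict)

    def record(self, job_id, event, status=None):
        self.events.setdefault(job_id, []).append(event)
        if status is not None:
            self.status[job_id] = status

    def progress(self, job_id, total_steps):
        # Progress is derived from events rather than stored separately.
        done = sum(
            1 for e in self.events.get(job_id, []) if e.startswith("step_done")
        )
        return done / total_steps if total_steps else 1.0


def say(steps, value, journal, distributed=False):
    """Assign a job id, persist state, pick a route, then run the steps."""
    job_id = uuid.uuid4().hex
    journal.record(job_id, "job_created", status="PENDING")
    route = "distributed" if distributed else "local"
    journal.record(job_id, f"routed:{route}", status="RUNNING")
    for index, step in enumerate(steps):
        value = step(value)
        journal.record(job_id, f"step_done:{index}")
    journal.record(job_id, "job_finished", status="SUCCESS")
    return job_id, value


journal = InMemoryJournal()
steps = [lambda x: x + 1, lambda x: x * 2]
job_id, result = say(steps, 3, journal)
```

Afterwards a caller can read journal.status[job_id] and journal.progress(job_id, len(steps)) without touching the worker, which mirrors step 6.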

Why This Matters for AI Assistants

  • The boundaries between phrase definition, runtime dispatch, and persistence are explicit.
  • Event names and status transitions are stable anchors for generated tests.
  • The serialization protocol is centralized, which reduces accidental wire incompatibilities.