et-agent — full system flow

Every agent, every model, every DB hop, every data shape. For the UK employment-law robo-lawyer at etagent.andyslab.uk plus its testing scaffold.

1. System topology

Three independent services in one tailnet, all on the same Hetzner host. Each has its own Postgres database. They communicate over HTTP.

┌─────────────────────────┐              ┌─────────────────────────┐
│ udcalc.unfair-          │              │ etagent.andyslab.uk     │
│ dismissal.uk (Vite)     │              │ Next.js + copilotkit    │
└───────────┬─────────────┘              └───────────┬─────────────┘
            │ HTTP (form submission)                 │ HTTP /chat, /docs, /case
            ▼                                        ▼
              ┌──────────────────────┐
              │ et-agent backend     │
              │ (FastAPI, uvicorn)   │
              │ systemd: et-agent-   │
              │ api.service          │
              └─┬────────────┬───────┘
                │            │
  HTTP          │            │ SQL,
  /search/cases/│            │ asyncpg
  reranked,     │            ▼
  /search/      │   ┌──────────────────┐
  statutes      │   │ Postgres 16      │
                │   │ db = et_agent    │
                │   │ 25 tables        │
                │   └──────────────────┘
                ▼
  ┌────────────────────────┐
  │ et-pipeline API        │
  │ FastAPI on :8055       │
  │ X-API-Key auth         │
  │ systemd: et-pipeline-  │
  │ api.service            │
  └─┬──────────────────────┘
    │ SQL + pgvector
    ▼
  ┌──────────────────────────┐
  │ Postgres 16              │
  │ db = et_pipeline         │
  │ judgments (164k rows)    │
  │ statute_sections (1.3k)  │
  │ citations (333k)         │
  │ HNSW indexes (voyage-4-  │
  │ large, 1024-dim)         │
  └──────────────────────────┘
| Component | Tech | Port | Public |
|---|---|---|---|
| udcalc calculator | Vite/React + PostgREST | form.unfair-dismissal.uk | yes |
| et-agent frontend | Next.js + copilotkit | etagent.andyslab.uk | yes |
| et-agent API | FastAPI/uvicorn | :8000 (tailnet) → caddy | via caddy |
| et-pipeline API | FastAPI | :8055 (loopback) + et-pipeline.andyslab.uk | API-key gated |
| et-pipeline workers | asyncio | internal | — |

2. Models in use

| Role | Model | Provider | Why |
|---|---|---|---|
| Drafter (et1, demand, lbc, sar, grievance, settlement) | claude-sonnet-4-5 | Anthropic direct | High-stakes legal prose. Long-context, citation discipline. |
| Critic (compliance, opposition, tribunal, tone, costs-trail, ACAS) | claude-haiku-4-5 | Anthropic direct | Critics run 3–4× per draft; need to be cheap. |
| Scorer (rubric-based 0–100) | claude-haiku-4-5 | Anthropic direct | Rubric eval is pattern-match; Haiku is enough. |
| Extractor (document_classifier, claim_extractor, inbound_*) | claude-haiku-4-5 | Anthropic direct | Structured JSON from semi-structured text. |
| Verifier (fact_checker, consistency, legal_authority, red_team_accuracy_judge) | claude-haiku-4-5 | Anthropic direct | Cross-references citations, dates, numbers; not generative. |
| Red-team simulator (strike_out, worst_case_award, opposing_counsel) | claude-sonnet-4-5 | Anthropic direct | Adversarial reasoning needs depth. |
| Case agent (router + chat) | claude-haiku-4-5 | Anthropic direct | Tool-calling + claimant-facing chat; speed matters. |
| Corpus embeddings (et-pipeline) | voyage-4-large (1024-dim) | Voyage AI | Best legal-text recall in current benchmarks. |
| Corpus reranker | voyage-rerank-2.5 | Voyage AI | Boosts precision on top-k from HNSW. |
| Corpus LLM extraction (legal field JSON) | gemini-2.0-flash | concentrate.ai | $0.0006/row for 164k judgments; bake-off winner vs Haiku at 12× cheaper. |
| Test-case generator (our new tier) | claude-haiku-4-5, gemini-2.5-flash, gpt-5-mini | concentrate.ai | Three families → narrative variance per seed lead. |
Tier strategy: Sonnet for synthesis, Haiku for everything else. Opus is reserved for ad-hoc strategic agents (not in normal artefact production). Model IDs are pinned in et_agent/config.py; production uses Anthropic direct, but each prompt-call goes through a thin wrapper that records token counts + cost into case_llm_spend.
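A minimal sketch of that wrapper, assuming a hypothetical helper (record_spend) and placeholder prices; in production the per-model rates would come from the model_pricing table:

import anthropic

client = anthropic.AsyncAnthropic()

# ($/M input tokens, $/M output tokens) — placeholders, not real rates
PRICING = {"claude-haiku-4-5": (1.00, 5.00), "claude-sonnet-4-5": (3.00, 15.00)}

async def call_prompt(case_id: str, prompt_name: str, model: str, messages: list[dict]) -> str:
    resp = await client.messages.create(model=model, max_tokens=4096, messages=messages)
    in_tok, out_tok = resp.usage.input_tokens, resp.usage.output_tokens
    rate_in, rate_out = PRICING[model]
    cost_usd = in_tok / 1e6 * rate_in + out_tok / 1e6 * rate_out
    # hypothetical helper: INSERT INTO case_llm_spend (...)
    await record_spend(case_id, prompt_name, model, in_tok, out_tok, cost_usd)
    return resp.content[0].text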

3. Ingestion: udcalc lead → et-agent case

A claimant fills the udcalc unfair-dismissal calculator. On submit, the lead lands in udcalc.leads (a separate database). When the claimant elects to engage et-agent for representation, a handoff is created.

Step-by-step

  1. Calculator submission — frontend POSTs to PostgREST /rest/v1/leads. Row created in udcalc.leads with structured fields (dates, pay, claim_types JSONB, claim_details JSONB, evidence_scores JSONB).
  2. Email handoff — claimant clicks "speak to a lawyer" → frontend POSTs to /functions/v1/sa-submit-lead (Deno). The Deno function calls Attio (CRM) and then et-agent's /intake/handoff endpoint.
  3. et-agent: POST /intake/handoff creates rows in:
    • udcalc_handoffs — raw payload + claim_types snapshot
    • cases — new case row, state=pre_engagement
    • claims — one row per claim_type, is_lead_claim set on the most-valuable one (heuristic: discrimination > whistleblowing > unfair dismissal)
  4. Engagement letter — operator (or claimant via portal) signs an engagement letter. State → engaged. A row in engagement_letters stores the signed PDF link + version.
  5. State → evidence_gathering — triggers the intake_completion ensemble.

Where ingestion talks to which DB

// HTTP
udcalc.uk frontend     →  PostgREST :5435  →  udcalc DB     (leads, sa_leads)
udcalc.uk frontend     →  Deno :8787       →  udcalc DB     (sa-submit-lead)
                                            →  Attio CRM       (deals/contacts)
                                            →  et-agent :8000  (handoff)
et-agent /intake/handoff →  asyncpg         →  et_agent DB  (cases, claims, udcalc_handoffs)
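A minimal sketch of step 3 above (POST /intake/handoff) — the payload shape, column names, and pool wiring are assumptions; only the tables written and the lead-claim heuristic come from the description:

import json
import asyncpg
from fastapi import FastAPI

app = FastAPI()
PRIORITY = ["discrimination", "whistleblowing", "unfair_dismissal"]  # most-valuable first

@app.post("/intake/handoff")
async def intake_handoff(payload: dict):
    pool: asyncpg.Pool = app.state.pool  # assumed to be set at startup
    async with pool.acquire() as conn, conn.transaction():
        case_id = await conn.fetchval(
            "INSERT INTO cases (state, udcalc_snapshot) VALUES ('pre_engagement', $1) RETURNING id",
            json.dumps(payload["snapshot"]),
        )
        await conn.execute(
            "INSERT INTO udcalc_handoffs (case_id, raw_payload, claim_types) VALUES ($1, $2, $3)",
            case_id, json.dumps(payload), json.dumps(payload["claim_types"]),
        )
        # heuristic from step 3: discrimination > whistleblowing > unfair dismissal
        lead = min(payload["claim_types"],
                   key=lambda c: PRIORITY.index(c) if c in PRIORITY else len(PRIORITY))
        for claim_type in payload["claim_types"]:
            await conn.execute(
                "INSERT INTO claims (case_id, claim_type, is_lead_claim) VALUES ($1, $2, $3)",
                case_id, claim_type, claim_type == lead,
            )
    return {"case_id": str(case_id)}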

4. Intake completion ensemble

Once a case is in evidence_gathering, three triggers fire intake_orchestrator.run_intake_completion(case_id):

  1. Engagement signed (one-shot).
  2. Claimant sends a chat message (copilotkit on_complete).
  3. Claimant uploads/links a document (/evidence/{id}/link).

The orchestrator throttles to one full run per minute per case, then loads the snapshot + history and fires the intake_completion ensemble.
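A sketch of that throttle, assuming an in-process map keyed by case (the real orchestrator may persist the timestamp instead):

import time

_last_run: dict[str, float] = {}  # case_id → monotonic time of last full run

def should_run(case_id: str, min_interval_s: float = 60.0) -> bool:
    now = time.monotonic()
    if now - _last_run.get(case_id, 0.0) < min_interval_s:
        return False  # ran within the last minute; skip this trigger
    _last_run[case_id] = now
    return True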

Ensemble: 4 prompts in sequence

cases row + claims + udcalc_snapshot
                 │
                 ▼
┌────────────────────────────────────────────┐
│ intake_gap_analyzer            [haiku-4-5] │
│ Reads snapshot, returns list of missing /  │
│ partial / wrong-format fields per claim.   │
└────────────────────────────────────────────┘
                 │
                 ▼
┌────────────────────────────────────────────┐
│ intake_readiness_gate          [haiku-4-5] │
│ READY / NOT_READY based on gaps and rules. │
│ Bypasses question planner if READY.        │
└────────────────────────────────────────────┘
     │ NOT_READY          │ READY → emit "intake complete"
     ▼
┌────────────────────────────────────────────┐
│ intake_question_planner        [haiku-4-5] │
│ Generates ≤3 questions for the claimant.   │
│ Reads questions_asked to avoid repeats.    │
└────────────────────────────────────────────┘
                 │ (claimant replies via chat)
                 ▼
┌────────────────────────────────────────────┐
│ intake_answer_integrator       [haiku-4-5] │
│ Merges claimant answer into                │
│ cases.udcalc_snapshot.gathered_facts.      │
└────────────────────────────────────────────┘
                 │
                 └─ loop until readiness_gate → READY

Inputs / outputs / DB touches

| Prompt | Input | Output | Writes to DB |
|---|---|---|---|
| intake_gap_analyzer | snapshot, claim_types, fact-contracts table | {gaps: [{field, status, claim_type}]} | — |
| intake_readiness_gate | gaps + rules | {readiness: READY \| NOT_READY, blockers: [...]} | cases.udcalc_snapshot.intake_status |
| intake_question_planner | gaps, asked_already_json, last 3 messages | {questions: [{text, target_field, why}]} | cases.udcalc_snapshot.questions_asked (append) |
| intake_answer_integrator | last claimant message, snapshot, target_field | {updates: {field: value, ...}, confidence} | cases.udcalc_snapshot.gathered_facts (merge) |

5. Document handling

Claimants upload PDFs (dismissal letters, grievance correspondence, contracts, payslips, medical letters). Each upload runs:

  1. POST /documents stores the file in object storage (R2), creates a documents row, status=uploaded.
  2. Text extraction — pdfplumber over the bytes. The extracted text is stored as documents.raw_text.
  3. Classification — the document_classifier prompt (haiku-4-5) returns {kind: dismissal_letter|grievance|appeal_outcome|contract|payslip|..., confidence}. Writes documents.kind.
  4. Kind-specific extraction:
    • document_extractor for generic — pulls {parties, dates, key_facts, references}
    • grievance_outcome_extractor — structured outcome JSON (upheld, partially_upheld, rejected, with reasons + actions_offered)
    • lbc_response_extractor — employer's reply to the LBC, extracts {stance, amount_gbp, conditions, time_to_respond}
    • sar_response_extractor — confirms compliance with the SAR + any redactions challenged
    • inbound_classifier for emails → then one of inbound_acknowledgement_extractor, inbound_counter_offer_extractor, or inbound_info_request_extractor
  5. Evidence linking — operator (or autonomous rule) links a document to one or more evidence_items against specific pleaded claims. The link triggers a re-run of intake_completion.

All extractor prompts return strict JSON. Anthropic's tool_use mode with a Pydantic-typed schema is the wire-level format.
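A sketch of that pattern using the Anthropic SDK's tools/tool_choice parameters; the DocumentKind model, tool name, and truncation length are illustrative:

import anthropic
from pydantic import BaseModel

class DocumentKind(BaseModel):
    kind: str          # dismissal_letter | grievance | appeal_outcome | contract | payslip | ...
    confidence: float

client = anthropic.Anthropic()

def classify(raw_text: str) -> DocumentKind:
    resp = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=512,
        tools=[{
            "name": "classify_document",
            "description": "Classify an uploaded claimant document.",
            "input_schema": DocumentKind.model_json_schema(),
        }],
        tool_choice={"type": "tool", "name": "classify_document"},  # force a schema-conformant tool call
        messages=[{"role": "user", "content": raw_text[:8000]}],
    )
    return DocumentKind.model_validate(resp.content[0].input)  # the tool_use block carries the JSON args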

6. Artefact production pipeline

This is the core loop. It runs for six artefact types (et1, lbc, demand_letter, settlement_letter, grievance, sar), each defined in src/et_agent/artefact_configs/*.yaml.

The full pipeline (et1 example)

case_snapshot (frozen view of case at run time)
   │
   │ fetch from et_pipeline via :8055:
   │ ┌──────────────────────────────────────────────┐
   │ │ POST /search/cases/reranked → 6 binding +    │
   │ │   first-instance precedents (XML block)      │
   │ │ POST /search/statutes → 6 sections           │
   │ │ GET /judgments/{id} → optional               │
   │ └──────────────────────────────────────────────┘
   │ + current_rates_block (statutory cap, Vento bands)
   ▼
┌─────────────────────────────────────────────────────────────┐
│ DRAFTER                                  claude-sonnet-4-5  │
│ et1_drafter (system) + et1_drafter_user (user template)     │
│ Sampling: parallel_n_pick_best, n=3                         │
│   • Three independent samples at temp=0.7                   │
│   • Best of three picked by drafter-self-rubric             │
└─────────────────────────────────────────────────────────────┘
   │
   │ draft_json = ET1Draft pydantic model
   │ writes draft preview into drafts table
   ▼
┌─────────────────────────────────────────────────────────────┐
│ CRITICS (parallel)                    claude-haiku-4-5 × N  │
│ et1_critic_compliance   (blocking=critical)                 │
│ et1_critic_opposition   (blocking=major)                    │
│ et1_critic_tribunal     (blocking=major)                    │
│ Each: {issues: [{severity, location, claim, fix}], score}   │
└─────────────────────────────────────────────────────────────┘
   │
   │ any critical issue? ── yes ──► re-draft with critique_block
   │ no
   ▼
┌─────────────────────────────────────────────────────────────┐
│ VERIFICATION GATE                        claude-haiku-4-5   │
│ fact_checker            — every factual claim ↔ snapshot    │
│ consistency_checker     — internal contradictions           │
│ legal_authority_checker — every citation ↔ et-pipeline      │
│ Blocking on fabricated cites or snapshot mismatch.          │
└─────────────────────────────────────────────────────────────┘
   │
   ▼
┌─────────────────────────────────────────────────────────────┐
│ RED-TEAM GATE (et1 only)                claude-sonnet-4-5   │
│ strike_out_simulator        — would the respondent get      │
│                               this struck out?              │
│ worst_case_award_calculator — defensible floor £            │
│ opposing_counsel_responder  — adversarial defence draft     │
│ red_team_accuracy_judge     — was the red-team valid?       │
└─────────────────────────────────────────────────────────────┘
   │
   ▼
┌─────────────────────────────────────────────────────────────┐
│ SCORER                                   claude-haiku-4-5   │
│ et1_scorer + et1_scorer_user + rubric_json                  │
│ Returns 0–100 with per-dimension breakdown                  │
└─────────────────────────────────────────────────────────────┘
   │
   ├─ score < soft_pass (60) → reject, re-run with new sampling
   ├─ score < hard_pass (85) → "soft pass", operator review
   └─ score ≥ hard_pass      → mark ready_to_send

The whole pipeline is orchestrated by services/produce_artefact.py::produce_artefact().
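A sketch of the parallel_n_pick_best strategy named above; sample_draft and self_score are stand-ins for the real drafter call and drafter-self-rubric:

import asyncio

async def parallel_n_pick_best(ctx: dict, n: int = 3, temperature: float = 0.7) -> dict:
    # n independent drafter samples at the same temperature
    drafts = await asyncio.gather(*(sample_draft(ctx, temperature) for _ in range(n)))
    # score each with the drafter-self-rubric, keep the best
    scores = await asyncio.gather(*(self_score(d) for d in drafts))
    best_score, best_draft = max(zip(scores, drafts), key=lambda pair: pair[0])
    return best_draft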

Sampling strategies (per artefact)

| Artefact | Sampling | Verifier gate | Red-team gate |
|---|---|---|---|
| et1 | parallel_n_pick_best (n=3) | standard | standard |
| lbc | single_shot | standard | standard |
| demand_letter | single_shot | none | none |
| settlement_letter | single_shot | standard | none |
| grievance | single_shot | standard | none |
| sar | single_shot | none | none |
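A plausible shape for one of these YAML configs — the field names are assumptions; only the values come from the tables above:

artefact: et1
drafter: et1_drafter
sampling:
  strategy: parallel_n_pick_best
  n: 3
  temperature: 0.7
critics:
  - {name: et1_critic_compliance, blocking: critical}
  - {name: et1_critic_opposition, blocking: major}
  - {name: et1_critic_tribunal, blocking: major}
verifier_gate: standard
red_team_gate: standard
scorer:
  name: et1_scorer
  soft_pass: 60
  hard_pass: 85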

7. Critics and scorer

Critics are not gatekeepers; they are peer reviewers. Each runs in parallel against the same draft + same snapshot, returning structured findings. The orchestrator then decides, per each critic's blocking threshold (critical vs major), whether the draft proceeds or goes back to the drafter with the findings injected as a critique_block.

Critic role catalogue

| Role | What it checks | Used by |
|---|---|---|
| compliance | ACAS Code paragraphs cited correctly, statutory cap, time limits, format requirements. | et1, demand, settlement |
| opposition | Plays the employer's solicitor: where would they attack? Hopeless heads, time-limit gaps, missing particulars. | et1, demand, lbc, settlement |
| tribunal | Plays the EJ: clarity of pleading, prospect of strike-out applications surviving. | et1 |
| negotiation | Demand anchored well? Concessions visible? BATNA implied? | demand, settlement |
| costs-trail | Does the document set up a costs argument under r.76 / Calderbank? Records breaches? | lbc |
| ACAS-compliance | Specifically: ACAS Code paragraph mapping for each procedural breach. | grievance |
| tone | Calm, factual, non-catastrophising. No invective. | grievance |
| Calderbank | Without-prejudice flagging, costs-shifting language correct. | settlement |
| strategic | Reads the offered amount; recommends accept / counter / reject. | settlement_response |

Scorer

Each scorer prompt has a paired _user template that bundles the draft + a per-artefact rubric (loaded from JSON). The rubric defines dimensions (e.g. for et1: pleading_clarity, statutory_citation, factual_completeness, prospect_of_strike_out, prospects_of_success) with weights. Output is JSON: {overall_score: int, dimensions: {name: {score, evidence_quote, fix}}, overall_summary}.

8. Verification gate

Three Haiku prompts run after the critics pass but before the red-team gate. All read the draft + the case_snapshot + corpus blocks. They do not generate new prose; they cross-reference.

| Prompt | Checks | Pass criteria |
|---|---|---|
| fact_checker | Every factual statement in the draft (date, name, pay figure, event) must trace to case_snapshot OR a referenced document in evidence_items. | Zero unverified factual claims. |
| consistency_checker | Internal contradictions: dates that disagree, parties named differently, money figures that don't reconcile across sections. | Zero internal contradictions. |
| legal_authority_checker | Every case citation must appear in comparable_cases_block; every statute citation must appear in relevant_statutes_block; no fabricated paragraph numbers in ACAS quotations. | Zero fabricated cites. |

Verifier failures block the artefact and trigger one re-draft with the verifier's findings injected as a critique. Two consecutive verifier failures → operator alert.

All verifier runs persist to the verification table for audit.

9. Red-team gate

Only ET1 currently runs the full red-team. Four prompts:

  1. strike_out_simulator sonnet-4-5 — Plays a respondent's solicitor. Reads the draft + comparable_cases_block + foundational_law_block. Output: {vectors: [{ground, severity, predicted_outcome, supporting_authority}]}. Severities: high (likely to succeed), medium, low.
  2. worst_case_award_calculator sonnet-4-5 — Calculates the defensible floor award given the pleaded facts. Considers Polkey reductions, contributory fault, mitigation. Output: {floor_gbp, ceiling_gbp, rationale, deductions: [...]}.
  3. opposing_counsel_responder sonnet-4-5 — Drafts the response we expect from a competent employer solicitor. Used to find weaknesses we haven't pleaded around.
  4. red_team_accuracy_judge haiku-4-5 — Evaluates whether the red-team predictions were credible (e.g. "you predicted vector X but cited authority Y which actually doesn't support it"). Filters noise.

If strike_out_simulator returns ≥1 high-severity vector, the artefact is blocked and the drafter re-runs with the vectors in the critique_block. Stored in red_team_predictions.
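The blocking rule, as a one-line sketch over the strike_out_simulator output shape shown in step 1:

def red_team_blocks(simulator_output: dict) -> bool:
    # ≥1 high-severity strike-out vector → block the artefact and re-draft
    return any(v["severity"] == "high" for v in simulator_output["vectors"])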

10. et-pipeline corpus / RAG

et-pipeline is a separate project at /srv/active/et-pipeline that owns the UK employment-law corpus. It is the only system allowed to write to the et_pipeline database.

Corpus baseline (2026-05-13)

| Source | Embedded rows | Notes |
|---|---|---|
| bailii | 9,144 | Higher courts; full LLM extraction |
| findcaselaw | 24,848 | SC/CA/EAT/UT; full LLM extraction |
| govuk (ET first instance) | 130,358 | 37,558 with LLM extraction, 92,800 raw-only |
| statute_sections | 1,273 | ERA 96, EqA 10, TULRCA 92, ER99, WPA23, WTR98 |
| citations | 333,000+ | Citation graph between judgments |

State machine (ingestion)

enumerated → downloaded → extracted (pdfplumber) → llm_extracted (gemini-2.0-flash) → embedded (voyage-4-large)

Govuk ET first-instance bypasses the LLM stage (raw_text → embed) — first-instance is persuasive only, semantic search on facts is sufficient. Higher courts get the full extraction because their reasoning + cited cases feed the citation graph.

Retrieval API (consumed by et-agent)

| Endpoint | Method | Input | Output |
|---|---|---|---|
| /search/cases | POST | {query, limit, courts?, binding_share?, jurisdiction_codes?} | {results: [{id, case_name, court, decision_date, neutral_citation, source_url, extracted: {facts_summary, outcome, parties, ...}, cosine_score}]} |
| /search/cases/reranked | POST | same (+ reranker run) | same + rerank_score, pre_rerank_position |
| /search/statutes | POST | {query, limit} | {results: [{id, act_id, act_title, section_number, section_title, body, score}]} |
| /judgments/{id} | GET | — | full row inc. raw_text + extracted JSONB |
| /judgments/{id}/cites | GET | — | citation graph outbound |
| /judgments/{id}/cited-by | GET | — | citation graph inbound |
| /statutes/by-act/{act}/section/{n} | GET | — | exact statute section (used by foundational_law) |
| /health | GET | — | row counts |

Tiered search (binding precedent ↑)

/search/cases splits its top-k candidates 60/40 between binding precedent (SC/HL/CA/EAT/UT/HC) and first-instance (ET). The binding pool feeds from a partial HNSW index idx_judgments_embedding_facts_binding over WHERE state='embedded' AND court IN ('SC','HL','CA','EAT','UT','HC'). Without this index, the 130k ET rows dominate post-filter and only ~9 binding cases survive per query.

The API's IN clause MUST literally match the partial index predicate (no parameterised arrays — planner can't prove subset). Constants live in retrieval/api.py::_BINDING_IN_CLAUSE.
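A sketch of why interpolation beats binding here: Postgres only uses a partial index when it can prove the query's WHERE clause implies the index predicate, and it can't prove that for a bound array like court = ANY($2). The constant below mirrors the one the doc names in retrieval/api.py::_BINDING_IN_CLAUSE; the surrounding query is illustrative:

_BINDING_IN_CLAUSE = "court IN ('SC','HL','CA','EAT','UT','HC')"

BINDING_SEARCH_SQL = f"""
    SELECT id, case_name, embedding_facts <=> $1 AS cosine_distance
    FROM judgments
    WHERE state = 'embedded' AND {_BINDING_IN_CLAUSE}  -- matches the index predicate verbatim
    ORDER BY embedding_facts <=> $1
    LIMIT 30
"""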

11. Voyage embeddings — chunking, shape, storage

Two embeddings per judgment

| Column | Source text | Dim | Purpose |
|---|---|---|---|
| embedding_full | First 14,000 chars of raw_text (≈3,500 tokens; voyage-4-large cap is 16k tokens/doc). | 1024 | Full-text semantic recall. |
| embedding_facts | Compact text: case_name + extracted.facts_summary + outcome + legal_issues. For govuk raw-only rows, falls back to case_name + first 1,000 chars of raw_text. | 1024 | Fact-pattern matching; used by the rerank pipeline. This is the index queried by /search/cases. |
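A sketch of how the embedding_facts source text could be assembled per the table above; the function name and JSONB field access are assumptions:

def facts_text(row: dict) -> str:
    ex = row.get("extracted") or {}
    if ex.get("facts_summary"):
        # higher-court rows: compact text from the LLM-extracted fields
        parts = [row["case_name"], ex["facts_summary"],
                 str(ex.get("outcome") or ""), str(ex.get("legal_issues") or "")]
        return "\n".join(p for p in parts if p)
    # govuk raw-only rows: case name + head of the raw text
    return f"{row['case_name']}\n{row['raw_text'][:1000]}"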

Chunking

No chunking. One row → two embeddings → one document in the result set. Voyage-4-large's 16k token window accommodates ~30-page judgments; longer ones are truncated to the first 14k chars (which captures the head facts, issues, and most of the reasoning).

For statute sections: each section_number is its own row in statute_sections. No further chunking; sections are already the natural retrieval unit.

Indexing

CREATE INDEX idx_judgments_embedding_facts ON judgments
  USING hnsw (embedding_facts vector_cosine_ops)
  WHERE state = 'embedded';

-- Partial: binding-precedent only (130x faster recall on binding subset)
CREATE INDEX idx_judgments_embedding_facts_binding ON judgments
  USING hnsw (embedding_facts vector_cosine_ops)
  WHERE state = 'embedded'
    AND court IN ('SC','HL','CA','EAT','UT','HC');

CREATE INDEX idx_statute_sections_embedding ON statute_sections
  USING hnsw (embedding vector_cosine_ops);

Retrieval flow

query string → voyage-4-large embed → 1024-vec → pgvector HNSW top-30 → voyage-rerank-2.5 → top-6 reranked

_HNSW_EF_SEARCH = 1000 is applied with SET LOCAL hnsw.ef_search per query for cases where the partial index can't be used (e.g. a jurisdiction_codes filter).
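An asyncpg sketch of that pattern — SET LOCAL only lasts for the enclosing transaction, so each search carries its own ef_search without leaking to pooled connections:

async def fetch_with_ef(pool, sql: str, query_vec, ef: int = 1000):
    async with pool.acquire() as conn:
        async with conn.transaction():
            # GUCs can't be bind parameters, hence the f-string over a trusted int
            await conn.execute(f"SET LOCAL hnsw.ef_search = {ef}")
            return await conn.fetch(sql, query_vec)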

12. Testing scaffold (what we just built)

The system above had no objective accuracy measurement. This new scaffold gives us one.

┌────────────────────────────┐
│ udcalc.leads (69 rows)     │  postgres
│ ────────────────────────   │
│ PII (name/email/phone)     │
│ + structured facts         │
│ + claim_types (JSONB)      │
│ + claim_details (JSONB)    │
└─────────────┬──────────────┘
              │
              │ scripts/export_udcalc_leads.py
              │  • PII scrub → synthetic identities (deterministic from UUID)
              │  • keep structured facts
              ▼
┌─────────────────────────────────────┐
│ tests/fixtures/cases/_seed/         │
│   leads.jsonl (69 lines)            │
└─────────────┬───────────────────────┘
              │
              │ scripts/generate_test_cases.py
              │  • 3 mid-tier models × 2 variants per seed
              │  • Variant 1: clean professional
              │  • Variant 2: distressed-human (typos, conflicts,
              │    emotional outbursts, cross-doc contradictions)
              │  • Single concentrate.ai call per case
              ▼
┌─────────────────────────────────────┐
│ tests/fixtures/cases/*.json         │
│ 342 / 414 cases (83% yield)         │
│ Each: meta + case_snapshot          │
│   + narrative                       │
│   + documents{dismissal,            │
│     grievance, appeals,             │
│     inbound_messages}               │
│   + ground_truth labels             │
└─────────────┬───────────────────────┘
              │
              │ scripts/test_prompt.py <prompt_name>
              │  • Load the prompt + its _user template
              │  • Resolve expected_variables per case:
              │      case_file_json          ← case_snapshot
              │      comparable_cases_block  ← et-pipeline /search/cases/reranked
              │      relevant_statutes_block ← et-pipeline /search/statutes
              │      current_rates_block     ← static UK rates block
              │      draft_json              ← prior drafter run dir
              │      rubric_json             ← rubrics/.json
              │  • Call concentrate.ai with prompt's model
              │  • Save output per case
              ▼
┌─────────────────────────────────────┐
│ tests/runs/<prompt>/*.json          │
│ One file per (prompt, case)         │
│ {output_text, cost_usd, usage,      │
│  rendered_user, ran_at}             │
└─────────────────────────────────────┘

Generation cost ledger

| Model | Successful cases | Failures |
|---|---|---|
| claude-haiku-4-5 | 137 | 1 |
| gemini-2.5-flash | 88 | 50 (JSON escaping in noisy variant) |
| gpt-5-mini | 117 | 21 |
| Total | 342 | 72 (17%) |

Total cost: $7.47 across 414 attempted, ~$0.018/case.

Next step (not built yet)

Eval / grader: read tests/runs/<prompt>/*.json + the corresponding tests/fixtures/cases/*.json::ground_truth → score each output against the labels → produce a per-prompt accuracy report (precision/recall on claim_types extracted, exact-match on dates, conflict-detection rate for fact_checker, citation grounding rate for legal_authority_checker, etc.).
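One way the grader's precision/recall metric could look — a sketch, not the built system; it assumes run files share their fixture's filename and that the prompt's output_text parses as JSON with a claim_types list:

import json
import pathlib

def grade_claim_types(prompt: str) -> dict:
    tp = fp = fn = 0
    for run_path in pathlib.Path(f"tests/runs/{prompt}").glob("*.json"):
        run = json.loads(run_path.read_text())
        fixture = json.loads((pathlib.Path("tests/fixtures/cases") / run_path.name).read_text())
        predicted = set(json.loads(run["output_text"]).get("claim_types", []))
        expected = set(fixture["ground_truth"]["expected_claim_types"])
        tp += len(predicted & expected)
        fp += len(predicted - expected)
        fn += len(expected - predicted)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }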

13. Database tables

et_agent (25 tables)

| Table | Purpose |
|---|---|
| cases | One row per claimant case. udcalc_snapshot JSONB carries intake state. |
| claims | One row per pleaded claim_type, FK to case. is_lead_claim flag. |
| clients | Claimant contact + KYC metadata. |
| engagement_letters | Signed engagement PDFs + version. |
| udcalc_handoffs | Raw payload from udcalc form on case creation. |
| messages | Chat messages claimant ↔ agent. |
| communications | Outbound + inbound emails / posts. Links to documents. |
| documents | Uploaded files. kind populated by document_classifier; raw_text by pdfplumber. |
| evidence_items | Logical evidence (an event, a fact). Linked to documents + claims. |
| drafts | Every artefact draft + version. JSONB body, score, model used, cost. |
| ensemble_runs | One row per produce_artefact() invocation. Status, total cost, duration. |
| ensemble_run_steps | Per-prompt step inside a run: drafter, each critic, scorer, verifier, red-team. Input/output JSON, model, tokens, latency. |
| verification | Verifier prompt outputs (fact-check, consistency, legal-authority). |
| red_team_predictions | Strike-out vectors + worst-case awards + opposing counsel drafts. |
| correspondence_chain | Threaded inbound/outbound correspondence per case. |
| case_events | State transitions + operator actions audit log. |
| case_state_log | Append-only state machine ledger. |
| case_llm_spend | Per-case cost ledger (rolled up from ensemble_run_steps). |
| corpus_queries | Every retrieval call to et-pipeline + how the results were used downstream. |
| deadlines | Time limits + key dates (ACAS Early Conciliation, ET1 deadline, hearing). |
| operator_actions | Manual operator overrides (forced state transitions, soft-pass approvals). |
| admin_impersonations | Audit log when operator impersonates claimant in chat. |
| model_pricing | Per-model input/output $/M token table (synced from provider docs). |
| account / user / session | BetterAuth standard tables. |

et_pipeline (6 tables)

| Table | Purpose |
|---|---|
| judgments | 164k rows. raw_text, extracted JSONB, embedding_full + embedding_facts vectors, state machine column. |
| statute_sections | 1.3k rows. Each ERA 96 / EqA 10 / etc. section. embedding vector. |
| citations | Citation graph. (citing_id, cited_id, context). |
| employment_rates | Statutory caps + Vento bands by effective date. Hand-curated. |
| pipeline_runs | Ingestion runs telemetry. |

14. Data formats

CaseSnapshot (canonical case shape)

Defined in et_agent/domain/case_state.py. This is what every drafter / critic / scorer sees as case_file_json.

from datetime import date

from pydantic import BaseModel

# Defined first so CaseSnapshot's `claims` annotation resolves at class-creation time.
class ClaimSummary(BaseModel):
    claim_type: str
    is_lead_claim: bool
    estimated_value_low: float | None
    estimated_value_high: float | None

class CaseSnapshot(BaseModel):
    case_id: str
    case_reference: str
    state: str

    claimant_name: str | None
    employer_name: str | None
    employment_status: str | None       # dismissed | constructively_dismissed | ...
    employment_start_date: date | None
    employment_end_date: date | None
    weekly_pay_gross: float | None
    annual_pay_gross: float | None
    age_at_termination: int | None
    country: str | None                 # england-wales | scotland | northern-ireland

    udcalc_narrative: str | None        # claimant's own story

    formula_compensation_low: float | None
    formula_compensation_high: float | None

    claims: list[ClaimSummary]          # claim_type, is_lead, value range

Prompt frontmatter

---
name: et1_drafter_user
description: User message template for the ET1 drafter
role: user
expected_variables:
  - case_file_json
  - comparable_cases_block
  - relevant_statutes_block
  - current_rates_block
  - critique_block
tags:
  - et1
  - drafter
version_notes: |
  Pure data envelope — tune wording sparingly; substantive guidance
  lives in et1_drafter.md.
---

CASE FILE:
$case_file_json

COMPARABLE CASES (cite only from this list):
$comparable_cases_block

RELEVANT STATUTE SECTIONS:
$relevant_statutes_block

$current_rates_block

$critique_block

Comparable cases block (rendered)

<comparable_cases>
  <case id="findcaselaw-eat-1234" court="EAT" date="2023-04-12">
    <name>Smith v Acme Ltd UKEAT/0123/23</name>
    <facts>Claimant dismissed after raising health-and-safety
      concerns. EAT held that the protected disclosure
      principle applied even though the disclosure was made
      informally...</facts>
  </case>
  ...
</comparable_cases>

Test case fixture (what we just generated)

{
  "meta": {
    "id": "claudehaiku45_lead_0543_v1",
    "source_lead_id": "<udcalc.leads.id uuid>",
    "generator_model": "claude-haiku-4-5",
    "generator_variant": 1,
    "cost_usd": 0.0298
  },
  "case_snapshot": { ...full CaseSnapshot dict... },
  "narrative": "I worked for Roslyn Care Foundation for nearly 28 years...",
  "documents": {
    "dismissal_letter": "Roslyn Care Foundation\n14 Ashford Lane...",
    "grievance_letter": "...",
    "grievance_outcome_letter": "...",
    "appeal_letter": "...",
    "appeal_outcome_letter": "...",
    "inbound_messages": [
      {"kind": "lbc_response", "from": "employer_solicitor",
       "received_at": "2026-03-14",
       "body": "...",
       "amount_gbp": 4500}
    ]
  },
  "ground_truth": {
    "expected_claim_types": ["unfair_dismissal"],
    "expected_lead_claim": "unfair_dismissal",
    "expected_acas_compliant": false,
    "acas_failures": ["no_appeal_offered", "predetermined_outcome"],
    "expected_outcome_category": "upheld",
    "expected_value_range_gbp": {"low": 14000, "high": 32000},
    "automatic_unfair_grounds": [],
    "time_limit_status": "in_time",
    "red_team_flags": ["procedural_unfairness"],
    "conflicts": [
      {"description": "narrative says March 2024, dismissal_letter dated April",
       "where": ["narrative", "documents.dismissal_letter"],
       "truth": "dismissal_letter date is correct (April)",
       "trips_prompts": ["fact_checker", "consistency_checker"]}
    ]
  }
}

Voyage embedding wire shape

POST https://api.voyageai.com/v1/embeddings
Authorization: Bearer pa-...
Content-Type: application/json

{
  "input": ["text 1", "text 2", ...],   # batch up to 8 docs
  "model": "voyage-4-large",
  "input_type": "document"               # or "query" for retrieval
}

→ {
  "data": [
    {"embedding": [0.012, -0.034, ..., 0.022], "index": 0},   # 1024 floats
    ...
  ],
  "usage": {"total_tokens": 12384}
}

Pgvector storage

judgments.embedding_full   vector(1024)   # nullable until embed worker fills
judgments.embedding_facts  vector(1024)
statute_sections.embedding vector(1024)

Cosine similarity query (pgvector idiom used by et-pipeline)

SELECT id, case_name, court, decision_date,
       embedding_facts <=> :query_vec AS cosine_distance
FROM   judgments
WHERE  state = 'embedded'
  AND  court IN ('SC','HL','CA','EAT','UT','HC')
ORDER BY embedding_facts <=> :query_vec
LIMIT  30;

Appendix — the 65 prompts

| Family | Count | Names |
|---|---|---|
| Intake | 4 | intake_gap_analyzer, intake_question_planner, intake_answer_integrator, intake_readiness_gate |
| Document handling | 9 | document_classifier, document_extractor, claim_extractor, inbound_classifier, inbound_acknowledgement_extractor, inbound_counter_offer_extractor, inbound_info_request_extractor, grievance_outcome_extractor, lbc_response_extractor, sar_response_extractor |
| Drafters (system + user) | 14 | et1_drafter(+_user), demand_letter_drafter(+_user), lbc_drafter(+_user), grievance_drafter, sar_drafter, settlement_drafter, settlement_response_drafter(+_user), case_strength_drafter(+_user) |
| Critics | 13 | et1_critic_compliance, et1_critic_opposition, et1_critic_tribunal, demand_letter_critic_compliance, demand_letter_critic_opposition, demand_letter_critic_negotiation, lbc_costs_trail_critic, lbc_opposition_critic, grievance_acas_compliance_critic, grievance_tone_critic, settlement_calderbank_critic, settlement_negotiation_critic, settlement_response_critic_compliance/opposition/strategic |
| Scorers | 10 | et1_scorer(+_user), demand_letter_scorer(+_user), grievance_scorer, lbc_scorer, case_strength_scorer(+_user), settlement_letter_scorer, settlement_response_scorer(+_user) |
| Verifier | 3 | fact_checker, consistency_checker, legal_authority_checker |
| Red-team | 4 | strike_out_simulator, worst_case_award_calculator, opposing_counsel_responder, red_team_accuracy_judge |
| Specialist | 8 | case_agent (chat router), costs_warning_simulator, per_claim_particulars, strategic_seam_finder, sar_compliance_checker, intro_message, voice, foundational_law (resolver, not a prompt) |