Leaders Standard Work — The Operating System

For Agents

Agent Context

LEADERS STANDARD WORK — COMPRESSED CONTEXT

PURPOSE: Daily control rhythm for human+agent production systems.
         Ensures quality, surfaces problems, enables improvement.

FRAMEWORK: Signal / Learn / Operate
- SIGNAL: Review metrics and alerts. What does the data say?
- LEARN: Analyze root causes. What does it mean?
- OPERATE: Take action. What do we do about it?

DAILY RHYTHM (3 control windows):

1. AM SHIFT (~8:00 AM)
   - Signal: Review overnight runs, check quality scores
   - Learn: Identify any failures, note patterns
   - Operate: Set today's run plan, assign priorities

2. MIDDAY SHIFT (~12:00 PM)
   - Signal: Check morning run progress
   - Learn: Are we on track? Any blockers?
   - Operate: Adjust plan, escalate if needed

3. PM SHIFT (~5:00 PM)
   - Signal: Review day's results, finalize metrics
   - Learn: What worked? What didn't?
   - Operate: Close shift log, queue improvements

REQUIRED ARTIFACTS (every shift):
- Shift Log: timestamped record of observations and actions
- Run Plan: what will be produced, by whom, by when
- Improvement Ticket(s): captured learnings for system updates

SCOREBOARD METRICS:
- Speed: cycle time vs target
- Quality: QC pass rate (first-pass yield)
- Cost: resources consumed vs budget

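ANDON TRIGGERS (stop and escalate):
- QC score <75% after 2 regeneration attempts on the same deliverable
- Agent produces unexpected/off-topic output
- Context file missing, corrupted, or containing invalid data
- Run exceeds 2x expected duration without clear reason
- Token/compute usage exceeds 3x expected for a single run
- Agent crashes, hangs, or returns errors repeatedly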

HARD RULE: Leaders review every shift. No exceptions.
           The daily rhythm IS the quality system.

Framework

What Is Leaders Standard Work?

Leaders Standard Work (LSW) is a structured daily routine that production leaders follow to maintain control, ensure quality, and drive improvement. It is borrowed from Lean manufacturing, where supervisors work through defined checkpoints during the day instead of firefighting reactively.

In human+agent systems, LSW serves the same purpose: it creates predictable control points where humans review agent output, make decisions, and course-correct before problems compound.

Why This Matters

Without LSW, production becomes reactive: problems are discovered too late, quality degrades gradually, and improvement opportunities are missed. LSW supplies the fixed review points at which these problems are caught and corrected before they compound.

"Standard work is the baseline for continuous improvement. You cannot improve a process that is not standardized."

— Taiichi Ohno, Toyota Production System

Signal / Learn / Operate

Every shift review follows the same three-phase structure. This creates consistency and ensures nothing is missed.

Phase	Question	Activities	Output
Signal	What does the data say?	Review metrics, check alerts, scan logs	Observations noted in Shift Log
Learn	What does it mean?	Analyze patterns, identify root causes, assess risks	Insights captured, improvement ideas logged
Operate	What do we do about it?	Make decisions, assign actions, adjust plans	Run Plan updated, actions assigned

Order Matters

Always Signal before Learn, always Learn before Operate. Jumping straight to action without understanding the data leads to solving the wrong problems.
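
One way to make the ordering rule hard to skip is to encode it in whatever tool records the review. The sketch below is illustrative only: `Phase` and `ShiftReview` are hypothetical names, not part of any existing LSW tooling. The check refuses to record a phase until the one before it is done.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The three review phases, numbered in their required order."""
    SIGNAL = 1
    LEARN = 2
    OPERATE = 3


class ShiftReview:
    """Hypothetical tracker that records completed phases for one shift review."""

    def __init__(self) -> None:
        self.completed = []  # phases finished so far, in order

    def complete(self, phase: Phase) -> None:
        # The only phase allowed next is the one right after the last completed phase.
        next_value = len(self.completed) + 1
        if next_value > len(Phase):
            raise ValueError("Review already finished; all three phases are complete")
        expected = Phase(next_value)
        if phase != expected:
            raise ValueError(f"Cannot record {phase.name} yet; {expected.name} comes first")
        self.completed.append(phase)
```

Calling `complete(Phase.OPERATE)` straight after `complete(Phase.SIGNAL)` raises a `ValueError`, which is the behaviour the rule asks for.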

Daily Rhythm

AM Shift Review

Morning Control Window

~8:00 AM · 15-30 min

Signal

Review any overnight automated runs
Check quality scores from previous day's late runs
Scan for system alerts or errors
Review improvement tickets from previous shift

Learn

Identify any failures or quality issues
Note patterns across multiple runs
Assess capacity and resource availability
Flag any blockers or dependencies

Operate

Set today's Run Plan with priorities
Assign work to agents/team members
Escalate any issues from overnight
Open today's Shift Log entry

Midday Shift Review

Midday Control Window

~12:00 PM · 10-20 min

Signal

Check progress against Run Plan
Review QC results from morning runs
Note any new alerts or anomalies
Check resource utilization

Learn

Are we on track to meet daily targets?
Any emerging blockers or risks?
What's causing variance from plan?
Are agents performing as expected?

Operate

Adjust afternoon priorities if needed
Escalate blockers that can't be self-resolved
Re-allocate resources if behind
Update Shift Log with midday status

PM Shift Review

Evening Control Window

~5:00 PM · 15-30 min

Signal

Review all completed runs for the day
Finalize quality metrics
Collect all improvement tickets generated
Note any outstanding work in progress

Learn

What worked well today?
What didn't go as planned?
Are there systemic issues to address?
What patterns are emerging over time?

Operate

Close out today's Shift Log
Queue improvement tickets for batch review
Set up any overnight automated runs
Brief incoming shift (if applicable)
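
The three control windows above can be held as data so that a scheduler or agent can ask which review comes next. This is a minimal sketch under stated assumptions: the `ControlWindow` type and `next_window` function are hypothetical, and the approximate times (~8:00 AM, ~12:00 PM, ~5:00 PM) are treated as exact nominal start times.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass(frozen=True)
class ControlWindow:
    name: str
    start: time        # nominal start; the source gives these as approximate
    min_minutes: int   # lower end of the time-box
    max_minutes: int   # upper end of the time-box


WINDOWS = (
    ControlWindow("AM", time(8, 0), 15, 30),
    ControlWindow("Midday", time(12, 0), 10, 20),
    ControlWindow("PM", time(17, 0), 15, 30),
)


def next_window(now: time) -> Optional[ControlWindow]:
    """Return the first window starting at or after `now`, or None if all have passed."""
    for window in WINDOWS:
        if window.start >= now:
            return window
    return None
```

For example, `next_window(time(9, 30))` returns the Midday window, and any time after 5:00 PM returns `None` because the day's reviews are done.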

Controls

Daily Scoreboard

Three metrics tell you if the production system is healthy. Track them every day.

Speed: Cycle Time
Quality: First-Pass Yield
Cost: Resources Used

Metric	What It Measures	Target	Red Flag
Cycle Time	Time from request to delivery	≤ target per deliverable type	> 2x expected duration
First-Pass Yield	% of runs that pass QC without rework	≥ 85%	< 70%
Resources Used	Tokens, compute, human hours	Within budget	> 20% over budget
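
The thresholds in the table translate directly into checks. The sketch below is one possible reading, not a prescribed implementation: the function names are hypothetical, the "expected duration" is assumed to equal the cycle-time target, and the "watch" status for values between target and red flag is an added assumption, since the table does not name that band.

```python
def first_pass_yield(passed_first_try: int, total_runs: int) -> float:
    """Percent of runs that passed QC without rework."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    return 100.0 * passed_first_try / total_runs


def rate_yield(fpy_percent: float) -> str:
    if fpy_percent >= 85:
        return "on_target"
    if fpy_percent < 70:
        return "red_flag"
    return "watch"  # assumed label for the band between 70% and 85%


def rate_cycle_time(actual: float, target: float) -> str:
    if actual <= target:
        return "on_target"
    if actual > 2 * target:  # assumes expected duration == target
        return "red_flag"
    return "watch"


def rate_cost(spent: float, budget: float) -> str:
    if spent <= budget:
        return "on_target"
    if spent > 1.2 * budget:  # more than 20% over budget
        return "red_flag"
    return "watch"
```

With 16 of 20 runs passing on the first try, `first_pass_yield` gives 80.0 and `rate_yield` reports "watch": below target, but not yet a red flag.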

Leading vs Lagging

First-pass yield is a leading indicator—declining yield predicts future problems. Cycle time and cost are lagging indicators—they measure what already happened. Watch yield most closely.

Andon Triggers

These conditions require immediate escalation. Stop work and involve a human decision-maker.

QC Failure Loop: QC score <75% after 2 regeneration attempts on the same deliverable
Off-Topic Output: Agent produces content unrelated to the assigned task
Missing Context: Context file is missing, corrupted, or contains invalid data
Duration Exceeded: Run takes >2x expected time without clear reason
Resource Overrun: Token/compute usage >3x expected for a single run
System Error: Agent crashes, hangs, or returns error repeatedly

Non-Negotiable

Andon triggers are not suggestions. When triggered, the leader must stop and investigate before resuming production. Do not "push through" hoping it will resolve itself.
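
The six triggers are concrete enough to check mechanically after every run. The following is a sketch under stated assumptions: `RunStatus` and its field names are hypothetical, and "repeatedly" is interpreted as two or more consecutive errors (the `error_limit` parameter), which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class RunStatus:
    """Hypothetical snapshot of one run; field names are illustrative."""
    qc_score: float               # latest QC score, in percent
    regen_attempts: int           # regeneration attempts on this deliverable
    off_topic: bool               # output unrelated to the assigned task
    context_ok: bool              # context file present, readable, and valid
    duration_min: float           # elapsed run time in minutes
    expected_duration_min: float
    tokens_used: int
    expected_tokens: int
    consecutive_errors: int       # crashes, hangs, or errors in a row


def andon_triggers(run: RunStatus, error_limit: int = 2) -> list:
    """Return the names of all triggers that fire for this run."""
    fired = []
    if run.qc_score < 75 and run.regen_attempts >= 2:
        fired.append("QC Failure Loop")
    if run.off_topic:
        fired.append("Off-Topic Output")
    if not run.context_ok:
        fired.append("Missing Context")
    # "Without clear reason" needs human judgment, so this check only flags.
    if run.duration_min > 2 * run.expected_duration_min:
        fired.append("Duration Exceeded")
    if run.tokens_used > 3 * run.expected_tokens:
        fired.append("Resource Overrun")
    # Assumption: "repeatedly" means error_limit or more failures in a row.
    if run.consecutive_errors >= error_limit:
        fired.append("System Error")
    return fired
```

Any non-empty result means stop and escalate. The function only detects the condition; the decision to resume stays with the human decision-maker.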

Required Artifacts

Three artifacts must be maintained to ensure traceability and enable improvement.

Shift Log

A timestamped record of observations, decisions, and actions taken during each shift. The source of truth for what happened.

DATE: [YYYY-MM-DD]
SHIFT: [AM / Midday / PM]
LEADER: [Name]

SIGNAL:
- [Observation 1]
- [Observation 2]

LEARN:
- [Analysis / Root cause]
- [Pattern identified]

OPERATE:
- [Action taken]
- [Escalation made]
- [Plan adjustment]

NOTES: [Any additional context]
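
If the Shift Log is written by a tool rather than by hand, the template maps naturally onto a small record type. This is a minimal sketch, assuming a hypothetical `ShiftLog` class; the field names mirror the template headings, and the rendered layout is one reasonable choice rather than a required format.

```python
from dataclasses import dataclass, field
from datetime import date

VALID_SHIFTS = ("AM", "Midday", "PM")


@dataclass
class ShiftLog:
    day: date
    shift: str
    leader: str
    signal: list = field(default_factory=list)   # observations
    learn: list = field(default_factory=list)    # analysis and patterns
    operate: list = field(default_factory=list)  # actions and escalations
    notes: str = ""

    def __post_init__(self) -> None:
        if self.shift not in VALID_SHIFTS:
            raise ValueError(f"shift must be one of {VALID_SHIFTS}")

    def render(self) -> str:
        """Return the log as plain text in the template's field order."""
        lines = [
            f"DATE: {self.day.isoformat()}",
            f"SHIFT: {self.shift}",
            f"LEADER: {self.leader}",
        ]
        sections = (("SIGNAL", self.signal), ("LEARN", self.learn), ("OPERATE", self.operate))
        for heading, items in sections:
            lines.append(f"{heading}:")
            lines.extend(f"- {item}" for item in items)
        lines.append(f"NOTES: {self.notes}")
        return "\n".join(lines)
```

Rejecting an unknown shift name at construction time keeps every log entry tied to one of the three control windows.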

Run Plan

The queue of work to be completed, with assignments, priorities, and target completion times. Updated at each shift.

DATE: [YYYY-MM-DD]
UPDATED: [AM / Midday / PM]

| # | Deliverable    | Assignee | Priority | Target   | Status   |
|---|----------------|----------|----------|----------|----------|
| 1 | [Package name] | Agent-1  | High     | 10:00 AM | Complete |
| 2 | [Package name] | Agent-2  | Medium   | 2:00 PM  | In Prog  |
| 3 | [Package name] | Agent-1  | Low      | EOD      | Queued   |

BLOCKERS:
- [Any dependencies or issues]

NOTES: [Adjustments from original plan]
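
At each shift update the leader needs to see what is still open and in what order to tackle it. The sketch below shows one way to derive that from the Run Plan rows. `RunItem` and `remaining_work` are hypothetical names, and the priority and status values are taken from the template's example rows.

```python
from dataclasses import dataclass

PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class RunItem:
    number: int
    deliverable: str
    assignee: str
    priority: str   # "High", "Medium", or "Low"
    target: str     # free text such as "10:00 AM" or "EOD", as in the template
    status: str     # "Complete", "In Prog", or "Queued"


def remaining_work(plan):
    """Return items not yet complete, highest priority first."""
    open_items = [item for item in plan if item.status != "Complete"]
    # sorted() is stable, so items of equal priority keep their plan order.
    return sorted(open_items, key=lambda item: PRIORITY_ORDER[item.priority])
```

Targets stay as free text here because the template allows values like "EOD". A stricter version could parse them into times to detect overdue items.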

Improvement Ticket

A captured learning or suggestion for system improvement. Generated during runs and batched for periodic review.

TICKET ID: [IMP-YYYYMMDD-###]
DATE: [YYYY-MM-DD]
SOURCE: [Which run / deliverable]
SUBMITTED BY: [Name or Agent ID]

OBSERVATION: [What happened or was noticed]
SUGGESTION: [Proposed improvement]

AFFECTED FILES:
- [process.html / example.html / context.html / quality.html]

PRIORITY: [High / Medium / Low]
STATUS: [Open / In Review / Implemented / Rejected]
RESOLUTION: [What was done, if any]
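
The `IMP-YYYYMMDD-###` ticket ID format can be generated and validated with a few lines of code. This is a sketch under assumptions: the helper names are hypothetical, and the three-digit suffix is read as a per-day sequence number starting at 1, which the template implies but does not state.

```python
import re
from datetime import date

# IMP-, eight digits for the date, a hyphen, then a three-digit sequence.
TICKET_ID_PATTERN = re.compile(r"IMP-\d{8}-\d{3}")


def make_ticket_id(day: date, sequence: int) -> str:
    """Build a ticket ID; sequence is assumed to be a per-day counter (1-999)."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must be between 1 and 999")
    return f"IMP-{day:%Y%m%d}-{sequence:03d}"


def is_ticket_id(text: str) -> bool:
    """Check the shape of an ID only; it does not verify the date is real."""
    return TICKET_ID_PATTERN.fullmatch(text) is not None
```

For instance, the fourth ticket raised on 7 March 2025 would be `IMP-20250307-004`.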

Roles

Roles & Accountability

Clear ownership prevents gaps and confusion. Every role has defined responsibilities.

Role	Responsibilities	Accountable For
Production Leader	Conducts shift reviews, maintains artifacts, escalates issues	Daily metrics, shift log completeness, escalation timeliness
Agent Operator	Configures and runs agents, performs QC, logs observations	Run execution, first-pass yield, improvement ticket submission
Process Owner	Maintains Process/Example/Context/Quality files, reviews improvements	Process quality, file currency, improvement implementation
Escalation Point	Receives andon triggers, makes judgment calls, resolves blockers	Response time, resolution quality, systemic issue identification

One Person, Multiple Roles

In small teams, one person may hold multiple roles. That's fine—what matters is that every responsibility has a clear owner. Ambiguity is the enemy.
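
A quick coverage check makes ownership gaps visible before they cause trouble. The sketch below is illustrative: `unowned_roles` is a hypothetical helper, the role names come from the table above, and a role counts as owned when it maps to any non-blank name.

```python
ROLES = ("Production Leader", "Agent Operator", "Process Owner", "Escalation Point")


def unowned_roles(assignments: dict) -> list:
    """Return roles with no named owner; an empty list means full coverage."""
    return [role for role in ROLES if not assignments.get(role, "").strip()]
```

One person may appear under several roles. The check only asks that no role is left blank.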

"Management is doing things right; leadership is doing the right things."

— Peter Drucker

Leaders Standard Work bridges both: the structured rhythm ensures things are done right, while the Signal/Learn/Operate framework ensures the right things are prioritized.