Architecture Framework Production Blueprint · EVO3 2026

Agentic System Blueprint

A layered architecture framework for designing production-grade multi-agent systems — from memory and state to orchestration, tooling, and HITL governance.

Sierra Napier-Leach, MPA · EVO3 6 Architectural Layers

How to use this blueprint: Work through the six layers sequentially when designing a new agentic system. Each layer must be spec'd before the next is built. The layers are not independent — decisions in earlier layers constrain options in later ones. Anti-patterns at the end of each section flag common mistakes.

1 Identity & Scope Layer

Define exactly what each agent is, what it owns, and what it cannot touch. A well-scoped agent is easier to test, easier to monitor, and fails more gracefully.

Each agent must define:

A single primary responsibility (one verb + one noun: "classifies leads")
The specific data sources it may read from
The specific data targets it may write to
What triggers it (event, cron, API call, or upstream agent output)
What it outputs (structured schema, not free-form text)

Standard components:

agent_id, responsibility, triggers[], read_sources[], write_targets[], output_schema
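As a minimal sketch, the standard components above could be captured in one declarative record per agent. The field names follow the component list; the `can_read` and `can_write` helpers and the example source and target names (`crm.leads`, `crm.lead_scores`) are illustrative assumptions, not part of the blueprint.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentSpec:
    """Declarative scope for one agent; fields mirror the standard components."""
    agent_id: str
    responsibility: str              # one verb + one noun, e.g. "classifies leads"
    triggers: tuple[str, ...]        # event, cron, API call, or upstream agent output
    read_sources: tuple[str, ...]    # data sources the agent may read from
    write_targets: tuple[str, ...]   # data targets the agent may write to
    output_schema: dict = field(default_factory=dict)  # structured schema, not free text

    def can_read(self, source: str) -> bool:
        # Hypothetical helper: scope check before any read.
        return source in self.read_sources

    def can_write(self, target: str) -> bool:
        # Hypothetical helper: scope check before any write.
        return target in self.write_targets


# Hypothetical example agent; all names below are made up for illustration.
lead_classifier = AgentSpec(
    agent_id="lead_classifier",
    responsibility="classifies leads",
    triggers=("event:lead_created",),
    read_sources=("crm.leads",),
    write_targets=("crm.lead_scores",),
    output_schema={"type": "object", "required": ["label", "confidence"]},
)
```

Keeping the spec as data lets scope be enforced by a check before every read or write, and not only by convention.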

Anti-pattern

An agent with more than one responsibility is a service, not an agent. Split it. A "do everything for a new lead" agent that classifies, researches, and sends emails is really three agents that should run sequentially.

2 Memory & State Layer

Agents need access to context that persists beyond a single invocation. Define explicitly what each agent needs to "remember" and where that memory lives.

Short-term memory

In-context: the current task input, the most recent N interactions for the current entity. Passed in the prompt. Lost after the call completes.

Working memory

Current pipeline state: lead status, pipeline_stage, pending actions, last_contact_at. Stored in your operational database. Queried at agent invocation.

Long-term memory

Historical interactions log, completed research summaries, preference signals. Stored in a searchable format (database or vector store). Selectively retrieved.
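The three memory tiers above might be combined at invocation time roughly as follows. This is a sketch under stated assumptions: `MemoryRecord`, `build_context`, and the tag-overlap ranking are hypothetical stand-ins, and a real system would likely replace the ranking with a database query or vector search.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    text: str
    tags: frozenset[str]


def build_context(task_input, recent, working_state, long_term, query_tags,
                  n_recent=3, k=2):
    """Assemble prompt context from short-term, working, and long-term memory."""
    # Short-term: only the most recent N interactions for the current entity.
    short_term = recent[-n_recent:]
    # Long-term: rank records by tag overlap with the query (assumed relevance
    # measure), drop non-matching records, and keep the top k.
    scored = [(len(r.tags & query_tags), r) for r in long_term]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    relevant = [r.text for score, r in ranked if score > 0][:k]
    return {
        "task_input": task_input,
        "short_term": short_term,
        "working": working_state,   # e.g. pipeline_stage, last_contact_at
        "long_term": relevant,
    }
```

Only the selected slice of each tier reaches the prompt; the full history stays in storage.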

Anti-pattern

Passing the full interaction history in every prompt is expensive, slow, and often counterproductive. Select the most relevant context — not all context. Build a retrieval strategy, not a dump strategy.

3 Tool & Integration Layer

Agents accomplish real-world actions through tools. Each tool is a discrete, testable function with a defined interface. List every tool before building any of them.

Tool interface contract

name: verb_noun format (send_email, create_calendar_event)
inputs: strictly typed JSON schema
outputs: success payload or structured error
idempotency: is calling this twice safe?
reversibility: can this action be undone?
timeout: maximum allowed execution time
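One way to make the contract above concrete is a small tool record plus a wrapper that always returns either a success payload or a structured error. The names `ToolSpec` and `call_tool` are assumptions for illustration. The wrapper checks only the schema's `required` keys rather than doing full JSON Schema validation, and it records the timeout without enforcing it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    name: str                        # verb_noun, e.g. "send_email"
    input_schema: dict               # JSON Schema describing the inputs
    idempotent: bool                 # is calling this twice safe?
    reversible: bool                 # can this action be undone?
    timeout_s: float                 # maximum allowed execution time (not enforced here)
    handler: Callable[[dict], Any]   # the function that performs the action


def call_tool(tool: ToolSpec, inputs: dict) -> dict:
    """Run a tool and return a success payload or a structured error."""
    missing = [k for k in tool.input_schema.get("required", []) if k not in inputs]
    if missing:
        return {"ok": False, "error": {"code": "invalid_input", "missing": missing}}
    try:
        return {"ok": True, "result": tool.handler(inputs)}
    except Exception as exc:  # surface failures as data, not as raised exceptions
        return {"ok": False, "error": {"code": "tool_failed", "message": str(exc)}}


# Hypothetical tool; the handler only pretends to queue an email.
send_email = ToolSpec(
    name="send_email",
    input_schema={"type": "object", "required": ["to", "body"]},
    idempotent=False,
    reversible=False,
    timeout_s=10.0,
    handler=lambda inputs: {"queued_for": inputs["to"]},
)
```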

Common tool categories

read_data, write_data, send_email, create_event, search_web, create_document, post_notification, escalate_human, log_interaction

Anti-pattern

Non-idempotent tools without deduplication logic will cause double-sends, double-writes, and duplicate calendar events when agents retry on transient errors. Every side-effect tool needs an idempotency key.
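A minimal sketch of the deduplication idea, assuming an in-memory dictionary in place of a durable store. `IdempotentExecutor` is a hypothetical name. Deriving the key from the tool name and inputs alone is a simplification: a production key would usually also include a task or run identifier, so that two intentional, identical sends are not collapsed into one.

```python
import hashlib
import json


class IdempotentExecutor:
    """Caches results by idempotency key so a retry never repeats the side effect."""

    def __init__(self):
        self._results = {}  # in-memory stand-in for a durable store

    @staticmethod
    def make_key(tool_name: str, inputs: dict) -> str:
        # Canonical JSON (sorted keys) so equal inputs always hash the same way.
        payload = json.dumps({"tool": tool_name, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, tool_name, inputs, action):
        key = self.make_key(tool_name, inputs)
        if key in self._results:
            return self._results[key]   # duplicate call: return the first result
        result = action(inputs)
        self._results[key] = result
        return result
```

If the agent retries after a transient error, the second call with the same key returns the stored result and the side effect does not run again.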

4 Reasoning & Model Layer

The AI model is just one component of an agent — and often not the most important one. Choose model, context structure, and output format deliberately.

System prompt design rules

State the agent's role in the first sentence. Be specific.
Specify output format explicitly — JSON schema beats prose instructions.
Include a confidence field in the output schema.
Define what "I don't know" looks like in your output format.
Add explicit constraints: what the agent must never do.
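The output-format rules above can be enforced on the receiving side as well as in the prompt. The sketch below assumes a hypothetical lead-classification schema with a `label` and a `confidence` field, where the label `"unknown"` is the agreed "I don't know" value. The label set and the 0.7 threshold are illustrative assumptions.

```python
import json

# "unknown" is the explicit "I don't know" value in this assumed schema.
ALLOWED_LABELS = {"hot", "warm", "cold", "unknown"}


def parse_agent_output(raw: str, min_confidence: float = 0.7) -> dict:
    """Validate model output; malformed output is treated as 'unknown'."""
    fallback = {"label": "unknown", "confidence": 0.0, "needs_review": True}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    label = data.get("label")
    confidence = data.get("confidence")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return fallback
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return fallback
    if not 0.0 <= confidence <= 1.0:
        return fallback
    # Valid but uncertain answers are kept and flagged for human review.
    needs_review = label == "unknown" or confidence < min_confidence
    return {"label": label, "confidence": float(confidence), "needs_review": needs_review}
```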

Model selection guidance

Complex reasoning: Claude Sonnet / Opus, GPT-4o
Deep research: Kimi K2 (long context, web-grounded)
High volume / fast: Claude Haiku, GPT-4o-mini
Match model capability to task complexity — overpowered models on simple tasks waste budget and latency.

Anti-pattern

Treating the model as a magic oracle and the system prompt as an afterthought. The model's output quality depends heavily on the specificity of its instructions. Vague prompts produce inconsistent, unpredictable agents.

5 Orchestration & Flow Layer

The orchestrator coordinates which agents run, in what order, and with what inputs. It is the spine of your multi-agent system.

Orchestrator responsibilities

Route events to the appropriate agent
Pass structured context between sequential agents
Handle errors without cascading failures
Enforce HITL gates between pipeline stages
Log every agent invocation and its outcome
Provide the system-wide kill-switch
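Several of the responsibilities above (routing, error containment, logging, and the kill-switch) fit in a small dispatcher. This is a sketch only: the `Orchestrator` class and its method names are assumptions, agents are modeled as plain functions from dict to dict, and the log is an in-memory list rather than durable storage. HITL gates are left out here.

```python
from typing import Callable


class Orchestrator:
    """Routes events to agents, logs every invocation, and contains failures."""

    def __init__(self):
        self.routes: dict[str, Callable[[dict], dict]] = {}
        self.log: list[dict] = []
        self.killed = False

    def register(self, event_type: str, agent: Callable[[dict], dict]) -> None:
        self.routes[event_type] = agent

    def kill(self) -> None:
        # System-wide kill-switch: no agent runs after this is set.
        self.killed = True

    def dispatch(self, event_type: str, payload: dict) -> dict:
        if self.killed:
            outcome = {"status": "blocked", "reason": "kill_switch"}
        elif event_type not in self.routes:
            outcome = {"status": "unrouted"}
        else:
            try:
                outcome = {"status": "ok", "output": self.routes[event_type](payload)}
            except Exception as exc:  # contain the failure instead of cascading
                outcome = {"status": "error", "message": str(exc)}
        # Every invocation is logged, including blocked and failed ones.
        self.log.append({"event": event_type, **outcome})
        return outcome
```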

Flow patterns

Sequential: Agent A → Agent B → Agent C (linear pipeline)
Parallel: Multiple agents running simultaneously on different tasks
Conditional: Agent A output determines which agent runs next
HITL-gated: Human approval required before the next agent runs
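The four flow patterns above can each be expressed in a few lines when agents are modeled as plain functions. The function names and the `route` key used for branching are hypothetical. The approval callback stands in for a real review queue, which would normally be asynchronous.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequential(agents, context):
    """Agent A -> Agent B -> Agent C: each output becomes the next input."""
    for agent in agents:
        context = agent(context)
    return context


def run_parallel(agents, context):
    """Run independent agents concurrently; results keep the order of `agents`."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(context), agents))


def run_conditional(first, branches, context):
    """The first agent's output decides which agent runs next."""
    result = first(context)
    return branches[result["route"]](result)


def run_gated(first, approve, second, context):
    """HITL gate: the second agent runs only if a human approves the draft."""
    draft = first(context)
    if not approve(draft):
        return {"status": "rejected", "draft": draft}
    return second(draft)
```

In each pattern the caller, not the agent, decides what runs next, which keeps agents from calling each other directly.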

Anti-pattern

Agents that call each other directly (tight coupling). Always route through the orchestrator. Direct agent-to-agent calls create circular dependency risks, make logging nearly impossible, and produce cascading failures that are difficult to debug.

6 HITL Governance Layer

Human oversight is an architectural layer, not a bolt-on. Design it at the beginning, at the orchestration level, before implementing any agent.

HITL gate types

Synchronous approve: Agent halts and waits for human decision
Async draft review: Agent drafts output, human reviews before sending
Notification only: Agent acts, human is informed immediately
Exception escalation: Agent acts autonomously, flags anomalies for review
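The gate types above could be assigned per action by a simple policy. The enum members mirror the four gate types. The `choose_gate` rules are an illustrative assumption, not a rule from this blueprint, and would need to be set by each team's own risk tolerance.

```python
from enum import Enum


class Gate(Enum):
    SYNC_APPROVE = "synchronous_approve"
    ASYNC_DRAFT_REVIEW = "async_draft_review"
    NOTIFY_ONLY = "notification_only"
    EXCEPTION_ESCALATION = "exception_escalation"


def choose_gate(reversible: bool, externally_visible: bool, routine: bool = False) -> Gate:
    """Hypothetical policy: the harder an action is to undo, the stricter the gate."""
    if not reversible:
        return Gate.SYNC_APPROVE          # human decides before anything irreversible
    if externally_visible:
        return Gate.ASYNC_DRAFT_REVIEW    # human reviews the draft before it is sent
    if routine:
        return Gate.EXCEPTION_ESCALATION  # act autonomously, flag anomalies only
    return Gate.NOTIFY_ONLY               # act, then inform a human immediately


def blocks_execution(gate: Gate) -> bool:
    """True when a human decision must arrive before the action takes effect."""
    return gate in (Gate.SYNC_APPROVE, Gate.ASYNC_DRAFT_REVIEW)
```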

Required governance infrastructure

Admin dashboard with agent activity visibility
HITL review queue with SLA enforcement
Immutable interaction log with human attribution
Override rate monitoring and alerting
Non-technical kill-switch with immediate effect
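Override rate monitoring, one of the items above, reduces to a ratio over the review log. In this sketch the record shape (`human_action` with the value `"approved"` for unchanged outputs), the 20% threshold, and the minimum sample size are all assumptions chosen for illustration.

```python
def override_rate(decisions: list[dict]) -> float:
    """Share of reviewed outputs that the human reviewer changed or rejected."""
    if not decisions:
        return 0.0
    overridden = sum(1 for d in decisions if d["human_action"] != "approved")
    return overridden / len(decisions)


def should_alert(decisions: list[dict], threshold: float = 0.2,
                 min_samples: int = 20) -> bool:
    """Alert only when there are enough reviews for the rate to be meaningful."""
    return len(decisions) >= min_samples and override_rate(decisions) > threshold
```

A rising override rate suggests the agent's outputs are drifting away from what reviewers accept, which is usually a cue to revisit the prompt or the agent's scope.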

Anti-pattern

Adding HITL as an email notification after the fact. "We'll send an alert if something goes wrong" is not oversight — it's incident response. True HITL means human judgment is embedded in the execution path before irreversible actions occur.

Want us to design your architecture?

EVO3's agentic system design engagements typically run as 2-week sprints — producing a complete architecture spec, agent definitions, and HITL governance framework before a line of code is written.
