REFERENCE ARCHITECTURE · MAY 2026

The Secure Agentic Stack.

A defensible architecture for deploying autonomous AI agents in regulated environments. Built by operators who run it in production.

// THE PROBLEM

Stateless agents.

Every session starts from zero. Users re-explain context. Token costs spiral. The agent never gets smarter at its job.

Security veto.

Private data access plus untrusted content plus external API calls plus persistent memory equals the exact attack surface security teams are now blocking. No security sign-off equals no production deployment.

Vendor lock-in.

The wrong agent runtime, memory layer, or model provider today is the migration tax tomorrow. Architecture decisions compound.

// THE STACK

Every action crosses four security perimeters before it touches your network. Inside the runtime, work splits across nine specialist roles, one per body part. Same architecture, different data per industry.

Cloud Tenant: identity, RBAC, SIEM, policy, data governance, observability, cost.

Private Network / VNet

Private endpoints at every data plane boundary

OpenShell / Sandbox: filesystem jail, network policy, credential broker, tool approval, audit, intent verification.

Agent Runtime

L0 Brain, L1 Hands, L2 Heart, L3 Session, L4 Badge, L5 Mouth, L6 Library, L7 Manager, L8 Receipt

Private endpoints: model APIs, storage, secrets, databases, search, queues, telemetry.

// SPECIALIST ROLES

Each layer, its body part, what it does, and why it matters:

  • L0, Brain: Reasoning and inference; routes by data sensitivity. The decision-maker: smart enough to know when to ask a smaller, local model instead of a frontier one.
  • L1, Hands: Tool use, MCP servers, external APIs, functions. Where the agent does things: books meetings, files forms, sends emails, runs code.
  • L2, Heart: The agent loop; ReAct cycles, planning, tool selection. The pulse: reasons, picks a tool, observes the result, repeats until the task completes.
  • L3, Session: Short-term context and working memory. The current conversation: state that persists during the session and gets archived at session end.
  • L4, Badge: Agent identity, RBAC, scopes, capability policies. The agent ID card: defines what it can touch, which secrets it can use, which client tenant it serves.
  • L5, Mouth: User-facing surfaces; trust-boundary-aware sessions. How the agent talks to humans and other systems: it knows the difference between main session, DM, and group chat.
  • L6, Library: Long-term memory with entity resolution. The accumulated knowledge that turns a stateless agent into a worker who learns the job over time.
  • L7, Manager: Multi-agent orchestration, queues, scheduled jobs. The conductor: coordinates specialists across parallel workflows so nothing falls through the cracks.
  • L8, Receipt: Per-agent observability and audit trail. Every prompt, response, tool call, and decision logged; audit-ready, replayable, exportable to your SIEM.

// COMPONENT CHOICES

Each layer has multiple proven options. Here are the defaults we run in production today, and the alternates we test against. Every layer is swappable.

Agent Runtime layer

  • Vendor option A: Letta, full runtime
  • Vendor option B: OpenClaw, open source
  • TAG AI default: JARVIS, custom, 21 named agents

Specialist agents work as a coordinated team, not a single bloated prompt.

Sandbox and Policy layer

  • Vendor option A: NeMo Guardrails, NVIDIA
  • Vendor option B: Custom container hardening
  • TAG AI default: NemoClaw pattern plus OpenShell

Compromised prompts cannot escape the sandbox. Egress to unapproved domains is blocked at the kernel level.
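The egress rule reduces to a hostname check against an explicit allowlist before any socket opens. The sketch below is illustrative of the pattern, not OpenShell's actual API; the domain list is hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; in production this comes from policy, not code.
EGRESS_ALLOWLIST = {"api.anthropic.com", "api.openai.com", "internal.example.net"}

class EgressDenied(Exception):
    pass

def check_egress(url: str) -> str:
    host = urlparse(url).hostname or ""
    if host not in EGRESS_ALLOWLIST:
        raise EgressDenied(f"blocked outbound call to {host!r}")
    return host  # approved; a credential broker would attach scoped secrets here

check_egress("https://api.anthropic.com/v1/messages")  # allowed domain passes
try:
    check_egress("https://attacker.example.org/exfil")
except EgressDenied:
    pass  # a compromised prompt cannot reach an unapproved domain
```

Enforcing this in the sandbox's network layer, rather than in agent code, is what makes it hold even when the prompt itself is hostile.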

Memory Engine layer

  • Vendor option A: Mem0, largest ecosystem
  • Vendor option B: Supermemory, lowest latency
  • TAG AI default: Hindsight plus Pinecone plus Supabase

30 to 40 percent lower token costs. Agents that get measurably better month over month.
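The Library's "entity resolution" means memories written under different surface names land on one canonical record. The normalizer below is a deliberately naive stand-in; production systems use embedding similarity plus a canonical entity table. All names here are hypothetical.

```python
from collections import defaultdict

# Naive alias table standing in for real entity resolution.
ALIASES = {"acme corp": "acme", "acme inc.": "acme"}

def resolve_entity(name: str) -> str:
    key = name.strip().lower()
    return ALIASES.get(key, key)

class MemoryStore:
    def __init__(self):
        self._facts = defaultdict(list)

    def remember(self, entity: str, fact: str) -> None:
        self._facts[resolve_entity(entity)].append(fact)

    def recall(self, entity: str) -> list[str]:
        return self._facts[resolve_entity(entity)]

mem = MemoryStore()
mem.remember("Acme Corp", "prefers Tuesday demos")
mem.remember("acme", "renewal due in Q3")
facts = mem.recall("Acme Inc.")  # both facts come back under the canonical entity
```

Recall by canonical entity is also where the token savings come from: the session loads a handful of resolved facts instead of replaying prior transcripts.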

Observability layer

  • Vendor option A: LangSmith
  • Vendor option B: Helicone or Phoenix
  • TAG AI default: Langfuse plus Sentry

Every prompt, response, and tool call is traceable and replayable when something goes wrong.
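The Receipt pattern is a single wrapper that every model call passes through, recording prompt, response, latency, and estimated cost before the caller sees the result. Field names and the pricing constant below are assumptions for the sketch, not Langfuse's actual schema.

```python
import json
import time

TRACE_LOG = []
COST_PER_1K_TOKENS = 0.003  # illustrative rate, not a real price

def traced_call(model_fn, prompt: str) -> str:
    start = time.perf_counter()
    response = model_fn(prompt)
    TRACE_LOG.append({
        "prompt": prompt,
        "response": response,
        "latency_s": round(time.perf_counter() - start, 4),
        "est_cost_usd": len(prompt.split()) / 1000 * COST_PER_1K_TOKENS,
    })
    return response

reply = traced_call(lambda p: f"echo: {p}", "summarize the Q3 pipeline")
replay = json.dumps(TRACE_LOG)  # the trace is serializable, hence replayable
```

Because the wrapper sits between the loop and the model, nothing can call a model without leaving a record, which is the property an auditor actually wants.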

Model and Infrastructure layer

  • Frontier APIs: Claude, GPT, Gemini
  • Local and regulated: Nemotron, Ollama, vLLM
  • TAG AI default: Hybrid sensitivity routing

Sensitive data never leaves your network. Best-in-class model performance for everything else.
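Sensitivity routing at the Brain reduces to: classify the request, then pick the model tier. The patterns and route names below are illustrative stand-ins; a real classifier would combine regex, DLP labels, and data-source provenance.

```python
import re

# Hypothetical sensitivity signals for the sketch.
SENSITIVE_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",      # SSN-shaped identifiers
    r"\bpatient\b|\bdiagnosis\b",  # health-context keywords
]

def route(prompt: str) -> str:
    """Return which model tier should handle this prompt."""
    if any(re.search(p, prompt, re.IGNORECASE) for p in SENSITIVE_PATTERNS):
        return "local"     # e.g. vLLM / Ollama inside the VNet
    return "frontier"      # e.g. a frontier API over a private endpoint

dest = route("Summarize the patient intake notes")   # sensitive: stays local
dest2 = route("Draft a blog post about E-Rate wins") # routine: frontier model
```

The fail-safe direction matters: when the classifier is unsure, default to the local tier, since a false "frontier" is a data-exposure event while a false "local" is only a quality cost.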

// HOW IT FLOWS

For your security review, here is exactly what crosses each boundary: every layer touched, every log generated.

  01. User sends message (Mouth, L5): auth event in identity provider.
  02. Identity and scope validated (Badge, L4): RBAC decision logged.
  03. Sandbox boundary enforced (OpenShell): network policy decisions and capability check logged.
  04. Long-term context retrieved (Library, L6): query trace logged.
  05. Sensitivity classified (Brain, L0): model routing decision logged.
  06. Agent loop reasons and acts (Heart plus Hands): ReAct trace and tool calls captured.
  07. Response generated (Receipt, L8): full prompt, response, and cost logged.
  08. Memory updated for next session (Library, L6): memory write event logged.
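Step 02 is worth making concrete: the Badge validates identity and scope before anything else runs, and the decision itself, allow or deny, is logged. Scope names and the policy table below are hypothetical.

```python
# Hypothetical capability policy: which scopes each agent identity holds.
SCOPES = {"agent-sales-01": {"crm.read", "email.send"}}
RBAC_LOG = []

def authorize(agent_id: str, scope: str) -> bool:
    allowed = scope in SCOPES.get(agent_id, set())
    # Log the decision either way; denials are audit events too.
    RBAC_LOG.append({"agent": agent_id, "scope": scope, "allowed": allowed})
    return allowed

ok = authorize("agent-sales-01", "crm.read")      # granted, and logged
denied = authorize("agent-sales-01", "db.write")  # refused, and still logged
```

Logging denials as first-class events is what turns the Badge from access control into audit evidence.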

// DEFENSIBILITY

Compliance

Every action crosses four trust boundaries. Every decision lands in your SIEM. Audit responses go from days to seconds.

No lock-in

Every layer is swappable. New frontier model? Update the Brain. Better memory framework? Swap the Library. The body metaphor is the abstraction that lets each part evolve.

Production tested

We run this stack on our own business: E-Rate consulting, real estate operations, sales pipelines. We deploy what we depend on.

Architecture reviewed. Stack validated. Ship it.

If you are evaluating AI agent deployments and your security team has questions you cannot answer yet, that is the conversation we are built for. Operator-grade architecture. Production-ready in weeks, not quarters.

Book an architecture review