March 2026
The Three-Layer Model: Tools, Prompts, and Code
Why Most AI Implementations Fail at Architecture
Most enterprise AI deployments fail because they either provide no orchestration guidance or over-engineer rigid workflows that eliminate Claude's flexibility. The pattern is consistent: consultants deliver 120-page AI strategy reports that sit in drawers while teams resort to unstructured tool-calling that produces unreliable results.
Two architectural extremes dominate failed implementations. The first is "AI adoption theater" — organizations spend $500,000 on consulting reports that recommend comprehensive AI platforms without addressing specific business workflows. These reports identify opportunities but provide no practical implementation path. Teams are left throwing MCP tools at Claude with no structure, hoping it will figure out complex business processes through trial and error.
The second extreme is rigid orchestration frameworks that lock workflows into predetermined paths. Teams build elaborate state machines that remove Claude's reasoning capabilities, treating it as a glorified API router. According to the Anthropic Economic Index, 73% of enterprise AI failures stem from architectural decisions made in the first 90 days of implementation.
Generic "AI platform" solutions fail in mid-market contexts because they cannot encode the specific business knowledge that transforms generic tools into useful automation. The missing piece is not more technology — it is a clear separation of concerns between system operations, business logic, and execution control.
Layer 1 - Tools Are Atomic Operations
Tools are function calls with good names, clear descriptions, typed parameters, and filtered outputs. They form the atomic operations that Claude can compose into workflows, handling single, well-defined operations that talk to backend systems and return structured data.
An MCP tool does one thing: find_customer queries the CRM, get_invoice retrieves a specific invoice from accounting, check_project_budget returns current budget status. Each tool is deliberately limited in scope. Tools contain no business logic, no knowledge of other tools, and no awareness of user intent. They are pure interface adapters between Claude's function-calling mechanism and backend system APIs.
Tool quality depends entirely on metadata, not code complexity. A perfectly functional tool with a vague description will never be called correctly. A poorly-named tool will be invoked at inappropriate times, causing workflow failures. The tool description must specify exactly when to use it, what parameters are required, and what the output structure contains.
Well-designed tools follow the adapter pattern — thin translation layers that convert Claude's structured requests into backend API calls without adding logic. For example, a NetSuite integration tool receives a customer name from Claude, queries NetSuite's customer API, and returns structured customer data. It does not decide whether the customer is valid for a particular business process.
According to the Model Context Protocol specification, tools should be stateless and side-effect-free when possible. Read operations are inherently safer than write operations, and most business workflows start with data gathering before moving to actions that modify systems.
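To make the adapter pattern concrete, here is a minimal sketch of what a Layer 1 tool looks like: descriptive metadata that tells Claude when to call it, plus a thin handler that only translates the request. The `find_customer` name, the descriptor shape, and the in-memory CRM stub are illustrative assumptions, not a specific MCP SDK API:

```python
# Hypothetical tool definition: the metadata does the heavy lifting.
# A vague description here would cause the tool to be called incorrectly,
# regardless of how well the handler is implemented.
FIND_CUSTOMER_TOOL = {
    "name": "find_customer",
    "description": (
        "Look up a customer record in the CRM by company name. "
        "Use when the user references a customer and you need account "
        "details. Returns id, name, and status."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "company_name": {
                "type": "string",
                "description": "Exact or partial company name",
            },
        },
        "required": ["company_name"],
    },
}

# Stand-in for the real CRM backend.
_FAKE_CRM = {
    "acme corp": {"id": "C-1001", "name": "Acme Corp", "status": "active"},
}

def find_customer(company_name: str) -> dict:
    """Pure adapter: query the backend and filter the output.

    No business logic, no knowledge of other tools, no awareness of
    user intent -- just translation between Claude and the backend.
    """
    record = _FAKE_CRM.get(company_name.strip().lower())
    if record is None:
        return {"found": False}
    # Return only the fields Claude needs, not the raw backend payload.
    return {"found": True, **record}
```

Note that the handler returns a filtered, structured result rather than the raw API payload: filtering outputs is part of the tool's job as an interface adapter.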
Layer 2 - Structured Prompts Handle Business Logic
Structured prompts are reusable instructions that tell Claude which tools to use, in what order, with what logic, and what the output should look like. They transform generic tools into business-specific automation by encoding the knowledge that currently lives in team members' heads.
This is the missing middle layer that most teams either skip entirely or over-engineer. Skipping it means throwing tools at Claude with no guidance, producing inconsistent results. Over-engineering it means building rigid workflow engines that remove Claude's adaptive reasoning capabilities. Structured prompts provide the right balance — enough structure to ensure consistent business logic, enough flexibility for Claude to handle edge cases.
A structured prompt for invoice validation prescribes the sequence: extract vendor and amount, check approved vendor list, verify budget availability, determine appropriate approver based on amount thresholds, and generate approval summary. The prompt encodes business rules like "invoices over $25,000 require CFO approval" and "construction materials must include project code."
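A structured prompt of this kind can be stored as a plain template. The sketch below restates the rules from this section; the tool names referenced in it and the template format itself are illustrative:

```python
# A Layer 2 structured prompt: business rules encoded as reusable
# instructions, referencing Layer 1 tools by name.
INVOICE_VALIDATION_PROMPT = """\
You are validating a vendor invoice. Follow these steps in order:

1. Extract the vendor name, amount, and line-item description.
2. Call find_vendor to confirm the vendor is on the approved list.
   If the vendor is not found, stop and report the invoice as non-compliant.
3. Call check_project_budget to verify budget availability.
4. Determine the approver:
   - Amount over $25,000: route to the CFO.
   - Otherwise: route to the Project Manager.
5. Construction materials must include a project code; flag it if missing.
6. Output a short approval summary: vendor, amount, approver, open issues.
"""
```

The prompt is the asset: it can be reviewed by a business user, versioned, and refined without touching the tools it orchestrates.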
These prompts are stored assets that teams version, test, and iterate on. They represent encoded institutional knowledge — the procedures that experienced team members follow but that are rarely documented formally. Building a library of structured prompts is the primary deliverable in most Tenon engagements.
Storage options include dedicated prompt management systems, version-controlled templates, or simple database records. The key requirement is that prompts can be updated without code deployment, enabling business users to refine workflows as processes evolve.
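As one minimal sketch of the database-record option, a versioned prompt store can be a single table: updating a workflow inserts a new row, and applications load the latest version at request time, with no code deployment involved. The schema and function names here are illustrative:

```python
import sqlite3

# In-memory store for the sketch; a real deployment would use a shared database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE prompts (
        name TEXT, version INTEGER, body TEXT,
        PRIMARY KEY (name, version)
    )
    """
)

def save_prompt(name: str, body: str) -> int:
    """Store a new version of a named prompt; returns the new version number."""
    cur = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM prompts WHERE name = ?", (name,)
    )
    version = cur.fetchone()[0] + 1
    conn.execute("INSERT INTO prompts VALUES (?, ?, ?)", (name, version, body))
    return version

def load_prompt(name: str) -> str:
    """Fetch the latest version of a named prompt at request time."""
    cur = conn.execute(
        "SELECT body FROM prompts WHERE name = ? ORDER BY version DESC LIMIT 1",
        (name,),
    )
    return cur.fetchone()[0]
```

Keeping every version makes rollback trivial and gives an audit trail of how a workflow's business rules evolved.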
Layer 3 - Code Orchestration for High-Stakes Workflows
Code orchestration explicitly controls the sequence of operations when stakes are too high to trust probabilistic reasoning. Claude participates in specific reasoning steps but never makes rule-based decisions about workflow execution or business compliance.
In code orchestration, your application controls the deterministic business rules while Claude handles language understanding and generation. For example, in invoice processing, code always checks the approved vendor list — this is not a decision delegated to Claude. Claude extracts the vendor name from the invoice text and summarizes findings, but code enforces the compliance check.
This pattern eliminates the small but non-zero probability that Claude might skip a required step or misinterpret a business rule. When processing 200 invoices monthly, even a 1% error rate means two misrouted approvals. Code orchestration ensures deterministic execution of critical business logic.
async def validate_invoice(invoice_text: str, project_id: str):
    # Claude extracts structured data from unstructured text
    extraction = await claude.extract(
        invoice_text,
        schema={"vendor_name": str, "amount": float, "description": str},
    )

    # Code enforces business rules deterministically
    vendor = await netsuite.find_vendor(extraction.vendor_name)
    if not vendor:
        return f"Vendor '{extraction.vendor_name}' not found in approved list"

    if extraction.amount > 25000:
        approver = "CFO"
    else:
        approver = "Project Manager"

    # Claude synthesizes findings with business context
    summary = await claude.summarize(
        vendor=vendor,
        amount=extraction.amount,
        approver=approver,
    )
    return summary
The boundary is clear: Claude never decides whether to check vendor approval or which approver to route to. Those decisions are handled by deterministic code whose behavior does not vary with model reasoning.
How the Three Layers Work Together in Practice
The layers aren't mutually exclusive — a mature deployment uses Layer 1 for exploration, Layer 2 for repeatable processes, and Layer 3 for high-stakes workflows involving money or compliance. In practice, usage typically splits roughly 60% Layer 1, 30% Layer 2, and 10% Layer 3.
Layer 1 tools provide the foundation for all higher-layer patterns. A robust set of MCP tools enables both structured prompt orchestration and code-controlled workflows. Teams start with Layer 1 for ad-hoc data exploration, then graduate successful patterns into Layer 2 structured prompts for repeatability.
Layer 2 structured prompts reference Layer 1 tools by name, creating reusable workflows that business users can understand and modify. When a structured prompt workflow proves critical to operations, it migrates to Layer 3 code orchestration for guaranteed execution.
The Stakes Framework provides the decision model for layer selection. Low-stakes exploration uses Layer 1 tools directly. Medium-stakes repeatable processes use Layer 2 structured prompts. High-stakes workflows involving financial transactions, compliance requirements, or irreversible actions use Layer 3 code orchestration.
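The Stakes Framework decision can be written down directly. The two boolean inputs below are an illustrative simplification of the criteria described above:

```python
def select_layer(high_stakes: bool, repeatable: bool) -> int:
    """Map the Stakes Framework onto a layer choice.

    High-stakes work (financial transactions, compliance requirements,
    irreversible actions) always gets Layer 3 code orchestration.
    Repeatable medium-stakes processes get Layer 2 structured prompts.
    Everything else stays in Layer 1 ad-hoc tool use.
    """
    if high_stakes:
        return 3
    if repeatable:
        return 2
    return 1
```

The ordering matters: stakes dominate repeatability, so a repeatable workflow that touches money still lands in Layer 3.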
Migration between layers is common and expected. Teams often prototype workflows in Layer 1, standardize them in Layer 2, and harden critical paths in Layer 3. This progression allows rapid experimentation while ensuring production reliability.
Implementation Patterns and Common Mistakes
Most teams either over-engineer by jumping to Layer 3 code orchestration too early or under-engineer by staying in Layer 1 without building reusable Layer 2 assets. The key is graduating through layers as complexity and stakes increase rather than choosing one layer for all use cases.
The most common implementation mistake is premature Layer 3 adoption. Teams build complex orchestration code for workflows that could be handled reliably by structured prompts. This creates unnecessary maintenance overhead and reduces the flexibility that makes Claude valuable. Layer 3 should be reserved for workflows where business rules must be enforced deterministically.
The second common mistake is staying in Layer 1 exploration mode without building institutional knowledge. Teams use tools effectively for individual queries but never codify successful patterns into reusable structured prompts. This prevents scaling beyond individual power users.
Successful implementations start with a solid MCP tool foundation in Layer 1. Teams spend 2-3 weeks building comprehensive tools for their core business systems — NetSuite, Procore, Salesforce, or equivalent. Only after tools are stable do they begin building Layer 2 structured prompts for common workflows.
Version control strategies differ by layer. Layer 1 tools follow standard software development practices with Git repositories and CI/CD deployment. Layer 2 structured prompts need business-user-friendly versioning, often through database records with approval workflows. Layer 3 code orchestration returns to software development practices.
Testing approaches also vary by layer. Layer 1 tools need unit tests for API integration correctness. Layer 2 structured prompts need end-to-end tests with real business scenarios. Layer 3 code orchestration needs both unit tests for business logic and integration tests for Claude interaction.
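As an illustration of the Layer 3 case, the approval-routing rule from the earlier invoice example can be unit-tested without any model in the loop. The function name here is illustrative; the point is that deterministic business logic is ordinary code and gets ordinary tests:

```python
def route_approver(amount: float) -> str:
    """Deterministic routing rule under test (mirrors the $25,000 threshold)."""
    return "CFO" if amount > 25000 else "Project Manager"

def test_routing_threshold():
    # Boundary and representative cases; no Claude call involved.
    assert route_approver(25001) == "CFO"
    assert route_approver(25000) == "Project Manager"
    assert route_approver(100) == "Project Manager"

test_routing_threshold()
```

The Claude-facing pieces (extraction, summarization) are then covered separately by integration tests, keeping fast, deterministic tests decoupled from slower model calls.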
When This Model Breaks Down
The three-layer model handles 90% of enterprise AI scenarios effectively. For the remaining 10%, you'll need domain-specific modifications while preserving the core separation of concerns between tools, business logic, and execution control.
Multi-agent scenarios represent the most common exception. While rare in mid-market contexts, some workflows require multiple Claude instances with different specialized prompts working on parallel tasks. The three-layer model extends to multi-agent patterns by treating each agent as a separate Layer 2 or Layer 3 workflow with shared Layer 1 tools.
Real-time systems with strict latency requirements may need compressed layer architectures. Voice interfaces or live chat systems cannot afford the multi-turn overhead of structured prompts. These systems often collapse Layer 2 logic into highly optimized single-turn prompts or push more logic into Layer 3 code for speed.
Compliance frameworks in highly regulated industries sometimes require additional architectural layers. Financial services or healthcare organizations may need audit logging, approval workflows, or data lineage tracking that doesn't fit cleanly into the three-layer model. These requirements typically add governance layers above the core architecture.
Integration with existing enterprise orchestration systems — like workflow engines or BPM platforms — may require hybrid architectures. The three-layer model becomes a component within larger enterprise systems rather than the complete solution architecture.
Ready to Implement the Three-Layer Model?
The architectural decisions you make in the first 90 days of implementation determine whether your AI deployment succeeds or joins the 73% that fail due to poor structure. Questions about applying the three-layer model to your specific business context? Reach out to discuss your architecture decisions.