
Agents are the new web apps. Architect them like web apps.

Every week someone asks us to build them "an agent" — usually imagined as a single prompt that does everything. Production agents look nothing like that. They look like the web apps you already know how to build.

Kushagra Kumar / Founder, Tech Nerve · March 2026 · 8 min

An agent that works in production is a state machine that happens to be driven by an LLM. Strip away the branding and you are left with familiar primitives: routing, authentication, validation, idempotency, retries, observability. If your instinct is to throw all of that into one megaprompt, that instinct will get you a demo, not a product.

The three-layer pattern

Every reliable agentic system we ship is built in three layers. Confuse them and the system becomes impossible to debug.

1. The planner

Decides what to do next given the current state. This is where you use your best model. Output is a structured action — not prose, not chain-of-thought rambling. A JSON object with a tool name, arguments, and a stopping condition. The planner never touches the outside world directly.
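As a rough illustration of what a structured action could look like, here is a minimal Python sketch. The field names (tool, arguments, done) and the parse_action helper are assumptions made for this example, not a fixed format; the point is only that anything other than a well-formed JSON action gets rejected before it reaches the executor.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PlannerAction:
    tool: str                  # name of the tool the executor should run
    arguments: dict[str, Any]  # arguments for that tool
    done: bool                 # stopping condition: True means the task is finished


def parse_action(raw: str) -> PlannerAction:
    """Parse the planner's raw output; reject anything that is not a well-formed action."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on prose
    if not isinstance(data, dict):
        raise ValueError("action must be a JSON object")
    tool = data.get("tool")
    arguments = data.get("arguments")
    done = data.get("done")
    if not isinstance(tool, str) or not tool:
        raise ValueError("'tool' must be a non-empty string")
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be an object")
    if not isinstance(done, bool):
        raise ValueError("'done' must be a boolean")
    return PlannerAction(tool=tool, arguments=arguments, done=done)


# Hypothetical planner output for a hypothetical "search_orders" tool.
action = parse_action(
    '{"tool": "search_orders", "arguments": {"customer_id": "c_123"}, "done": false}'
)
print(action.tool, action.arguments, action.done)
```

The parser only returns a value; it never calls a tool. That keeps the planner's output as data that the next layer can inspect.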

2. The executor

Deterministic code. Takes the planner's action, validates it against the tool schema, checks permissions, executes it, captures the result. Handles retries, timeouts, and the hundred other things a production system has to handle. This is the layer your SRE trusts.
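The executor's steps can be sketched roughly as follows. The registry layout, the lookup_order tool, the permission string, and the choice of which exceptions count as retryable are all assumptions for illustration. Timeouts are left out to keep the sketch short.

```python
import time
from typing import Any, Callable


class ActionRejected(Exception):
    """Raised before execution when an action fails validation or permission checks."""


def lookup_order(order_id: str) -> dict[str, Any]:
    # Hypothetical tool; a real one would call an order service.
    return {"order_id": order_id, "status": "shipped"}


# Hypothetical registry: tool name -> (argument types, required permission, function).
TOOLS: dict[str, tuple[dict[str, type], str, Callable[..., Any]]] = {
    "lookup_order": ({"order_id": str}, "orders:read", lookup_order),
}

RETRYABLE = (TimeoutError, ConnectionError)  # assumed to be transient failures


def execute(tool: str, arguments: dict[str, Any], permissions: set[str],
            max_attempts: int = 3, backoff_s: float = 0.0) -> dict[str, Any]:
    # 1. Validate the action against the tool schema.
    if tool not in TOOLS:
        raise ActionRejected(f"unknown tool: {tool}")
    schema, required_permission, fn = TOOLS[tool]
    # 2. Check permissions before touching the outside world.
    if required_permission not in permissions:
        raise ActionRejected(f"missing permission: {required_permission}")
    if set(arguments) != set(schema):
        raise ActionRejected(f"expected arguments {sorted(schema)}, got {sorted(arguments)}")
    for name, expected_type in schema.items():
        if not isinstance(arguments[name], expected_type):
            raise ActionRejected(f"argument {name!r} must be {expected_type.__name__}")
    # 3. Execute with bounded retries and capture the result.
    last_error: Exception | None = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"ok": True, "result": fn(**arguments), "attempts": attempt}
        except RETRYABLE as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(backoff_s * attempt)  # linear backoff between attempts
    return {"ok": False, "error": repr(last_error), "attempts": max_attempts}


result = execute("lookup_order", {"order_id": "o_1"}, permissions={"orders:read"})
print(result)
```

Rejections raise before anything runs, while runtime failures come back as a result the planner can read on the next iteration. That split is a design choice of this sketch, not a requirement.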

3. The memory

Structured state. Not a transcript. A record of what has been done, what has been tried, what succeeded, what failed — stored somewhere the planner can read it in the next iteration and somewhere a human can read it in the next post-mortem.
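One possible shape for that record, sketched with Python dataclasses. The class and field names here are hypothetical, and a real system would persist the state in a database rather than in process memory.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class StepRecord:
    step: int                  # 1-based position in the run
    tool: str                  # what was tried
    arguments: dict[str, Any]  # with which inputs
    outcome: str               # "succeeded" or "failed"
    detail: str                # short result summary or error message


@dataclass
class AgentMemory:
    session_id: str
    goal: str
    steps: list[StepRecord] = field(default_factory=list)

    def record(self, tool: str, arguments: dict[str, Any], outcome: str, detail: str) -> None:
        self.steps.append(StepRecord(len(self.steps) + 1, tool, arguments, outcome, detail))

    def failed_attempts(self) -> list[StepRecord]:
        # Lets the planner avoid repeating something that already failed.
        return [s for s in self.steps if s.outcome == "failed"]

    def to_json(self) -> str:
        # The same structure a human would read in a post-mortem.
        return json.dumps(asdict(self), indent=2)


memory = AgentMemory(session_id="sess_1", goal="refund order o_1")
memory.record("lookup_order", {"order_id": "o_1"}, "succeeded", "status=shipped")
memory.record("issue_refund", {"order_id": "o_1"}, "failed", "refund window closed")
print(memory.to_json())
```

The planner reads failed_attempts() on its next iteration, and the JSON dump serves the post-mortem. Both views come from the same record.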

Tools are APIs. Treat them like APIs.

Typed schemas for every argument. Validation before execution, not after. Permissions scoped to the user the agent is acting on behalf of. Rate limits, because a runaway agent can burn a credit card faster than a misconfigured cron job. Idempotency keys, because LLMs will retry the same operation three times and call it initiative.

JSON schema for every tool, generated from a single source of truth.
Runtime validation that rejects the invalid call before it executes, so the model's mistake never reaches the tool.
Per-tool permission and per-user rate limits.
Audit log of every invocation, keyed to the session and the user.
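To make the idempotency point concrete, here is a small sketch in Python. It assumes the key is derived from the session, the tool name, and the arguments, and it uses an in-memory dict as a stand-in for a durable store. The charge_card tool and the call_once helper are hypothetical.

```python
import hashlib
import json
from typing import Any, Callable

_results: dict[str, Any] = {}  # in-memory stand-in for a durable result store


def idempotency_key(session_id: str, tool: str, arguments: dict[str, Any]) -> str:
    # sort_keys makes the key independent of argument order.
    payload = json.dumps({"session": session_id, "tool": tool, "args": arguments},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def call_once(session_id: str, tool: str, arguments: dict[str, Any],
              fn: Callable[..., Any]) -> Any:
    """Run fn at most once per (session, tool, arguments); replay the stored result after that."""
    key = idempotency_key(session_id, tool, arguments)
    if key not in _results:
        _results[key] = fn(**arguments)
    return _results[key]


charges: list[tuple[str, int]] = []


def charge_card(customer_id: str, amount_cents: int) -> dict[str, int]:
    # Hypothetical side-effecting tool.
    charges.append((customer_id, amount_cents))
    return {"charge_number": len(charges)}


# The model "retries" the same operation three times; the card is charged once.
for _ in range(3):
    receipt = call_once("sess_1", "charge_card",
                        {"customer_id": "c_123", "amount_cents": 500}, charge_card)

print(len(charges), receipt)  # prints: 1 {'charge_number': 1}
```

Keying on the arguments also deduplicates a legitimately repeated identical call within the same session. Whether that is acceptable depends on the tool, so some systems have the caller supply an explicit key instead.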

Loops must be bounded

Agents that can call themselves will call themselves forever. Set an explicit max-iteration count, a max-cost budget, and a max-wall-clock limit. If any of them trips, stop and hand back to a human. Unbounded loops are the most common cause of production incidents we see in first-generation agent systems.
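A minimal sketch of a loop with all three limits, in Python. The step callable is a hypothetical stand-in for one plan-and-execute iteration, and the default limits are placeholders, not recommendations. Cost is tracked in integer cents to avoid floating-point drift.

```python
import time
from typing import Callable


def run_bounded(step: Callable[[], tuple[bool, int]],
                max_iterations: int = 10,
                max_cost_cents: int = 100,
                max_seconds: float = 60.0) -> dict:
    """Run step() until it reports done or any limit trips.

    step() performs one iteration and returns (done, cost_in_cents).
    """
    started = time.monotonic()
    spent = 0
    iterations = 0

    def outcome(status: str, reason: str) -> dict:
        return {"status": status, "reason": reason,
                "iterations": iterations, "cost_cents": spent}

    while iterations < max_iterations:
        # The wall clock is checked between steps; a single hung step is not interrupted.
        if time.monotonic() - started >= max_seconds:
            return outcome("handoff", "max_wall_clock")
        done, cost = step()
        iterations += 1
        spent += cost
        if done:
            return outcome("done", "finished")
        if spent >= max_cost_cents:
            return outcome("handoff", "max_cost")
    return outcome("handoff", "max_iterations")


# A step that never finishes and costs 40 cents per iteration trips the cost budget.
print(run_bounded(lambda: (False, 40)))
```

A "handoff" status is where a human takes over. The reason field records which limit tripped first.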

“A well-behaved agent is one that knows when to give up. Pick the constraint that trips first — it will be your savings account.”

Observability is non-optional

Log every planner decision, every tool call, every tool result, and every token cost, timestamped and correlated to a session ID. Langfuse, Arize, Braintrust: pick one and wire it in before you ship, not after. If you cannot answer "why did the agent do this on Tuesday at 3pm?" in under a minute, you have not shipped an agent. You have shipped a black box.
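To show the kind of event stream this implies, here is a vendor-neutral sketch in Python that writes one JSON line per event. It does not use the SDK of any of the platforms named above. The Tracer class, the event kinds, and the field names are assumptions for illustration, and a real deployment would forward these events to whichever platform you pick.

```python
import json
import sys
import time
import uuid
from typing import Any, TextIO


class Tracer:
    """Collects timestamped events, all correlated to one session id."""

    def __init__(self, session_id: str, sink: TextIO = sys.stdout) -> None:
        self.session_id = session_id
        self.sink = sink
        self.events: list[dict[str, Any]] = []

    def emit(self, kind: str, **fields: Any) -> dict[str, Any]:
        event = {
            "ts": time.time(),             # Unix timestamp of the event
            "session_id": self.session_id,
            "event_id": uuid.uuid4().hex,  # unique id for this event
            "kind": kind,                  # e.g. planner_decision, tool_call, tool_result
            **fields,
        }
        self.events.append(event)
        self.sink.write(json.dumps(event) + "\n")  # one JSON object per line
        return event


tracer = Tracer(session_id="sess_1")
tracer.emit("planner_decision", tool="lookup_order",
            arguments={"order_id": "o_1"}, tokens=412, cost_cents=1)
tracer.emit("tool_call", tool="lookup_order", arguments={"order_id": "o_1"})
tracer.emit("tool_result", tool="lookup_order", ok=True, duration_ms=38)

# Answering "why did the agent do this?" becomes a filter over events.
decisions = [e for e in tracer.events if e["kind"] == "planner_decision"]
print(len(decisions))
```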

The boring conclusion

The agentic systems that ship are boring. They route with deterministic code, delegate reasoning to a constrained planner, execute with typed tools, observe everything, and hand off to a human when they fail. The "wow" of the demo moves into the margins; the product that remains is reliable. Build for the reliable thing. That is what the user pays for.

Tagged: Agents, LLM, Architecture, Engineering