Building AI Agents That Actually Work


A few years ago everybody wanted chatbots. Then everybody wanted prompt engineering. Then everybody wanted RAG systems. Then everybody wanted reasoning models. Now everybody wants agents. And honestly, I get it. If LinkedIn is to be believed, AI agents are simultaneously replacing software engineers, generating millions in revenue, curing diseases, making coffee, and probably solving world hunger somewhere between standups and quarterly planning meetings.

There is just one small problem. Most people cannot actually explain what an AI agent is.

Some people call a chatbot an agent. Some people call every LLM connected to an API an agent. And honestly, most of these definitions are not very useful. The word has become so overloaded that two people can have a thirty-minute conversation about agents and realize at the end they were talking about completely different things.

So before we talk about building AI agents, let's first agree on what one actually is.

What Actually Is An AI Agent?

Let's start with the simplest example possible. You open ChatGPT and ask: “What is the capital of France?” It replies: “Paris.” That's it.

No tools. No workflows. No planning. No execution. You asked a question and received an answer. Congratulations, you just had a conversation with a chatbot. You did not use an AI agent.

And this distinction matters because people constantly blur the line between chatbots and agents. A chatbot answers questions. An agent executes workflows. Those sound similar on the surface, but they lead to very different architectures and very different expectations.

For me personally, an AI agent is not simply an AI model connected to a few APIs. An AI agent is an AI personality equipped with context, skills, MCPs, tools, and workflows that allow it to accomplish a specific goal.

Notice something important. The goal comes first. Without a goal, you do not really have an agent. You have a chatbot wearing a fake mustache pretending to have a job. And yes, that mental image is now stuck in your head. You're welcome.

This is where many discussions about AI agents immediately go off the rails. People start talking about models. They start talking about memory. They start talking about tool calling. They start talking about which framework is trending on Twitter this week. Meanwhile nobody has actually defined what the thing is supposed to do.

Imagine hiring an employee and introducing him like this:

This is Steve.
What does Steve do?
He has access to Jira.
Okay, but what does he do?
He can use Sentry.
Great, but what is his job?
He has memory.

That conversation would sound ridiculous.

Yet somehow this is how many AI agents get designed. People become obsessed with capabilities before they understand responsibilities. They focus on what the system can access instead of what the system is accountable for.

And that brings us to the first major mistake people make.


The Biggest Mistake People Make

Most people start with the agent. They should start with the workflow. And you might be thinking: “Isn't that basically the same thing?” Well, not really. Imagine somebody comes up to you and says: “I want to build a Jira agent.” Cool. Why? What problem is it solving? What workflow is it improving? What responsibility does it have? What outcome are we actually trying to achieve?

Most of the time, nobody knows. The thought process starts and ends with: “Agents are cool, therefore I need one.” That is roughly equivalent to saying: “I want to build a microservice.” For what? Nobody knows. But apparently we're building it.

A much better starting point would be something like: “I want new Jira tickets analyzed automatically so engineers receive implementation suggestions before they start working.”

Now we have something useful. Now we have a goal. And once we have a goal, the architecture starts becoming obvious. The workflow determines the tools. The tools determine the MCPs. The MCPs influence the skills. The skills shape the agent.

Not the other way around. This is one of those ideas that sounds simple until you realize how often people ignore it. They spend weeks designing agents and almost no time defining workflows. Then they wonder why the system feels inconsistent, unreliable, or impossible to evaluate.

If there is one thing I want you to remember from this article, it is this. Don't start by asking: “What agent should I build?” Start by asking: “What workflow am I trying to improve?” The answer to that question will determine almost every architectural decision that follows. Because AI agents are not magic. They are software systems. And software systems require architecture.
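To make the workflow-first idea a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `WorkflowSpec` class, its fields, and the `problems` check are illustrative names, not part of any framework. It only shows what writing down the workflow before the agent could look like.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowSpec:
    """Hypothetical description of a workflow, written before any agent exists."""

    goal: str
    trigger: str
    steps: list[str]
    required_tools: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return the reasons this spec is not ready to become an agent."""
        issues = []
        if not self.goal.strip():
            issues.append("no goal defined")
        if not self.steps:
            issues.append("no workflow steps defined")
        if self.required_tools and not self.steps:
            issues.append("tools listed before the workflow that needs them")
        return issues


# "I want to build a Jira agent": a tool, but no goal and no workflow.
jira_agent = WorkflowSpec(goal="", trigger="", steps=[], required_tools=["jira"])

# The better starting point from the text above.
ticket_analysis = WorkflowSpec(
    goal="Engineers receive implementation suggestions before they start a ticket",
    trigger="new Jira ticket created",
    steps=["read ticket", "analyze codebase", "suggest approaches"],
    required_tools=["jira", "code search"],
    out_of_scope=["setting priorities"],
)

print(jira_agent.problems())
print(ticket_analysis.problems())
```

In this sketch the "Jira agent" fails every check, while the ticket-analysis workflow passes, and its tool list falls out of its steps rather than the other way around.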

The Building Blocks Of A Useful Agent

If you read my previous article, some of these concepts will already sound familiar. That's because useful agents are usually built from the same core building blocks. The difference is not the components themselves. The difference is how we combine them and how well they align with the workflow we're trying to improve.

Let's walk through them one by one.

Context

Context remains one of the most important parts of any successful agent. Without context, every conversation starts from zero. The AI has no understanding of your architecture, coding standards, business rules, workflows, project structure, documentation, or design philosophy. Every interaction becomes a guessing game.

Think about hiring a senior engineer and refusing to answer any onboarding questions. You don't explain the architecture. You don't explain the business. You don't explain the team's conventions. Then a week later you get angry because they made bad decisions.

That's exactly how many people work with AI. Good context improves consistency. Bad context creates chaos. Now, at this point, some people take this idea and immediately decide the solution is to dump every document ever created into the context window.

Sounds reasonable, right? Unfortunately, not really.

More context does not automatically mean better context. At some point the signal gets diluted. Instructions start conflicting. Important information becomes harder to find. The AI becomes slower. Reasoning quality starts degrading. Your tactical soldier starts eating crayons. And that's usually a sign things have gone wrong.
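One way to picture "better context, not more context" is a selection step that runs before anything reaches the model. The sketch below is an assumption-heavy toy: `ContextDoc`, `select_context`, and the example documents are made up, tag overlap is a crude stand-in for real relevance scoring, and character count is a stand-in for tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextDoc:
    name: str
    tags: frozenset[str]
    text: str


def select_context(docs, task_tags, max_chars):
    """Pick documents sharing tags with the task, most relevant first, within a size budget."""
    scored = [(len(doc.tags & task_tags), doc) for doc in docs]
    relevant = [(score, doc) for score, doc in scored if score > 0]
    # Highest overlap first; ties broken by name so the result is deterministic.
    relevant.sort(key=lambda pair: (-pair[0], pair[1].name))

    chosen, used = [], 0
    for _, doc in relevant:
        if used + len(doc.text) <= max_chars:
            chosen.append(doc)
            used += len(doc.text)
    return chosen


docs = [
    ContextDoc("coding-standards", frozenset({"python", "style"}), "x" * 400),
    ContextDoc("payments-architecture", frozenset({"payments", "architecture"}), "x" * 600),
    ContextDoc("office-party-notes", frozenset({"social"}), "x" * 300),
    ContextDoc("payments-runbook", frozenset({"payments", "oncall"}), "x" * 500),
]
task_tags = frozenset({"payments", "architecture", "python"})

selected = select_context(docs, task_tags, max_chars=1000)
print([doc.name for doc in selected])
```

The irrelevant notes never make it in, and the budget forces a choice between the remaining documents instead of dumping all of them into the window.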

Skills

Now let's talk about skills.

Skills are reusable operational capabilities. Instead of explaining the same workflow over and over again, we package knowledge into repeatable systems that can be reused across tasks.

Examples include:

Code review workflows
Security analysis
Documentation generation
Migration planning
API design reviews
Debugging procedures

Without skills, many AI conversations end up looking like this:

Please do the thing.
No, not like that.
Still wrong.
Why are you editing unrelated files?
Please stop touching production.

Skills dramatically reduce this chaos because they give the agent a repeatable way of approaching a problem. Instead of reinventing the process every time, the agent follows a known pattern. Consistency improves. Reliability improves. The amount of keyboard-throwing decreases significantly.

Which is good for both productivity and monitor longevity.
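As a rough illustration of packaging a procedure once and reusing it, here is a small Python sketch. The `Skill` class and the code-review steps are hypothetical; real skill formats vary by tool, and this only shows the idea of a named, ordered procedure with guardrails that gets rendered the same way every time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical reusable procedure: ordered steps plus rules the agent must respect."""

    name: str
    steps: tuple[str, ...]
    guardrails: tuple[str, ...] = ()

    def render(self, task: str) -> str:
        """Turn the skill into instructions for one concrete task."""
        lines = [f"Skill: {self.name}", f"Task: {task}", "Steps:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.guardrails:
            lines.append("Guardrails:")
            lines += [f"  - {rule}" for rule in self.guardrails]
        return "\n".join(lines)


code_review = Skill(
    name="code-review",
    steps=(
        "Read the diff",
        "Check that tests cover the change",
        "Flag edits to files unrelated to the ticket",
    ),
    guardrails=("Do not modify any files", "Do not touch production"),
)

print(code_review.render("Review the billing pull request"))
```

The same `code_review` object can be rendered for every pull request, so the process is written down once instead of being re-explained in each conversation.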

MCPs

Most people think MCPs (Model Context Protocol servers) exist to connect AI to tools like Jira or Sentry. And yes, they absolutely can do that. But personally, I think some of the most powerful MCPs are the ones that expand an agent's capabilities rather than simply exposing another API.

A great example is CodeGraph.

Instead of searching through a codebase using grep and pure optimism, CodeGraph creates a graph representation of your repository. Now the AI can understand relationships between files, classes, modules, and functions. Instead of blindly searching for text matches, it can reason about structure.

This is where MCPs become really interesting. Not because they connect AI to tools. Because they expand what the AI is capable of doing. The difference might sound subtle. It isn't. One gives access. The other gives capability. And capability tends to create far more leverage than access alone.
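To show why a graph gives capability rather than just access, here is a toy Python example. It is not CodeGraph's API or data model, which this article does not describe. The module names and the `dependents_of` helper are invented for illustration. The question it answers, "what is affected if I change this module?", is one that a plain text search cannot answer directly.

```python
from collections import deque

# Hypothetical dependency edges: module -> modules it imports.
DEPENDS_ON = {
    "api.checkout": ["services.payments", "services.cart"],
    "services.payments": ["clients.gateway", "models.order"],
    "services.cart": ["models.order"],
    "jobs.refunds": ["services.payments"],
    "clients.gateway": [],
    "models.order": [],
}


def dependents_of(target, depends_on):
    """Return every module that directly or transitively depends on target."""
    # Invert the edges so we can walk from a module to whoever imports it.
    reverse = {}
    for module, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(module)

    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


print(dependents_of("clients.gateway", DEPENDS_ON))
```

A text search for `clients.gateway` would only find `services.payments`. Walking the graph also surfaces `api.checkout` and `jobs.refunds`, which never mention the gateway by name.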

Tools

Tools are where agents stop being observers and start becoming participants. Without tools, an agent can only think. With tools, an agent can interact with systems, gather information, and execute parts of a workflow.

Examples include:

Read Jira tickets
Analyze Sentry issues
Search repositories
Review pull requests
Inspect documentation
Execute workflows

And this is a massive difference.

Because the most useful agents are not the ones producing the smartest sounding answers. They are the ones helping us get actual work done. A beautifully written explanation is nice. A workflow that saves your team ten hours every week is better.
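At its simplest, a tool is a named function the agent is allowed to call. The sketch below assumes a hypothetical `ToolRegistry` and two stub tools that return hard-coded stand-in data. No real Jira or repository API is involved.

```python
from typing import Any, Callable


class ToolRegistry:
    """Maps tool names to plain Python callables the agent may invoke."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str):
        """Decorator that exposes a function to the agent under the given name."""
        def decorator(fn):
            self._tools[name] = fn
            return fn
        return decorator

    def names(self) -> list[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.register("read_ticket")
def read_ticket(ticket_id: str) -> dict:
    # Stand-in data; a real tool would query the ticket system here.
    return {"id": ticket_id, "title": "Example ticket", "status": "open"}


@registry.register("search_repository")
def search_repository(query: str) -> list[str]:
    # Stand-in data; a real tool would search an actual repository.
    fake_files = ["billing/invoice.py", "billing/tax.py", "users/profile.py"]
    return [path for path in fake_files if query in path]


print(registry.names())
print(registry.call("search_repository", query="billing"))
```

Keeping the registry explicit also makes it easy to see exactly what the agent can and cannot do, which matters once tools start changing things instead of only reading them.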

The Memory Trap

Now let's talk about one of the most overhyped topics in the entire AI ecosystem. Memory. Every week somebody is building a revolutionary memory system. Every week somebody is storing everything. Every week somebody is putting every conversation into a vector database and hoping magic happens.

And honestly? Most agent memory implementations are terrible. Why? Because people assume more memory automatically creates a smarter agent. It doesn't. Imagine every thought you've ever had being permanently stored and injected into every future conversation. Every bad idea. Every outdated assumption. Every random thought from three years ago that made sense for approximately six minutes.

Sounds awful, right? That's exactly what many memory systems do. Over time memory becomes outdated, contradictory, irrelevant, redundant, and confusing. It slowly turns into the junk drawer everybody has somewhere in their home. Nobody knows what's inside. Nobody wants to clean it. Yet somehow new things keep getting thrown into it.

The biggest danger here is context poisoning. Bad memories create bad assumptions. Bad assumptions create bad decisions. Bad decisions create bad outputs. And the scary part is that the degradation is gradual. The system doesn't suddenly fail. There is no dramatic explosion. No alarms. No raccoon gaining access to production and deleting customer records. It just slowly becomes worse. Which makes it significantly harder to notice.
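One modest defense against the junk drawer is to clean it on a schedule. The Python sketch below is a simplified assumption of what that could look like: the `Memory` record, the `prune` function, and the sample entries are all hypothetical, and real systems would need something smarter than "newest entry per key wins".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Memory:
    key: str
    value: str
    recorded_at: datetime


def prune(memories, now, max_age):
    """Drop entries older than max_age, then keep only the newest entry per key."""
    newest = {}
    for memory in memories:
        if now - memory.recorded_at > max_age:
            continue  # outdated: do not inject this into future context
        current = newest.get(memory.key)
        if current is None or memory.recorded_at > current.recorded_at:
            newest[memory.key] = memory  # a newer fact supersedes the older one
    return sorted(newest.values(), key=lambda m: m.key)


memories = [
    Memory("deploy_target", "Heroku", datetime(2021, 6, 1)),
    Memory("deploy_target", "Kubernetes", datetime(2024, 11, 1)),
    Memory("test_runner", "unittest", datetime(2024, 3, 1)),
    Memory("test_runner", "pytest", datetime(2024, 10, 1)),
]

kept = prune(memories, now=datetime(2025, 1, 1), max_age=timedelta(days=365))
print([(m.key, m.value) for m in kept])
```

Here the three-year-old deployment fact is dropped for age, and the contradictory test-runner entry is replaced by the newer one, so the agent sees one current answer per question instead of a pile of conflicting ones.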

Example: A Sentry Analysis Agent

Before we continue, I want to make one thing clear. This is not a tutorial. I am intentionally simplifying these examples because the goal is to demonstrate workflows, not implementation details. Let's imagine we want to create a Sentry analysis agent.

The workflow could look something like this. Every morning the agent runs automatically. It reads newly reported Sentry issues, groups duplicate errors, analyzes stack traces, searches the codebase, identifies potentially related files, generates possible root causes, proposes potential fixes, and finally creates a report for the engineering team.

Notice something important. The agent is not fixing production, deploying code, or replacing engineers. The agent is helping engineers spend less time investigating. This distinction matters because many successful agents do not replace work. They reduce repetitive work. They remove friction. They accelerate workflows that humans already perform.

And those are often the highest ROI opportunities you'll find. Not because they're flashy. Because they're useful.
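In the same simplified spirit, two steps of that workflow (grouping duplicate errors and producing a report) can be sketched in a few lines of Python. The issue dictionaries, the `group_issues` and `build_report` helpers, and the fingerprint rule (error type plus top stack frame) are assumptions for illustration. This is not how Sentry groups issues and it does not call the Sentry API.

```python
from collections import defaultdict


def group_issues(issues):
    """Group issues that share an error type and top stack frame."""
    groups = defaultdict(list)
    for issue in issues:
        fingerprint = (issue["type"], issue["frames"][0])
        groups[fingerprint].append(issue["id"])
    return dict(groups)


def build_report(groups):
    """Render grouped issues as a plain-text report, most frequent first."""
    lines = ["Daily error report"]
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
    for (error_type, frame), ids in ordered:
        lines.append(
            f"- {error_type} at {frame}: {len(ids)} occurrence(s), ids {', '.join(ids)}"
        )
    return "\n".join(lines)


# Made-up sample data standing in for newly reported issues.
new_issues = [
    {"id": "E1", "type": "KeyError", "frames": ["orders/views.py:88", "orders/urls.py:12"]},
    {"id": "E2", "type": "TimeoutError", "frames": ["payments/client.py:41"]},
    {"id": "E3", "type": "KeyError", "frames": ["orders/views.py:88", "core/middleware.py:7"]},
]

report = build_report(group_issues(new_issues))
print(report)
```

Note what the sketch stops at: a report for the engineering team. Nothing in it fixes, deploys, or decides anything.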

Example: A Jira Planning Agent

A Jira planning agent follows a very similar philosophy.

When a new ticket appears, the agent can:

Read the ticket
Understand the requirements
Analyze the codebase
Identify affected systems
Highlight risks
Suggest implementation approaches
Generate an implementation proposal

Now notice what it is not doing. It is not generating the entire PRD. It is not making architectural decisions. It is not deciding priorities. It is not determining business strategy. Those responsibilities still belong to humans.

The agent simply accelerates the early stages of the process. It gathers information, organizes context, identifies risks, and provides recommendations. The human remains responsible for strategic decisions.

And honestly, those early stages are often some of the most repetitive parts of software development anyway.
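The boundary between what the agent does and what stays with humans can be written down explicitly instead of living in someone's head. This is a hypothetical sketch: the action names and the `route` function are invented, and the only design choice it demonstrates is an allowlist where anything unlisted defaults to a human.

```python
# Actions the planning agent is allowed to perform (taken from the list above).
AGENT_MAY = frozenset({
    "read_ticket",
    "analyze_codebase",
    "identify_affected_systems",
    "highlight_risks",
    "suggest_approach",
    "draft_proposal",
})


def route(action: str) -> str:
    """Return who is responsible for an action: the agent or a human."""
    # Default-deny: anything not explicitly allowed goes to a human.
    return "agent" if action in AGENT_MAY else "human"


for action in ["highlight_risks", "decide_architecture", "set_priority"]:
    print(action, "->", route(action))
```

Because the default is "human", adding a new strategic decision to the process never silently hands it to the agent.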

Research. Plan. Implement.

One framework I have found particularly useful is something called RPI:

Research.
Plan.
Implement.

Sounds simple. Because honestly, it is. But most people completely butcher it. What usually happens looks something like this:

Start implementing.
Context explodes.
Architecture drifts.
The AI hallucinates.
Everyone cries.

Instead, separate the workflow. Start with research. Use agents to gather information, analyze the codebase, identify constraints, understand dependencies, and build context around the problem.

Then move into planning. Define responsibilities, evaluate approaches, identify risks, compare tradeoffs, and break work into manageable pieces. Only then move into implementation.

This is where agents become incredibly powerful because implementation is usually the most tactical phase of the process. And AI excels at tactical work.

This is a theme you'll hear me repeat over and over again because I genuinely think it's one of the most important mental models for working with AI. The more strategic the decision becomes, the more human involvement should increase.

The more tactical the work becomes, the more leverage AI can provide. Remember the military analogy from my previous article. You are the strategist. The agents are tactical workers. Do not outsource strategic thinking. Outsource repetitive execution.
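One possible way to enforce the separation between phases is a gate that refuses to advance until a human has approved the current phase's output. The sketch below is an assumption, not a standard RPI implementation: `Phase`, `RpiRun`, and its methods are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    RESEARCH = 1
    PLAN = 2
    IMPLEMENT = 3


class RpiRun:
    """Tracks one task through Research, Plan, Implement with human approval gates."""

    def __init__(self) -> None:
        self.phase = Phase.RESEARCH
        self.artifacts: dict[Phase, str] = {}
        self.approved_by: dict[Phase, str] = {}

    def record(self, artifact: str) -> None:
        """Store the agent's output for the current phase."""
        self.artifacts[self.phase] = artifact

    def approve(self, reviewer: str) -> None:
        """A human signs off on the current phase's output."""
        if self.phase not in self.artifacts:
            raise RuntimeError(f"nothing to approve for {self.phase.name}")
        self.approved_by[self.phase] = reviewer

    def advance(self) -> None:
        """Move to the next phase, but only after approval."""
        if self.phase not in self.approved_by:
            raise RuntimeError(f"{self.phase.name} has not been approved by a human")
        if self.phase is Phase.IMPLEMENT:
            raise RuntimeError("already in the final phase")
        self.phase = Phase(self.phase.value + 1)


run = RpiRun()
run.record("Findings: the billing module depends on the order model")
run.approve("lead engineer")
run.advance()
print(run.phase.name)
```

Skipping straight to implementation raises an error here, which is the whole point: the strategic checkpoints stay with the human, and the agent only moves on once they have been passed.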

Automations

This is where things start becoming really interesting. Because now we can combine context, skills, MCPs, tools, and workflows with automations. Suddenly agents are not waiting for you to ask questions anymore. They can execute recurring workflows automatically and continuously.

Examples include:

Daily Sentry analysis
Jira ticket reviews
Documentation audits
Security reviews
Repository health reports
Pull request summaries

And once you start thinking this way, you realize something interesting. Most useful agents are not giant autonomous super-intelligences. They are well-designed workflows with clearly defined responsibilities, good context, reliable tooling, and sensible boundaries.

Not particularly sexy. But incredibly useful. And if we're being honest, useful usually pays the bills better than sexy architecture diagrams.
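Under the hood, an automation can be as plain as a table of workflows and the days they run. The following Python sketch is hypothetical throughout: the `Automation` record, the `due_today` helper, and the chosen schedules are illustrative, and in practice a cron job or CI scheduler would typically do the triggering.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Automation:
    name: str
    weekdays: frozenset[int]  # 0 = Monday ... 6 = Sunday, matching date.weekday()


AUTOMATIONS = [
    Automation("Daily Sentry analysis", frozenset({0, 1, 2, 3, 4})),
    Automation("Repository health report", frozenset({0})),
    Automation("Documentation audit", frozenset({4})),
]


def due_today(automations, today):
    """Return the names of the automations scheduled for the given date."""
    return [a.name for a in automations if today.weekday() in a.weekdays]


monday = date(2025, 1, 6)
saturday = date(2025, 1, 4)
print(due_today(AUTOMATIONS, monday))
print(due_today(AUTOMATIONS, saturday))
```

Nothing about this is autonomous or intelligent. It is a schedule, which is roughly the level of glamour most useful automations operate at.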

Final Thoughts

AI agents are currently surrounded by an absurd amount of hype. And honestly, some of it is deserved. They are genuinely useful. They can save time, automate repetitive work, improve engineering workflows, and create meaningful leverage for teams. But only when they are built properly.

Because at the end of the day, AI agents are not magic. They are software systems. And software systems require architecture. The agents we discussed here (a Sentry agent, a Jira agent, a planning agent, a reporting agent) are just examples.

The real limitation is not the technology. It is your ability to identify workflows worth improving. That's the skill that matters most. Not prompt engineering tricks. Not the latest framework. Not whichever AI startup just raised an alarming amount of money to reinvent task management.

The next useful agent probably will not come from asking: “What cool AI thing can I build?” It will come from asking: “What repetitive problem keeps wasting my time?”

Start there. Define the workflow. Understand the responsibility. Design the architecture around the goal. Everything else follows from that.

