Agents 101: Understanding the Fundamentals of AI Agents

Why you need to understand AI agents (even if you’re not a developer)

Recently, I started working as an AI specialist on an ambitious project: developing software that uses generative AI to render UI components directly on the user’s screen. It’s the kind of project that seems straight out of a futuristic story, but is becoming increasingly real.

However, while working with the team, I noticed something: despite generative AI being on the rise and widely discussed, many people still don’t understand the fundamentals of AI Agents. And that’s a problem.

Why? Because the entire team’s understanding is essential for the success of any AI solution. Functional professionals, for example, are the ones who bring the real demand, understand customer needs, and define the use cases that make a difference. Without a grasp of AI agent fundamentals, it’s difficult to design solutions that actually work.

With that in mind, I put together this post with an introduction to the basic concepts of AI agents, in accessible language, so that anyone can understand and contribute better to generative AI projects.

The Fundamental Equation: What is an AI Agent?

An AI agent can be defined in the following practical way:

AI Agent = LLM + Tools + Prompt (use case)

Each part of this equation is important, and we’ll break them all down below:

LLM is not the same as Agent

Here’s the first important revelation: an LLM (Large Language Model) alone is NOT an AI agent. Think of the LLM as a powerful brain that knows how to reason, plan, create structured texts, and answer complex questions. But, in the end, all this brain produces is just text.

It’s like having an extremely intelligent person locked in a room without a phone, without internet, without access to the outside world. They can think, analyze, create brilliant ideas… but they can’t act in the real world. An AI agent goes beyond: it can act.

In the early days of generative AI, when you chatted with ChatGPT and it only answered your questions, you were interacting with a simple LLM. But today, when you use an assistant that can schedule meetings on your calendar, send emails, query databases, and make decisions about which action to execute… then you’re interacting with an agent.

How LLMs Work: The Sophisticated Autocomplete

You can think of an LLM as an extremely sophisticated autocomplete. Just as your phone’s keyboard suggests the next word while you type a sentence, the LLM does the same, but at a much larger scale. It processes an enormous amount of text, identifies probabilistic relationships between words, and uses that information to predict the ones that come next, always taking into account what is inside its context.

The Importance of Context

Here’s a secret many people don’t know: every time you send a message to an LLM via API (or chat), it needs to receive ALL the conversation history again to give you a response.

Let me illustrate with an example:

You: “What is the capital of France?”

LLM: “Paris”

You: “How many inhabitants does it have?”

To answer the second question, the LLM needs to receive:

  1. Your first question (“What is the capital of France?”)

  2. Its answer (“Paris”)

  3. Your second question (“How many inhabitants does it have?”)

Only then can it understand that “it” refers to “Paris”. The LLM doesn’t have real memory — it simply receives the entire context again with each interaction. In fact, memory is controlled by the software that calls the LLM.
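To make this concrete, here is a minimal sketch of that exchange using the OpenAI Python client (any provider works the same way); the model name and the conversation are just illustrations, not part of any real project:

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

history = [{"role": "user", "content": "What is the capital of France?"}]
first = client.chat.completions.create(model="gpt-4o-mini", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})

# Second turn: the ENTIRE history goes back to the model,
# otherwise "it" would have no referent.
history.append({"role": "user", "content": "How many inhabitants does it have?"})
second = client.chat.completions.create(model="gpt-4o-mini", messages=history)
print(second.choices[0].message.content)

Notice that the model stores nothing between the two calls; the history list lives entirely in your software.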

What’s the consequence of this?

This means that the LLM’s knowledge is limited to what’s inside its context window. Everything you put there will end up influencing the response in one way or another. That’s why it’s very important to be careful with what we send:

  • Unnecessary information creates noise

  • Disorganized data confuses the model

  • A very large context increases costs and latency

  • A lean but complete context generates accurate responses

Tools: What Makes an Agent Powerful

Tools allow the agent to take concrete and external actions, giving it the ability to interact with the real world based on its own judgment.

What can be a tool?

Practically anything that allows the agent to do something: query a database, search the internet, create documents, access external APIs…

How Tools Work in Practice

Here’s something important: the LLM doesn’t execute tools directly. It asks the software to run a tool of its own choosing, and the software then returns the execution result to it. The flow looks like this:

  1. The LLM analyzes the situation

  2. The LLM requests the execution of a tool

  3. The software you built executes the tool

  4. The result is returned to the LLM

  5. The LLM uses the result to continue reasoning

For the LLM to be able to judge, understand, and use a tool, you need to describe each tool to it in natural language. That way, it understands when and how to use them.

See a real example:

{
  "name": "fetch_customer_data",
  "description": "Fetches complete registration information of a customer by CPF. Returns name, email, phone, address, and account status. Use this tool when you need to verify customer data or when the customer requests information about their registration.",
  "input": {
    "cpf": "string with 11 digits (numbers only, no dots or dashes)"
  },
  "output": {
    "name": "string",
    "cpf": "string",
    "email": "string",
    "phone": "string",
    "address": "string",
    "status": "string (active/inactive)"
  }
}

The LLM reads this and understands:

  • What the tool does

  • When it should use it

  • What it needs to provide (CPF)

  • What it will receive back

This functional description is important because the agent decides on its own when to use each tool based on the situation. For this, it needs a clear understanding of how it can use the tool.
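To show how the five steps above fit together in code, here is a minimal sketch of the loop, assuming the OpenAI Python client’s function-calling format and a fake local implementation of fetch_customer_data (the model name and the returned data are invented for illustration):

import json
from openai import OpenAI

client = OpenAI()

def fetch_customer_data(cpf: str) -> dict:
    # Step 3: the software you built executes the tool (here, a fake lookup).
    return {"name": "Maria Silva", "cpf": cpf, "email": "maria@example.com",
            "phone": "11999999999", "address": "Av. Paulista, 1000", "status": "active"}

tools = [{
    "type": "function",
    "function": {
        "name": "fetch_customer_data",
        "description": "Fetches a customer's registration data by CPF. "
                       "Use it when you need to verify customer data.",
        "parameters": {
            "type": "object",
            "properties": {"cpf": {"type": "string", "description": "11 digits, numbers only"}},
            "required": ["cpf"],
        },
    },
}]

messages = [{"role": "user", "content": "Is the customer with CPF 12345678901 active?"}]

# Steps 1 and 2: the LLM analyzes the situation and may request a tool.
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = fetch_customer_data(**args)  # Step 3: your software runs the tool
    messages.append(message)              # keep the tool request in the history
    messages.append({"role": "tool", "tool_call_id": call.id,
                     "content": json.dumps(result)})  # Step 4: result goes back
    # Step 5: the LLM uses the result to continue reasoning and answer.
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)

print(response.choices[0].message.content)

In a real agent this runs in a loop, since the model may request several tools in sequence before producing a final answer.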

Agent Optimization Tips

Now that we understand the fundamental parts of an agent, let’s talk about one of the most important concepts: Context Engineering. Below is a fundamental principle:

If the agent needs to accomplish an important task, give it ALL the necessary tools and knowledge, formatted to make its work as EASY as possible, but nothing beyond that.

This may seem obvious, but unfortunately it’s where most AI projects fail.

The Essential Checklist

Every time you’re designing an agent, ask these three questions:

1️⃣ Does the agent have the necessary information within its context?

If it needs to know the company name, service hours, return policies… is this in the system prompt?

2️⃣ If not, does it have tools to fetch this information?

If it doesn’t know the customer’s history in advance, can it query the CRM? If it doesn’t know all the products, can it search the catalog?

3️⃣ Is the information formatted optimally?

Is data structured and are descriptions clear? Is there additional data that doesn’t matter for the task and might confuse the LLM?

If your agent passes this sieve, it has everything it needs to succeed.
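As an illustration of the checklist in practice, here is a sketch of a lean system prompt for a hypothetical store’s support agent; the company name, hours, and policy are invented for the example:

SYSTEM_PROMPT = """You are the customer support agent for Loja Exemplo.
Service hours: Monday to Friday, 9am to 6pm (Brasília time).
Return policy: products can be returned within 30 days with the receipt.

You have two tools:
- fetch_customer_data: look up a customer's registration by CPF.
- search_catalog: search products by name or category.

Always check the customer's registration before discussing their order.
Answer in the customer's language, briefly and politely."""

Question 1 is covered by the fixed facts in the prompt, question 2 by the two tools, and question 3 by keeping the prompt short, structured, and free of anything the agent doesn’t need.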

Benefits of well-done Context Engineering:

Drastically reduces costs

Well-done context engineering reduces token consumption. For example: putting 200 tokens of essential information in the prompt can prevent the agent from making 3 or 4 unnecessary searches of 1,000+ tokens each, saving roughly 3,000 to 4,000 tokens (and several round trips) per conversation.

Agent gets less lost

With clear context and adequate tools, the agent knows exactly what to do. No need to “guess” or try multiple approaches.

Faster and more accurate responses

With the right context, the agent gets it right the first time. Happy user, efficient system.

Conclusion

Now you have the fundamental knowledge about how AI agents work. If you’re working on AI projects, use this knowledge to:

  • Question agent designs

  • Contribute better to technical discussions

  • Propose more effective use cases knowing about possibilities and limitations

Generative AI is changing how we work. But to take advantage of it well, we need to understand its fundamentals.


About the author: Micaelle is an AI agents specialist at Accenture Song, working on solution architecture and multi-agent systems. She works with ML, multi-cloud environments (AWS Bedrock, GCP Vertex AI, Azure OpenAI) and frameworks like LangChain and LangGraph, delivering enterprise AI solutions for finance and telecom clients.