How agents think and work

Understand the reasoning process behind autonomous agents and learn how they plan, execute, and adapt their work.

Updated over 3 weeks ago

Now that you've created your first agent, you might be curious about what actually happens when an agent executes a task. Understanding how agents think and work will help you create better agents, set appropriate expectations, and troubleshoot situations when results don't quite match what you wanted.

When you watch an agent work in Ubby, you'll see two parallel views that together tell the complete story of what's happening. On the left side, you have the chat where the agent explains its thinking and progress in natural language. On the right side, you have the Agent's Computer, a real-time window into every action the agent takes. This dual view is like watching both someone's thought process and their hands at work simultaneously.


Understanding the Agent's Computer

The Agent's Computer is your window into the agent's actual work. While the chat on the left tells you what the agent is thinking about doing, the Agent's Computer on the right shows you what the agent is actually doing right now, in real time.

When an agent starts working on your task, the Agent's Computer opens automatically, and you'll see the agent organize its work into a structured task list. Each task represents a distinct piece of work that needs to be accomplished, and you can watch as the agent moves through this list systematically.

The beauty of the Agent's Computer is transparency. You're not left in the dark, wondering whether anything is happening. You can see exactly which task the agent is currently executing, which tools it's using, and what results it's getting. This visibility builds trust because you understand what's happening at every moment.

As the agent works, everything it produces gets stored in the task files. Every document created, every piece of data extracted, every intermediate result gets saved automatically. This means you can access any part of the agent's work, not just the final output. If the agent creates a draft before producing the final version, both are available to you. If the agent gathers data before analyzing it, you can see the raw data too.


How agents plan their work

The very first thing an agent does when you give it a task is create a plan. This planning phase is visible in both the chat and the Agent's Computer, and it's fascinating to watch because you can see the agent's intelligence at work.

When you ask an agent to create a presentation about autonomous AI agents, the agent doesn't immediately start making slides. Instead, it pauses to think through what this request actually requires. You'll see in the chat that the agent explains its understanding of your request, and then in the Agent's Computer, you'll watch as it builds a structured task list.

The agent breaks down your high-level request into concrete, actionable tasks. For the presentation example, you might see the agent create tasks like analyzing the topic to determine key themes, outlining the presentation structure, choosing an appropriate visual style, creating individual slides for each major point, and finally reviewing the complete presentation for consistency.

What makes this planning phase powerful is that the agent figures out these steps by understanding your goal, not by following a template. If you had asked for a different type of presentation or specified different requirements, the agent would create a completely different task list. The planning adapts to what you actually need.

In the Agent's Computer, you'll see this task list take shape with clear structure. Tasks are often grouped into logical phases like planning, creation, and finalization. Each task shows its status: waiting to start, currently in progress, or completed. As the agent works, you can track progress through this task list in real time.

The task list also shows you how many tasks the agent has created and how many it has completed. This progress tracking helps you understand how far along the work is and how much remains. For a complex task that generates twelve subtasks, you can watch the agent move from zero of twelve completed to twelve of twelve completed, seeing exactly where effort is being invested.
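Ubby doesn't expose this structure as code, but conceptually the task list behaves like a list of tasks, each carrying a status, with progress derived by counting completions. The sketch below is purely illustrative; the class names, status strings, and example tasks are assumptions, not Ubby's actual internals.

```python
from dataclasses import dataclass, field

# Hypothetical model of an agent's task list. All names here
# (Task, TaskList, the status values) are illustrative assumptions.

@dataclass
class Task:
    description: str
    status: str = "waiting"  # "waiting" -> "in_progress" -> "completed"

@dataclass
class TaskList:
    tasks: list[Task] = field(default_factory=list)

    def progress(self) -> str:
        # The "N of M completed" counter shown in the Agent's Computer.
        done = sum(1 for t in self.tasks if t.status == "completed")
        return f"{done} of {len(self.tasks)} completed"

plan = TaskList([Task("Analyze topic"), Task("Outline structure"),
                 Task("Create slides"), Task("Review presentation")])
plan.tasks[0].status = "completed"
plan.tasks[1].status = "in_progress"
print(plan.progress())  # "1 of 4 completed"
```

The key point the sketch captures is that progress is simply the ratio of completed tasks to total tasks, which is why the counter in the Agent's Computer maps so directly onto how far along the work is.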


The difference between general agents and custom agents

Understanding how agents plan reveals an important distinction between general agents and custom agents, and this difference significantly impacts how tasks get executed.

When you use a general agent (the one you interact with directly, without any special configuration), that agent has complete autonomy over planning. You describe what you want, and the agent decides entirely on its own how to break down the work, which tasks to create, and in what order to execute them. The agent analyzes your request and builds whatever task structure makes sense for accomplishing your goal.

This autonomy is incredibly powerful because it means the agent can handle requests it has never seen before. You're not limited to predefined workflows. The agent thinks through each unique situation and creates an appropriate plan. However, this also means some variability. Ask the same general agent to do the same thing twice, and it might create slightly different task lists each time, though both accomplish the goal.

Custom agents work differently. When you create a custom agent and define workflows in the configuration, you're taking control of the planning phase. Instead of letting the agent decide how to break down work, you specify exactly which tasks should happen and in what sequence. The workflow you define becomes the task list that the agent follows.

This controlled planning is valuable when you have a specific process that must be followed consistently. If your weekly email summary always needs to happen in exactly the same sequence of steps, creating a workflow ensures that consistency. Every time the agent runs, it executes precisely the tasks you defined in precisely the order you specified.

The trade-off is flexibility. A custom agent with a defined workflow cannot adapt its task structure to variations in your request. It follows the workflow you created. This is perfect for standardized processes where consistency matters more than flexibility, but it means you need to think carefully about what workflow to define.

Most users start with general agents to explore what's possible and understand how agents naturally break down different types of work. Once they identify a task that needs to happen repeatedly with consistent steps, they create a custom agent with a defined workflow to ensure that consistency.


Watching agents execute tasks

Once the planning phase completes, the agent begins executing tasks one by one. This execution phase is where you really see the agent's capabilities in action, and the Agent's Computer makes every step visible.

As the agent starts each task, you'll see the task status change in the Agent's Computer. The currently active task is clearly highlighted so you always know where the agent's attention is focused. In the chat, the agent explains what it's about to do, and then in the Agent's Computer, you watch it actually happen.

The agent selects appropriate tools for each task based on what needs to be accomplished. If the task requires creating a document, the agent uses document creation tools. If the task involves searching for information, the agent uses web search tools. If the task needs data from a connected service like Gmail, the agent uses the appropriate integration tools.

What's remarkable is that you can see the agent make these tool choices in real time. The Agent's Computer shows you which tool the agent is invoking, what parameters it's providing to that tool, and what results come back. This visibility helps you understand both what the agent is doing and how it's using the capabilities you've granted it.

Every action the agent takes produces some result, and those results get stored in the task files that you can access at any time. When the agent searches the web and finds information, that information is saved. When the agent creates a slide, that slide is saved. When the agent processes data, both the raw data and the processed results are saved. Nothing disappears. The complete trail of the agent's work is preserved.

This comprehensive file storage serves multiple purposes. Obviously, you need access to the final outputs of the agent's work. But having access to intermediate results is equally valuable. If you want to understand why the agent made certain choices, you can look at the data it was working with. If you want to reuse part of the agent's work in a different context, you can extract just that piece from the task files.


How tool selection shapes agent capabilities

The performance and capabilities of your agent are directly determined by which tools you give it access to. This connection between tools and capabilities is fundamental to understanding what agents can and cannot do.

Think of tools as an agent's skill set. A human worker with carpentry skills can build furniture, but that same worker cannot perform surgery without medical training. Similarly, an agent with document creation tools can create presentations and reports, but that same agent cannot send emails without access to email tools.

When you create an agent and select which tools to enable, you're defining the boundaries of what that agent can accomplish. If you enable the full set of default tools plus Gmail integration and Google Drive integration, your agent can search the web, create documents, send emails, and manage files. That's a broad skill set suitable for many knowledge work tasks.

However, more tools create more complexity for the agent. Each additional tool the agent has access to is another option it must consider when planning and executing work. An agent with access to one hundred different tools must analyze a much larger possibility space than an agent with access to ten tools.

This is why tool selection matters for performance. An agent with exactly the tools it needs for its specific purpose can work more efficiently and make better decisions than an agent drowning in irrelevant options. When you create a custom agent for weekly email summaries, you don't need to give it access to image editing tools or spreadsheet tools. Limiting the agent to just Gmail and Google Drive tools keeps it focused and effective.

The quality of the agent's decisions also correlates with the language model powering the agent. More capable language models make better choices about which tools to use and when. They better understand the nuances of your requests and the subtle differences between similar tools. When you select which language model to use for an agent, you're choosing the intelligence level that drives tool selection and task execution.

Your own request clarity matters too. When you give an agent a clear, specific task description, the agent can make confident tool choices. When your request is vague or ambiguous, the agent must guess at your intentions, and those guesses might lead to suboptimal tool selection. The clearer you are about what you want accomplished, the better the agent can choose appropriate tools.

Before you ask an agent to do something, it's worth checking that the agent has access to the tools needed for that work. If you want an agent to analyze a spreadsheet but haven't given it spreadsheet tools, the agent will struggle or fail. The Agent's Computer will show you the agent attempting the task with inadequate tools, helping you identify what's missing.
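That pre-flight check ("does this agent have the tools the task needs?") amounts to a simple set difference. The sketch below makes the idea concrete; the tool names and the `missing_tools` helper are hypothetical, chosen only to mirror the spreadsheet example above.

```python
# Hypothetical pre-flight check before delegating a task: compare the
# tools an agent has been granted against what the task requires.
# The tool names here are illustrative, not Ubby identifiers.

def missing_tools(granted: set[str], required: set[str]) -> set[str]:
    """Return the required tools the agent does not have."""
    return required - granted

agent_tools = {"web_search", "create_document", "gmail"}
task_needs = {"gmail", "spreadsheet"}

gap = missing_tools(agent_tools, task_needs)
if gap:
    print(f"Agent is missing tools: {sorted(gap)}")  # ['spreadsheet']
```

Running this check mentally before delegating saves a failed run: in the example, the agent can read Gmail but would struggle with the spreadsheet analysis, just as the paragraph above describes.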


Understanding task files and work preservation

Every task an agent works on generates files, and understanding this file system helps you make full use of the agent's work. The task files aren't just the final outputs. They're a complete record of everything the agent created, analyzed, or processed during execution.

When you look at the task files in the Agent's Computer, you'll see a workspace organized by the task. If the agent created a presentation, you'll see all the slide files, the outline file, the metadata about the presentation, and any reference materials the agent consulted. Everything lives in one accessible location.

This preservation of intermediate work is particularly valuable when you want to iterate on results. Suppose the agent creates a presentation and one slide isn't quite right. You can look at the task files to see what information the agent used to create that slide. You can see the agent's outline to understand its reasoning. You can even see earlier drafts if the agent revised its work. This visibility helps you give the agent specific feedback for improvement.

The task files also enable collaboration between multiple agents or multiple runs of the same agent. If you have one agent that gathers research and another agent that writes reports, the first agent's task files can become input for the second agent. The work flows naturally from one agent to another through the file system.

Some agents create outputs meant for immediate presentation, like a finished document or a prepared email. Other agents create outputs that are stepping stones toward further work, like a data analysis that will inform a decision or a content outline that will guide writing. The task files accommodate both types of outputs equally well.

You can download anything from the task files. If the agent created exactly what you need, download it and use it immediately. If the agent created something close to what you need, download it and refine it yourself. The files are yours to use however makes sense for your work.


How agents adapt when things don't go as planned

Even with perfect planning and appropriate tool selection, agents sometimes encounter situations where the expected approach doesn't work. Watching how agents adapt to these situations reveals their true autonomy.

When an agent attempts an action and something goes wrong, you'll see this happen in real time in the Agent's Computer. The agent tries to use a tool, the tool returns an error or unexpected result, and the agent must decide what to do next. This is where you see the agent's reasoning capability in action.

The agent first tries to understand what went wrong. If a file upload failed, is it because the file is too large? Because the target folder doesn't exist? Because the connection was interrupted? The agent examines the error message or unexpected result to diagnose the problem.

Based on that diagnosis, the agent adjusts its approach. If the folder doesn't exist, create the folder first and then try again. If the file is too large, try compressing it or splitting it into smaller pieces. If the connection was interrupted, wait a moment and retry. The agent doesn't give up at the first obstacle. It problem-solves.

You'll see this adaptation in both the chat and the Agent's Computer. In the chat, the agent might explain that the initial approach didn't work and it's trying an alternative. In the Agent's Computer, you'll see the agent create new subtasks or modify its approach within the current task. The task list itself can evolve as the agent learns more about what's needed.

However, adaptation has limits. An agent typically tries several different approaches before concluding that a task cannot be completed. If the agent attempts three or four different strategies and none succeed, it recognizes that it's stuck and reports the problem rather than continuing to fail repeatedly.

When an agent reports that it cannot complete a task, the explanation usually identifies what went wrong and why the agent couldn't work around it. This information helps you either fix the underlying problem or adjust your request to something the agent can accomplish with its available tools.
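The adapt-then-report behavior described above follows a recognizable pattern: try an approach, diagnose the failure, try an alternative, and after a few exhausted strategies, stop and report what went wrong rather than retrying forever. The sketch below illustrates that pattern with the file-upload example; the strategy names, errors, and the four-attempt limit are all assumptions for illustration.

```python
# Sketch of the adapt-then-report pattern: try alternative strategies
# in turn, record each diagnosis, and give up with an explanation once
# the attempts are exhausted. All specifics here are invented.

def run_with_fallbacks(strategies, max_attempts=4):
    failures = []
    for name, attempt in strategies[:max_attempts]:
        try:
            return attempt()
        except Exception as err:
            # Diagnose the failure, then move on to the next approach.
            failures.append(f"{name}: {err}")
    # All strategies exhausted: report rather than failing repeatedly.
    raise RuntimeError("Task could not be completed. Tried: "
                       + "; ".join(failures))

def upload_directly():
    raise IOError("file too large")

def compress_then_upload():
    return "uploaded compressed archive"

result = run_with_fallbacks([("direct upload", upload_directly),
                             ("compress first", compress_then_upload)])
print(result)  # "uploaded compressed archive"
```

Note that the final error message carries the accumulated diagnoses, which mirrors how an agent's failure report identifies what went wrong and why it couldn't work around it.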


Reading agent progress and status

As you watch an agent work, several indicators help you understand how things are going and how much longer the work might take. Learning to read these signals helps you use agents effectively.

The task counter in the Agent's Computer shows you the most direct progress indicator. When you see two of twelve tasks completed, you know the agent is about one-sixth of the way through its work. When you see ten of twelve completed, you know the agent is nearly done. This counter updates in real time as the agent completes each task.

The currently active task shows you what the agent is doing right now. If the task is labeled "Create slide about future outlook" and the agent has been working on it for several minutes, you know the agent is putting significant effort into that particular slide. Complex tasks naturally take longer than simple tasks.

Some agents display estimated time or progress bars for longer-running tasks. If the agent is processing a large file or generating complex content, you might see a progress indicator that helps you gauge how much of that specific task is complete.

The status indicators tell you the state of each task: waiting, in progress, completed, or failed. A failed task doesn't necessarily mean the agent gave up. The agent might have tried an approach that didn't work, marked that attempt as failed, and created a new task representing the alternative approach. Reading the task list as a whole tells you the story of how the work unfolded.

In the chat, the agent provides narrative updates about what it's doing and why. These updates complement the structured task information in the Agent's Computer. If you're ever confused about what the agent is trying to accomplish, the chat explanations clarify the agent's thinking.


What makes agent execution reliable

Understanding what makes agents reliable helps you create better tasks and choose appropriate work to delegate to agents. Several factors contribute to execution reliability.

  • Clear task descriptions help agents plan better and execute more reliably. When you tell an agent exactly what you want, including any constraints or preferences, the agent can build an appropriate plan. Vague requests lead to agents making assumptions that might not match your intentions.

  • Appropriate tool access ensures agents can actually do what you're asking. Before delegating work to an agent, verify that the agent has the tools needed for that work. An agent without the necessary tools will attempt the task anyway and likely fail, wasting time for both you and the agent.

  • Well-chosen language models improve decision quality throughout execution. If you're working on complex tasks that require sophisticated reasoning, use the more capable language models. For straightforward tasks, less capable models work fine and execute faster.

  • Focused tool sets help agents avoid confusion. Rather than giving every agent access to every possible tool, give each agent only the tools it needs for its specific purpose. This focus improves both execution speed and decision quality.

  • Reasonable task scope keeps work manageable. Agents handle discrete, well-defined tasks more reliably than enormous, open-ended assignments. If you have a very large piece of work, consider breaking it into multiple agent tasks rather than asking one agent to handle everything in a single execution.


Learning from watching agents work

One of the best ways to improve your use of agents is to watch them work and learn from what you observe. Each time you see an agent execute a task, you gain insight into how agents think and what they're capable of.

Pay attention to how agents break down your requests into task lists. Over time, you'll develop intuition for what constitutes a good unit of work for an agent. You'll learn to frame your requests in ways that lead to effective task decomposition.

Notice which tools agents choose for different types of work. This observation teaches you what each tool is good for and when it makes sense to use it. You'll start to understand the capability landscape of your available tools.

Watch how agents handle errors and obstacles. Seeing the agent adapt to unexpected situations shows you both what problems agents can solve themselves and what problems need your intervention. This understanding helps you calibrate how much autonomy to grant different types of agents.

Observe the time different tasks take. Some work that seems simple to humans takes agents quite a while, and some work that seems complex to humans gets done quickly. This timing intuition helps you set realistic expectations and plan when to use agents versus when to do work yourself.

Review the task files after agents complete work. Looking at what the agent created, what information it gathered, and how it organized its work teaches you about the agent's approach to problems. This knowledge helps you write better task descriptions and give more effective feedback.


What's next

Now that you understand how agents think and work, including the planning process, tool selection, task execution, and adaptation, you're ready to explore when and how to supervise agent work. The next article will cover supervised versus autonomous mode, helping you understand when to watch your agents work closely and when to let them run independently.

This foundation in agent thinking and execution will serve you well as you create more sophisticated agents. You can now predict how agents will approach tasks, anticipate what tools they'll need, and understand what you're seeing when you watch them work in the Agent's Computer. This understanding transforms agents from mysterious black boxes into transparent collaborators whose work you can follow, evaluate, and improve.
