
Copilot Studio and computer use

A few weeks ago, Microsoft released a new tool for Copilot Studio for everyone to try out. The new computer use tool allows agents to use websites independently.

But why on earth should an agent be allowed to use websites?

Because a process the agent is working on often includes a step that has no Power Platform connector, API, MCP server, or any other modern way to complete it. Now the agent’s work does not have to stop there. With this new feature, the agent can navigate to the required web page and complete the task there on its own. In the future, the agent will also be able to use Windows applications in the same way.

Agent + computer use vs RPA (Robotic Process Automation)

A computer program that performs predefined tasks using a browser or a Windows application. Isn’t this RPA? This has been done for decades. What’s so revolutionary about this?

There is a significant difference between RPA and an agent using a computer.

  • In RPA, a software robot executes predefined scripts. Every button press, cursor movement in a text field, etc. required to complete a task is included in the script. If the user interface used by RPA changes, these scripts always have to be updated. If a new unforeseen error occurs during execution, the software robot is unlikely to recover from it.
  • The agent’s computer use is based on artificial intelligence. It is given a task (for example, “book movie tickets for two from the website for an action movie in Helsinki on Thursday evening”) and the agent decides for itself how to complete this task using the computer. Of course, the agent’s operation can be made more efficient by telling it the easiest way to complete the task. If the user interface changes, there is no script to update; the agent works with whatever it finds on the screen.
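The difference can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the page is modeled as a dictionary of made-up element ids and labels, and the "agent" is reduced to simple label matching. A real computer use agent works from screenshots with an AI model, not from a lookup like this.

```python
# Toy page models: element id -> visible label (all ids are hypothetical).
page_v1 = {"btn-buy": "Buy tickets", "fld-count": "Number of tickets"}
# Same page after a redesign: the button id was renamed, the label stayed.
page_v2 = {"purchase-button": "Buy tickets", "fld-count": "Number of tickets"}


def rpa_click(page, element_id):
    """RPA style: the script names the exact element it expects to find."""
    if element_id not in page:
        raise LookupError(f"element {element_id!r} not found, script must be updated")
    return element_id


def agent_click(page, intent):
    """Agent style (greatly simplified): look for an element that matches the goal."""
    for element_id, label in page.items():
        if intent.lower() in label.lower():
            return element_id
    raise LookupError(f"nothing on the page matches {intent!r}")


if __name__ == "__main__":
    print(rpa_click(page_v1, "btn-buy"))         # works on the old page
    print(agent_click(page_v2, "buy tickets"))   # still finds the renamed button
```

The scripted call breaks as soon as the id changes, while the goal-based lookup only cares about what the element is for.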

The agent’s computer use is sometimes painfully slow, and with poor instructions it is not always able to complete the task. But the promise is so revolutionary (vs. traditional RPA) that the feature is worth a look.

Example – Agent placing orders from verkkokauppa.com

Let’s try this new tool in practice. Imagine we have a workflow that processes IT equipment orders from staff. Once approved, computers and phones follow their own automated paths to completion. However, smaller IT items still fall outside this process, so we need to figure out how to handle those.

After approval, they could be given to the agent to order. Let’s skip the initial stage of the process and focus on how to get the agent to order goods for us from the online store. Let’s create a new agent for that.

Add a tool to the agent.

There are more options behind the New tool button.

Select Computer use as the tool.

Next, the tool is given instructions on what to do. After that we proceed to the next step (Add and configure).

Name the tool and give it a description. The description is important because it helps the agent recognize when to use this tool.

You can also

  • Store passwords used by the tool (e.g. logging in to a service). In practice, a link is created to an Azure Key Vault record.
  • Limit which sites the tool can go to. With agents that are prone to hallucinating, this can be something you want to do.
  • Specify whether to use the agent’s maker or end-user provided credentials.
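In Copilot Studio the site restriction is a setting in the tool's configuration, not something you code yourself. To show the idea behind such a restriction, here is a minimal sketch of a host allowlist check. The function name and the `ALLOWED_HOSTS` set are assumptions for this example and say nothing about how the platform implements it.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for the example agent in this post.
ALLOWED_HOSTS = {"verkkokauppa.com"}


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the URL's host is an allowed host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


if __name__ == "__main__":
    for candidate in (
        "https://www.verkkokauppa.com/",
        "https://verkkokauppa.com.evil.example/",
        "https://example.org/",
    ):
        print(candidate, is_allowed(candidate))
```

Matching on the parsed host name, rather than on a substring of the whole URL, keeps look-alike addresses such as the second one out.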

It is easiest to use the hosted browser provided by the platform. With it, the agent can use only a web browser.

Let’s add another parameter to the tool (items to purchase). This way, the tool is always told which products it should purchase.

You might have noticed that we cheated a bit. We don’t ask the agent to buy the requested products. We only ask it to add them to the cart. I don’t want to generate a huge amount of expenses for myself because of this example.

This is a problem even in real implementations. How do we test that the final ordering step succeeds without actually ordering anything?

Does it work?

Save the computer use tool and test it.

Let’s ask the agent to order the following items for us:

  • two basic mice
  • 2m ethernet cable

It took exactly 4 minutes, but (this time) the end result is exactly what was wanted.

The attached video shows the agent’s work from start to finish, at a much faster speed. On the left side, the agent explains what it is doing/trying to do.

It’s fascinating to watch the agent’s progress. Sometimes it starts by searching for products, sometimes it navigates through product groups to the desired product. Sometimes it types a new search into the search field after the previous search text. For some reason, it wants to open the online store’s mega menu all the time, for example when it’s trying to add a product to the shopping cart.

Most of the time, it notices these errors and can correct its behavior. If a task can’t be done one way, the agent tries something else. Instructions can have a big impact on the agent’s behavior. Even small differences in wording are significant. It’s worth testing with several different inputs to see where the agent starts to fumble, and then trying to correct that with better instructions.
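Because the agent does not behave identically on every run, it helps to repeat each test input a few times and look at the success rate rather than a single result. The sketch below assumes a function `run_agent` that takes the input text and returns True on success. That function is hypothetical; here it is replaced by a stand-in, since this post does not cover calling the agent from code.

```python
def success_rates(run_agent, test_inputs, repeats=3):
    """Run each input several times and return the fraction of successful runs."""
    rates = {}
    for text in test_inputs:
        successes = sum(1 for _ in range(repeats) if run_agent(text))
        rates[text] = successes / repeats
    return rates


def fake_agent(text):
    """Stand-in for a real agent run: pretends that anything involving a date fails."""
    return "date" not in text


if __name__ == "__main__":
    inputs = ["two basic mice", "2m ethernet cable", "pick a delivery date"]
    for text, rate in success_rates(fake_agent, inputs).items():
        print(f"{rate:.0%}  {text}")
```

Inputs with a low success rate are the ones where the instructions need more work.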

Summary

Using a computer opens up completely new possibilities for agents. But using it is not yet problem-free.

  • Language models don’t work identically every time they run. Would you dare give an agent your credit card and let it make purchases for you? I asked the example agent to buy an iPhone and an ethernet cable. The end result was 4 iPhones and 13 (wrong) ethernet cables in the shopping cart.
  • At least for now, the agent cannot pause its computer use to ask the user for confirmation. For example, “I now have these products in my shopping cart. Do you want to continue?”
  • Selecting a day with the datepicker control still seems to be really difficult.
  • If the web service first wants to verify whether the user is human, the agent will politely stop there.
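One way to guard against a cart like the one with 4 iPhones and 13 cables is to compare what was requested with what ended up in the cart before anything is paid for. The following is a minimal sketch under the assumption that both the request and the cart contents are available as item-to-quantity mappings. How to get the cart contents out of the agent is not covered here, and the function name is made up for the example.

```python
from collections import Counter


def cart_mismatches(requested, cart):
    """Return {item: (requested_qty, cart_qty)} for every item whose quantities differ."""
    want, have = Counter(requested), Counter(cart)
    return {
        item: (want[item], have[item])
        for item in sorted(set(want) | set(have))
        if want[item] != have[item]
    }


if __name__ == "__main__":
    requested = {"iPhone": 1, "ethernet cable": 1}
    cart = {"iPhone": 4, "ethernet cable": 13}
    problems = cart_mismatches(requested, cart)
    if problems:
        print("Do not check out:", problems)
```

A check like this only compares names and quantities. It would not catch a cable of the wrong type that happens to be listed under the expected name.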

The feature is in preview and has been given very limited resources. This is a familiar sight.

Despite its shortcomings, this is a really interesting new feature.
