Copilot Studio and data retrieval from Dataverse

Posted on: 2025-07-31

I often describe the Power Platform as a giant LEGO building set. As Copilot Studio has evolved, the number of building blocks has only increased tremendously. So much so that even the weakest of us start to feel dizzy.

This time we’ll explore the different capabilities of Copilot Studio, or rather those LEGO bricks. Using these bricks, you can accomplish (roughly) the same thing in a bewildering number of different ways.

Example – An agent who knows about cars

Let’s build a simple agent that can query basic information about cars and their owners. The information is located in two tables in Dataverse

Car owners (Contact, Dataverse standard table)
Cars

There are a over 5000 cars in the database. Some have the brand listed, but most do not. Each car’s information includes the owner of the car (lookup to the contact table), as well as the mileages.

We want to build an agent that can answer basic questions about the cars we have stored. In style

How many cars are there?
How many Toyotas are there?
How many cars does Timo Pertilä own?

What different implementation options do we have to build an agent that leverages the data lying in the Dataverse?

Option 1 – Knowledge

Let’s start with the most obvious one. Let’s add Dataverse as a knowledge source for the agent.

Let’s pick the necessary tables. This time the cars table will be enough.

Ready! Let’s test the agent with a few simple questions. The total number of cars will come out immediately.

Somewhat surprisingly, the agent is also able to answer questions about car owners, even though we didn’t add a contact table with the owners’ information (such as first and last name) as a knowledge source.

The agent answers various questions nicely.

However, it cannot determine how many different car brands are in the database. This may be because I have lazily stored the car brand in the name column.

Let’s continue our journey.

Option 2 – Using a knowledge source in a topic

Let’s create a topic for the agent to use . First, we describe to the agent when this topic will be used (1, whenever the user asks about cars or their owners). Next, we add a generative response as the action and define the data source it uses as the knowledge source we added earlier (2). The input is the question entered by the user (variable Activity.Text, 3)

Now when we ask the agent about cars, it uses this topic to answer. The answer is correct.

The agent also continues to work when asked about car owners.

Why would we want to leverage a knowledge source within a topic? Using a topic gives us a significant amount of control. We can add our own parameters to the question in addition to the user’s question. We can query multiple places and form an answer based on the answers we get. Using a topic opens up all sorts of possibilities.

If we use only a knowledge, we have very little control.

Option 3 – Connector

Next, let’s try an agent with the Dataverse connector added as a tool.

Let’s move on to configuring the connector (Add and configure).

Then we write a better description for the connector so the agent knows how to use it.

We define the connector to retrieve data from the cars table (you can also let it decide which table to retrieve data from at any time). Finally, the connector generates the response itself (Write the response with generative AI).

When testing the agent, we immediately encounter a problem. The response returned by Dataverse is so large that our agent cannot handle it. There were just over 5,000 cars.

Dataverse’s List rows function is not suitable for use if there are a lot of rows. It can be used with a small number of rows. Below, it is used to search for customers starting with the letter t in my mini CRM.

The connector is better suited for retrieving limited information that supports a decision, for example machine models available to a new employee, than for free querying extensive data like the example.

Option 4 – Agent flow

What if we added a flow to our agent that retrieves all cars and their information from Dataverse?

Let’s create a flow.

Flow lists all cars (List rows action). Only the columns of interest are extracted from the result set (Select action) and returned to the agent. If the final result were this time of a suitable size for the agent.

Let’s give the flow a better description for the agent.

After the Flow is executed, a response is generated using generative artificial intelligence.

We run into the same problem as with the connector: the agent is unable to process the long response it receives.

I tried limiting the number of rows returned by the flow. The agent can handle 1000 rows. 1500 is already too many.

Of course, we expected this outcome. It’s basically the same procedure as when using a connector.

You can also use the connector and flow within a topic, but that doesn’t solve this row count problem, so let’s move on.

Option 5 – Custom prompt

What if we asked for information about cars using our own prompt?

Let’s create one.

The prompt receives the user’s question as a parameter and tries to find the answer based on the content of the cars table and the related contact information table.

Let’s try it. The response comes, but the prompt has only used 30 lines from Dataverse in its response.

The reason is obvious. There is a setting in the prompt that defines how many rows are retrieved from Dataverse. The default value is 30 (the maximum is 1000).

This is understandable. The length of the text fed to the language model is directly proportional to the costs it generates. Otherwise, there is no point in using a custom prompt for this type of search.

You can also use your own prompts in the agent within topics , within a flow, or, for example, within a flow started from a topic.

Option 6 – Dataverse MCP server

What if we added a Dataverse MCP server to our agent? Based on recent hype, it will solve all our problems.

Super easy to add!

You can see what all the MCP server can do in the tools section.

Let’s test it. And that’s where the correct answer comes from. It will be interesting to see what kind of SQL query the question eventually becomes.

Car owner information is also available.

We hit a wall with the third question. The correct SQL query is generated, but Copilot Studio reports that I have used generative orchestration enough for now.

After waiting a few minutes, I can continue testing my agent. In practice, I ran into these limits. Specifically, my development environment’s 10 requests per minute limit.

Our agent uses the MCP server to create SQL queries to Dataverse. The queries cover the entire table, but the MCP server always returns a maximum of 20 rows as a result set.

The agent is asked to list all cars. The end result is 20 rows out of 5006.

Summary

As we saw, there are several ways to read Dataverse data from Copilot Studio. However, there are significant differences in how they work. The knowledge source utilizes Dataverse search. Other methods retrieve records directly from the Dataverse table(s). This immediately introduces limitations, as we do not want (the platform does not want) to return thousands of rows of raw data for the language model to digest, as processing them is expensive. This quickly leads to the 1000 row limit.

Using the Dataverse MCP server is different. The agent does this by using SQL queries to Dataverse. It may make several attempts until it succeeds. For example, the agent can first ask for the names of the columns in the table and based on this, determine what kind of query to run this time.

When using the MCP server, queries cover all records, but the return value is always a maximum of 20 rows.

The most unclear thing about using MCP servers is the cost. But I don’t think that will be resolved soon.

Copilot Studio and data retrieval from Dataverse

Example – An agent who knows about cars

Option 1 – Knowledge

Option 2 – Using a knowledge source in a topic

Option 3 – Connector

Option 4 – Agent flow

Option 5 – Custom prompt

Option 6 – Dataverse MCP server

Summary

Related insights

Leave a Reply Cancel reply