[Feature]: Add Copilot Provider #518

Grant-Archibald-MS · 2025-01-08T17:27:10Z

Is your feature request related to a problem? Please describe.

Testing is often an afterthought and is not considered an integral part of the operational work needed to ensure successful operation of the deployed solution.

Given the existing functionality and foundations of Test Engine for Power Apps, provide an integrated solution for generating tests for Copilot Studio agents.

This feature should allow integration into a pipeline deployment model to validate the expected operation of the Copilot. The tests should give different stakeholders an objective view of the health of the system, so that they can deploy new agents or update existing agents with confidence in the continued operation of the Copilot.

Describe the solution you'd like

Extend the Test Engine provider model to provide the ability to execute tests against a Microsoft Copilot Studio agent.

Learning Path

Provide a structured learning path with interactive learning modules that help makers understand Power Fx so that they can build test cases.

This could build on the early example of browser-based execution of Power Fx test steps, presented in an extended learning module format with an interactive learning path.

For example, the Power Apps Test Engine technical learning path includes a module on Asserting Results. This module demonstrates the ability to execute interactive test steps in Power Fx directly inside the learning module, which helps learners master concepts without having to context switch to a different web page.

Record and Replay

Getting started can be hard. To address this blocker, this feature could extend the existing Record and Replay so that the user can interact with the test Copilot and automatically generate Power Fx specific to interacting with the agent in Copilot Studio.

This approach could extend examples like Recording Your First Test and enable easy Test Authoring.

Test Authoring

To make authoring more scalable, whether the test steps were created via Record and Replay or written from learned concepts, a web-based option would reduce the number of components that need to be installed.

This could extend the current Visual Studio Code editor experience by offering an interactive web-based authoring experience as a deployed Power App.

The application should allow editing and syntax validation of test cases, and saving the test cases to a location that can be integrated with the Power Platform Deployment Pipeline.

Deployment and Test Execution

Provide examples of how to extend the Power Platform Pipeline so that test cases are executed and included in the deployment results. The process should conclude with a gated approval step in which changes and test results are reviewed before the Copilot is deployed.

Test Case Features

The provider should be able to:

Log in using user credentials
Test published and unpublished Copilots
Monitor API calls to create Power Fx variables that can be asserted using simple low code. This could include:
- Messages Sent / Received
- Attachments
- Knowledge sources used
- Steps triggered
- AI Response (Text Summary, Speech Summary, Text Citations (Collection - Title, Type, Position))
Provide new Power Fx functions to interact with Copilot using Test Engine Power Fx extensions specific to Copilot concepts. For example:

Send("Some user text");
AttachFile("file name");
WaitUntilResponseComplete();
Assert(CountRows(Filter(Steps, Name = "UniversalSearchTool")) > 0, "Universal search tool triggered");
Assert(CountRows(Filter(KnowledgeSources, "docx" in Name)) > 0, "Microsoft Word Data Sources Found");
Set(Greeting, First(Response));
AIBuilderMatch(Greeting.Text,"The greeting should be fun and engaging and provide helpful tips to get started");
Select(Last(Response).Action1); // Select Action 1 button from Adaptive Card

Describe alternatives you've considered

The Power CAT Copilot Studio Kit includes functionality to create and run test cases, including:

Define Test Sets and Test Cases
Support Different Test Types
- Response Match: Test Utterance, Expected Response
- Topic Match: Test Utterance, Expected Topic Match, Expected Response
- Attachments (Adaptive Cards): Test Utterance, Expected Json, Expected Response
- Generative Answer: Test Utterance, Expected Generative Answer Outcome
Enrich tests with Application Insights Queries
Enrich tests with Conversation Transcripts (Activity, Channels, Ambiguous Utterances, Attachment, Intent (Candidates, Scores), Session Details, Citations, Results...)
Analyze Generated answers by running AI Builder Analysis
Expected Values (Attachments, Responses, Topics...)
Interact with Direct Line API

Additional context?

The simplicity of the Power CAT Copilot Studio Kit Test Sets and Test Cases provides a great method to handle test execution. Key discussion points:

What is the path to first-party execution of Copilot tests as part of pipeline deployments?
Can Record and Replay mitigate the getting started problem?
Do interactive learning modules directly in the browser, with simulated Copilot conversations, make it easier to demonstrate how testing can be done early in the process?
Does leveraging the rich feature set of Power Fx that already exists today make the process of creating E2E tests much easier?
What specific extensions would be needed to effectively support Copilot test scenarios?
Where is the overlap with the testing features of the Copilot Studio Kit, and how should the complementary nature of the test cases be handled?

Grant-Archibald-MS · 2025-01-09T01:43:47Z

Combining this with interactive simulation and test data setup could also help in the authoring process.

Simulation	Editor
Message from chat	WaitForMessage();
>>>>> Response from user	Send("Response from User");
From follow text	AICheck(First(Message).Text, "The tone of the message should be happy and upbeat...");

Grant-Archibald-MS · 2025-01-21T03:23:08Z

To aid in the design of the Copilot test steps for the provider, started work on [Feature]: Getting Started Guide #513.

Converted the content from the PowerfulDev Testing site to docs as code and published it as the Power Apps Test Engine GitHub Pages hosted documentation.

The aim is to first look at different Power Fx language functions using the Power Fx Playground, with sample Power Fx that could be implemented to cover different testing scenarios, including:

Waiting for a message
Sending text input from the user
Simulating conditional response based on keyword match
Simulation of Messages
Validating an Adaptive Card
Validating Knowledge Sources
Validating matched Intents
Meaning Match using an AI Prompt

Grant-Archibald-MS · 2025-01-21T03:57:40Z

Create a first sample test case for Microsoft Copilot Studio, using the Experimental namespace provided for the Microsoft Copilot Studio test component, for a created agent.

The sample will:

Log in with a user account to the configured environment and agent
Use the test panel of the Portal
Allow a message to be sent
Wait for a message to be received that matches an expected response text

The initial sample Power Fx:

Experimental.WaitUntilMessage("Safe Travels Agent");
Assert(CountRows(Copilot.Messages) > 1);
Experimental.SendText("Where can I fly today?");
// Wait for response
Experimental.WaitUntilMessage("Is there anything else I can help you with");

This example:

Waits for the greeting
Sends a message
Waits for the final message

Next Steps to consider:

Playground Examples. Create Learning Playground samples to discuss the approach and concepts
Extensions to WaitForMessage. For example:
- Literal for text match
- Fuzzy Wait that evaluates responses and looks for a high probability match
- AI Assisted wait that evaluates a prompt to determine whether a match is found
- Meaning Match that evaluates the responses against the expected response and determines whether the meaning matches
Global Asserts. How to test for common patterns across all responses:
- Ability to handle questions it does not know about
- Tone of the message and how well it aligns with the "personality" of the agent
How to provide a range of different inputs / synonyms and ensure a matching Intent
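
The difference between the literal and fuzzy wait ideas above can be sketched in a few lines. This is illustrative Python rather than Test Engine or Power Fx code: the names `literal_match`, `fuzzy_match`, and `wait_for_message`, and the 0.8 threshold, are assumptions made for the example, and `difflib.SequenceMatcher` stands in for whatever similarity measure the provider might eventually use.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional


def literal_match(expected: str, message: str) -> bool:
    """Literal strategy: the expected text must appear in the message."""
    return expected.lower() in message.lower()


def fuzzy_match(expected: str, message: str, threshold: float = 0.8) -> bool:
    """Fuzzy strategy: accept messages whose similarity ratio meets the threshold."""
    ratio = SequenceMatcher(None, expected.lower(), message.lower()).ratio()
    return ratio >= threshold


def wait_for_message(
    messages: Iterable[str],
    expected: str,
    matcher: Callable[[str, str], bool] = literal_match,
) -> Optional[str]:
    """Return the first message accepted by the matcher, or None if none match."""
    for message in messages:
        if matcher(expected, message):
            return message
    return None


# Hypothetical transcript, reusing the wording of the sample above.
transcript = [
    "Hello, I am the Safe Travels Agent.",
    "Is there anything else I can help you with?",
]
```

With this transcript, an expected text that drops a word ("Is there anything else I can help with?") fails the literal strategy but passes the fuzzy one, which is the kind of tolerance a Fuzzy Wait would need. An AI Assisted or Meaning Match wait could plug in as another `matcher` without changing the loop.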

Grant-Archibald-MS · 2025-01-30T01:00:23Z

Extend the Copilot provider to also support DirectLine as an alternative provider to the Copilot, considering the following:

Method of obtaining the Direct Line secret
Ability to make use of the Direct Line SDK
The differences in enrichment of other context data, as the DirectLine API returns only the conversation and not the agent state. For example:

Knowledge Sources used
Actions used
Variables

What enrichment services would be required for data stored in Dataverse and Application Insights
Validation of responses using AI Builder
Testable interface using xUnit tests
Testability of DirectLine components with an interface similar to:

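using System.Threading.Tasks;

// Wrapper over the Direct Line client so that it can be replaced with a fake in xUnit tests.
// The model types (TokenResponse, Conversation, ResourceResponse, ActivitySet, Activity)
// are assumed to come from the Direct Line SDK or equivalent definitions.
public interface IDirectLineClientWrapper
{
    Task<TokenResponse> GenerateTokenForNewConversationAsync();
    Task<Conversation> StartConversationAsync();
    Task<ResourceResponse> PostActivityAsync(string conversationId, Activity activity);
    Task<ActivitySet> GetActivitiesAsync(string conversationId);
}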
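
The point of a wrapper like this is that tests can swap the real Direct Line client for an in-memory fake. A rough sketch of that idea follows, written in Python only to keep it short and runnable (the interface above is C#). Every name here (`Activity`, `DirectLineClient`, `FakeDirectLineClient`, `ask`) is hypothetical, the token method is left out, and the canned reply is an assumption for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Activity:
    """Hypothetical minimal activity: who sent it and the text."""
    sender: str
    text: str


class DirectLineClient(Protocol):
    """Hypothetical Python counterpart of the wrapper interface."""
    def start_conversation(self) -> str: ...
    def post_activity(self, conversation_id: str, activity: Activity) -> None: ...
    def get_activities(self, conversation_id: str) -> List[Activity]: ...


@dataclass
class FakeDirectLineClient:
    """In-memory fake that answers every user message with a canned reply."""
    reply: str = "Is there anything else I can help you with?"
    conversations: Dict[str, List[Activity]] = field(default_factory=dict)

    def start_conversation(self) -> str:
        conversation_id = f"conv-{len(self.conversations) + 1}"
        self.conversations[conversation_id] = []
        return conversation_id

    def post_activity(self, conversation_id: str, activity: Activity) -> None:
        log = self.conversations[conversation_id]
        log.append(activity)
        log.append(Activity(sender="bot", text=self.reply))

    def get_activities(self, conversation_id: str) -> List[Activity]:
        return list(self.conversations[conversation_id])


def ask(client: DirectLineClient, question: str) -> str:
    """Code under test: send one question and return the last bot reply."""
    conversation_id = client.start_conversation()
    client.post_activity(conversation_id, Activity(sender="user", text=question))
    replies = [a.text for a in client.get_activities(conversation_id) if a.sender == "bot"]
    return replies[-1]
```

Because `ask` depends only on the protocol, a unit test can pass in the fake and assert on the reply without a Direct Line secret or network access. The same shape would apply to an xUnit test against `IDirectLineClientWrapper` with a mocked implementation.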

Grant-Archibald-MS added the enhancement New feature or request label Jan 8, 2025

Grant-Archibald-MS self-assigned this Jan 8, 2025

Grant-Archibald-MS added this to Power Apps Test Engine Jan 8, 2025
