Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Workspace agent functions and prompt #14426

Merged
merged 4 commits into from
Nov 12, 2024
Merged

Improve Workspace agent functions and prompt #14426

merged 4 commits into from
Nov 12, 2024

Conversation

JonasHelming
Copy link
Contributor

related to #14361

What it does

  • Add new function GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION
  • Refactored GET_WORKSPACE_FILE_LIST_FUNCTION to only return the files in one directory
  • Improved function error handling
  • Improved Workspace prompt

How to test

Play with the workspace agent, e.g. let it search for a file or for an information.
Or let it create code based on a template, e.g.

"I want to integrate Anthropic as a LLM provider in the Theia IDE. There is already an integration of OpenAI that you can use as a template. It can be found in my workspace under ai-openai. there is another example for huggingface under ai-hugging-face Look at all files in these two examples and then generate all necessary files for me to add a new package "ai-anthropic" that uses the anthropic Typescript API (@anthropic-ai/sdk). Add "claude-3-5-sonnet-20241022" as an example model"

Review checklist

Reminder for reviewers

fixed #14361

Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
@JonasHelming JonasHelming requested a review from planger November 9, 2024 20:07
Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
Copy link
Contributor

@planger planger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for improving the workspace agent! If this agent is optimized, there is huge potential for being super useful!

I played around with your suggestion, but found a few cases, where it actually performs worse because of directing the response to just directories and omitting important files on the root level. So I'm not sure if this fits all use cases, unless we guide the LLM more in the system prompt to consider the root contents with a higher priority.

Before

It read the README or the package.json to give a concise and correct answer:

image

After

It actually looked into src/index.ts and guessed that this maybe is built with npm. It was right here, but just by accident (this may well have been a yarn or pnpm project).

image

Why?

As can be seen, it will never "see" the readme or the package.json which both are crucial to answer important questions with the new approach, while it directly went to the readme or the package.json in the previous approach.

import { FILE_CONTENT_FUNCTION_ID, GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID, GET_WORKSPACE_FILE_LIST_FUNCTION_ID } from '../common/functions';

function shouldExclude(stat: FileStat): boolean {
const excludedFolders = ['node_modules', 'lib'];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was there before, but it'd be better to at least put the function as a method of the tool, so adopters can at least customize this hard-coded list of folders to be excluded.
Ideally it should even be an injectable service, maybe with a default implementation that either provides a commonly useful list or looks into the gitignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe two settings:
consider .gitignore: boolean
ignore directories: String[]
plus an injectable service?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be great plus extra points :-)
To me it'd be important that platform adopters can easily customize the filter list via injection and that there are reasonable defaults for Theia IDE. Configuration options for the Theia IDE user is then extra nice on top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, did the services and added user settings as a follow-up (#14119)

throw new Error('Workspace root not found');
}

const workspaceRootUri = wsRoots[0].resource;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With just looking into the first workspace root, we don't really support multi-root workspaces. I think it'd be good to try supporting multi-root workspaces here. We could try to just make this explicit to the LLM:

{
  "<path-of-workspace-root[0]>": [ <file1>, ... ],
  "<path-of-workspace-root[1]>": [ <file1>, ... ],
}

And then in the FileContentFunction we'd need the as a parameter or request absolute paths again? Maybe that would work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer making "mutiple workspaces support" a follow-up, added it here: #14119

packages/ai-workspace-agent/src/browser/functions.ts Outdated Show resolved Hide resolved
packages/ai-workspace-agent/src/browser/functions.ts Outdated Show resolved Hide resolved
packages/ai-workspace-agent/src/browser/functions.ts Outdated Show resolved Hide resolved
@JonasHelming
Copy link
Contributor Author

Changes:

  • Adapt prompt to make "root" files more important
  • Refactored code to remove duplication
  • Make WorkspaceUtils a service including "shouldIgnore"
  • Pass "Workspace not found" error to the LLM

Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
Copy link
Contributor

@planger planger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the revision! Looks good to me and the previous use cases now work much better again.

Just one optional minor:
I'd rename WorkspaceUtils into something more specific to the workspace agent, e.g. WorkspaceAgentScope or something like that. Since this is a global key, a generic name may more likely clash or confuse at some point.

Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
@JonasHelming
Copy link
Contributor Author

@planger Great point, I renamed it to WorkspaceFunctionScope

@JonasHelming JonasHelming merged commit 1900a80 into master Nov 12, 2024
11 checks passed
@github-actions github-actions bot added this to the 1.56.0 milestone Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants