-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Workspace agent functions and prompt #14426
Conversation
fixed #14361 Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for improving the workspace agent! If this agent is optimized, there is huge potential for being super useful!
I played around with your suggestion, but found a few cases, where it actually performs worse because of directing the response to just directories and omitting important files on the root level. So I'm not sure if this fits all use cases, unless we guide the LLM more in the system prompt to consider the root contents with a higher priority.
Before
It read the README or the package.json to give a concise and correct answer:
After
It actually looked into src/index.ts
and guessed that this maybe is built with npm
. It was right here, but just by accident (this may well have been a yarn or pnpm project).
Why?
As can be seen, it will never "see" the readme or the package.json which both are crucial to answer important questions with the new approach, while it directly went to the readme or the package.json in the previous approach.
import { FILE_CONTENT_FUNCTION_ID, GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID, GET_WORKSPACE_FILE_LIST_FUNCTION_ID } from '../common/functions'; | ||
|
||
function shouldExclude(stat: FileStat): boolean { | ||
const excludedFolders = ['node_modules', 'lib']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this was there before, but it'd be better to at least put the function as a method of the tool, so adopters can at least customize this hard-coded list of folders to be excluded.
Ideally it should even be an injectable service, maybe with a default implementation that either provides a commonly useful list or looks into the gitignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe two settings:
consider .gitignore: boolean
ignore directories: String[]
plus an injectable service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would be great plus extra points :-)
To me it'd be important that platform adopters can easily customize the filter list via injection and that there are reasonable defaults for Theia IDE. Configuration options for the Theia IDE user is then extra nice on top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, did the services and added user settings as a follow-up (#14119)
throw new Error('Workspace root not found'); | ||
} | ||
|
||
const workspaceRootUri = wsRoots[0].resource; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With just looking into the first workspace root, we don't really support multi-root workspaces. I think it'd be good to try supporting multi-root workspaces here. We could try to just make this explicit to the LLM:
{
"<path-of-workspace-root[0]>": [ <file1>, ... ],
"<path-of-workspace-root[1]>": [ <file1>, ... ],
}
And then in the FileContentFunction
we'd need the as a parameter or request absolute paths again? Maybe that would work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer making "mutiple workspaces support" a follow-up, added it here: #14119
Changes:
|
Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
d8ca0f2
to
589e416
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for the revision! Looks good to me and the previous use cases now work much better again.
Just one optional minor:
I'd rename WorkspaceUtils
into something more specific to the workspace agent, e.g. WorkspaceAgentScope
or something like that. Since this is a global key, a generic name may more likely clash or confuse at some point.
Signed-off-by: Jonas Helming <jhelming@eclipsesource.com>
@planger Great point, I renamed it to WorkspaceFunctionScope |
related to #14361
What it does
How to test
Play with the workspace agent, e.g. let it search for a file or for an information.
Or let it create code based on a template, e.g.
"I want to integrate Anthropic as a LLM provider in the Theia IDE. There is already an integration of OpenAI that you can use as a template. It can be found in my workspace under ai-openai. there is another example for huggingface under ai-hugging-face Look at all files in these two examples and then generate all necessary files for me to add a new package "ai-anthropic" that uses the anthropic Typescript API (@anthropic-ai/sdk). Add "claude-3-5-sonnet-20241022" as an example model"
Review checklist
Reminder for reviewers