Enable CEDARScript edit format #1961
I skimmed through your code and noticed:
I'll re-check those items and update the PR. Thank you!
ba5c50a to 42cffd7
requirements.txt
This file still needs to be regenerated to remove the rope dependency and what it pulls in.
I did regenerate it. rope is now coming from this:
rope==1.13.0
# via cedarscript-editor
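To trace which package pulls in a transitive dependency such as rope, the "# via" annotations in the compiled requirements file can be read programmatically. The following is a minimal sketch, assuming the file uses pip-compile style annotations (either "# via pkg" on one line, or "# via" followed by one "#   pkg" line per parent). The function name parse_via is hypothetical and is not part of Aider or pip-tools.

```python
import re

# Matches a pinned requirement such as "rope==1.13.0" or "pkg[extra]==1.0".
PIN = re.compile(r"^([A-Za-z0-9_.\-]+)(?:\[[^\]]*\])?==")


def parse_via(requirements_text):
    """Map each pinned package to the parents listed in its '# via' notes.

    Hypothetical helper: assumes pip-compile style annotations.
    """
    via = {}
    current = None   # package whose annotations we are currently reading
    in_via = False   # True once a "# via" marker was seen for `current`
    for raw in requirements_text.splitlines():
        line = raw.strip()
        match = PIN.match(line)
        if match:
            current = match.group(1).lower()
            via[current] = []
            in_via = False
        elif current is not None and line.startswith("#"):
            note = line.lstrip("#").strip()
            if note == "via" or note.startswith("via "):
                in_via = True
                note = note[3:].strip()
            elif not in_via:
                continue  # unrelated comment, not part of a "via" block
            if note:
                via[current].append(note)
        else:
            # Blank line or unrecognized line ends the current entry.
            current = None
            in_via = False
    return via


sample = "rope==1.13.0\n    # via cedarscript-editor\n"
print(parse_via(sample))  # {'rope': ['cedarscript-editor']}
```

Run against the full requirements.txt, this would show at a glance that rope is present only because cedarscript-editor requires it.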
Note: CEDARScript now uses Tree-Sitter instead of Rope to obtain the CST.
cache_prompts=True,
suggest_shell_commands=False,
)
with change_dir(testdir):
If we don't do this, the CEDARScript editor won't know which directory to consider as the base. We can remove all changes to this file, but then we won't be able to run benchmarks using the cedarscript-g edit format.
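For readers unfamiliar with the pattern, here is a minimal sketch of what a context manager like change_dir could look like. It is an assumption for illustration only; the actual helper used in this PR may be implemented differently.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def change_dir(path):
    """Temporarily switch the working directory, restoring it on exit.

    Illustrative sketch only, not necessarily the PR's implementation.
    """
    previous = Path.cwd()
    os.chdir(path)
    try:
        # Code in the `with` body (e.g. an editor resolving relative
        # paths) now sees `path` as its base directory.
        yield Path.cwd()
    finally:
        # Always restore the original directory, even on error.
        os.chdir(previous)
```

Wrapping the benchmark run in such a block means relative file paths are resolved against the test directory, and the previous working directory is restored even if the run raises.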
Should I drop the changes to this file and create a separate PR? Maybe that's better. This main PR won't be able to run benchmarks using cedarscript-g, though.
I think this is fine, since
- the benchmark does not touch regular usage
- it is more or less part of the new coder
but @paul-gauthier should make the call of course.
It's great that you are trying out new code editing prompts. It's a fun area to experiment. But it seems premature to be proposing a PR for this, for a couple of reasons:
Agreed. Will try to get data for more models and the main code editing benchmark too.
8ab7a06 to 4aa8890
Now, just like Aider itself, I'm also using tree-sitter-languages, and I have set up tree-sitter queries for Python, Kotlin, PHP, Rust, Go, C++, C, Java, JavaScript, Lua, FORTRAN, Scala and C#.
Got new refactoring benchmark results for Gemini 1.5 PRO & Flash. However, for the main editing benchmark, Gemini 1.5 Flash shows poor performance:
pass_rate_4: 39.1
percent_cases_well_formed: 69.9
Gemini 1.5 PRO:
pass_rate_1: 42.1
percent_cases_well_formed: 85.7
I still need to improve the prompts to get it right.
6d03727 to 00c60cd
We now have benchmark results for:
Editing Benchmarks:
Refactoring Benchmarks:
While the main editing benchmark results show degraded performance, the fact that the refactoring benchmark shows significantly better results means there's something worth investigating here. Of course, it's still possible to test it without merging. I'll try to provide detailed instructions for those who want to try out CEDARScript without the merge.
I was curious to try it, but ran into some issues:
EDIT: commit 9e2a73d was able to run when installed via:
run command:
I think the options have changed - try using
Thanks for posting your install process by the way - unfortunately there is an issue for me on Manjaro Linux, not related to the code I think.
I'm working on a fix, guys!
Note: you don't have to install this:
pip install \
  git+https://github.com/CEDARScript/cedarscript-integration-aider
since this PR already depends on it.
Fixed. Testing on an empty venv succeeded:
pip install --upgrade --force-reinstall \
  git+https://github.com/elifarley/aider@cedarscript \
  aider-chat
Aider v0.61.1.dev462+gab333361
Please check the updated installation instructions.
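One way to confirm that an installed build comes from the PR branch is to read the commit hash embedded in the version string. The sketch below assumes the version follows the setuptools-scm development pattern (base version, ".devN" commit distance, then "+g" followed by the abbreviated commit). The helper name parse_dev_version is hypothetical.

```python
import re

# Assumed pattern: <base>.dev<distance>+g<abbreviated commit hash>
DEV_VERSION = re.compile(
    r"v?(?P<base>\d+(?:\.\d+)*)\.dev(?P<distance>\d+)\+g(?P<commit>[0-9a-f]+)"
)


def parse_dev_version(text):
    """Extract base version, commit distance and commit hash, or None.

    Hypothetical helper: only understands the pattern assumed above.
    """
    match = DEV_VERSION.search(text)
    if match is None:
        return None
    return {
        "base": match.group("base"),
        "distance": int(match.group("distance")),
        "commit": match.group("commit"),
    }


info = parse_dev_version("Aider v0.61.1.dev462+gab333361")
print(info["commit"])  # ab333361
```

The extracted hash can then be compared against the head commit of the branch that was installed.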
ab33336 to ae93ecd
5806faf to 8cb0484
# Conflicts:
#   requirements.txt
8c1d70b to ddd0b0f
Intro
This PR adds cedarscript as an optional edit format, best suited for refactoring larger files. When used by the editor LLM in architect/editor mode, it should work well with files of any size.
How to install Aider+CEDARScript from this PR
Further details:
Refactoring Benchmarks
Gemini 1.5 Flash
Notice that Sonnet 3.5 is still a tad better at pass_rate, but loses to Gemini Flash in percent_cases_well_formed:
Sonnet 3.5:
Gemini 1.5 Flash:
Comparison: Gemini 1.5 Flash using whole vs CEDARScript
Performance Highlights:
Gemini 1.5 PRO
Comparison: Gemini 1.5 PRO using diff-fenced vs CEDARScript
Performance Highlights:
Success Distribution:
OpenAI GPT-4o
Comparison: Gemini 1.5 PRO vs GPT-4o (both using CEDARScript)
DeepSeek
Comparison: Gemini 1.5 PRO vs DeepSeek
Editing Benchmarks
Gemini 1.5 PRO
Comparison: Gemini 1.5 PRO using whole vs CEDARScript
Gemini 1.5 Flash
Comparison: Gemini 1.5 Flash using whole vs CEDARScript
Haiku 3.5
Comparison: Gemini 1.5 PRO using whole vs Haiku 3.5 using CEDARScript