adding autoedits support #5845
Conversation
vscode/src/configuration.test.ts
experimentalSupercompletions: false,
experimentalAutoedits: false,
Do you plan to use the existing `supercompletions` code in your work? If not, it would be great to remove it.
I think we can keep supercompletions for now, since some of that code might be usable, and this PR is just to iterate quickly on the model quality checks. The actual integration would anyway be with the existing autocomplete code like you suggested, so in the future we might remove both of these functionalities.
const response = await axios.post(
    'https://api.openai.com/v1/chat/completions',
Can we do it through the existing infrastructure using the Sourcegraph instance and Cody Gateway? We should have all the tools required to make these requests without introducing a separate HTTP client and API key management on the client.
I think the goal of this PR is to quickly check the model performance, and it will mostly live internally behind a feature flag. Any model we fine-tune and want to experiment with would need to be added to the Cody Gateway allow list for the client to actually work, and that would slow down the quick iterations.
When we integrate the autoedits, we should do it properly through Cody Gateway etc., but I think for now it makes sense to keep it as is and spend the time instead on model improvements and making the feature work with the smaller models :)
We can use the server-side model configuration for quick iterations. I just added the `ft:gpt-4o-mini-2024-07-18:sourcegraph-production::AFXNjNiC` model to the sg04 instance. Usage example:
curl 'https://sg04.sourcegraphcloud.com/.api/completions/stream?api-version=4&client-name=defaultclient&client-version=6.0.0' -H 'Content-Type: application/json' -H "Authorization: token $SRC_ACCESS_TOKEN" --data-raw '{
"maxTokensToSample": 4000,
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "what is in this file?"
},
{
"type": "file",
"file": {
"uri": "https://github.com/sourcegraph/cody/-/blob/src/index.ts", "content": "function helloworld() { console.log(`hello world`)"
}
}
]
}
],
"model": "openai::2024-02-01::gpt-4o-mini-autoedit",
"stream": false
}'
This should allow us to migrate to the existing infra and avoid adding additional tech debt to the codebase.
@@ -284,7 +284,7 @@ describe('ChatController', () => {
        })
    })

    test('send error', async () => {
    test.only('send error', async () => {
leftover
removed
(will continue the review tomorrow)
// ################################################################################################################

// Some common prompt instructions
const SYSTEM_PROMPT = ps`You are an intelligent programmer named CodyBot. You are an expert at coding. Your goal is to help your colleague finish a code change.`
Should we update it to match our preamble so that it'd work with Cody gateway later?
This makes sense. Can you please add a link to the preamble we use? I will try that offline :)
${lintErrorsPrompt}
${recentCopyPrompt}
${codeToRewritePrompt}
${FINAL_USER_PROMPT}
Super cool!!!
From the way that LLM focus works, wouldn't we want to keep the CURRENT_FILE_INSTRUCTION at the very end or at the start? Or is that irrelevant because of the relatively smaller context here?
This makes sense, Ara. I think this ordering is something that worked offline, but I have yet to do a proper evaluation of the prompt rendering. Also, since this is just an experimental feature for now, we can keep it as is; based on the offline experiments and results, I can update the ordering :)
Thanks @valerybugakov @abeatrix @arafatkatze for the review
bd36253 to 9a01733
Hi @valerybugakov
Hey Hitesh! I reviewed only part of the updated PR today. I will complete it on Monday morning.
Great work! Excited for v0 🎉
The only significant comment is related to using the existing pipeline for network requests. It should simplify the code, make it easier for teammates to dogfood this feature (no manual API key management), and ensure we don't spend time ripping it out later.
private strategyFactory: ContextStrategyFactory,
dataCollectionEnabled = false
Let's convert it to an object so we have `new ContextMixer({ strategyFactory, dataCollectionEnabled: false })` instead of `new ContextMixer(strategyFactory, false)`, for readability.
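For reference, a minimal sketch of what the options-object constructor could look like (the field names mirror the diff above; `ContextStrategyFactory` is the existing type from the codebase, and the exact class shape is an assumption):

```ts
interface ContextMixerOptions {
    strategyFactory: ContextStrategyFactory
    dataCollectionEnabled?: boolean
}

export class ContextMixer {
    private strategyFactory: ContextStrategyFactory
    private dataCollectionEnabled: boolean

    constructor({ strategyFactory, dataCollectionEnabled = false }: ContextMixerOptions) {
        this.strategyFactory = strategyFactory
        this.dataCollectionEnabled = dataCollectionEnabled
    }
}

// Call sites then read clearly:
// const mixer = new ContextMixer({ strategyFactory, dataCollectionEnabled: false })
```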
done
function convertTokensToChars(tokens: number) {
    return tokens * 4
}
Can we use `tokensToChars` from `@sourcegraph/cody-shared`?
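A small sketch of the suggested swap (assuming `tokensToChars` in `@sourcegraph/cody-shared` applies the same 4-characters-per-token heuristic as the local helper; `maxTokens` is just an illustrative variable):

```ts
import { tokensToChars } from '@sourcegraph/cody-shared'

// Replaces the local convertTokensToChars helper shown above.
const maxTokens = 2048
const maxChars = tokensToChars(maxTokens)
```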
done
    getModelResponse(model: string, apiKey: string, prompt: PromptProviderResponse): Promise<string>
}
export class OpenAIPromptProvider implements PromptProvider {
Could you extract `OpenAIPromptProvider` and `DeepSeekPromptProvider` into separate files?
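A rough sketch of the suggested split (file names are assumptions, and only the `getModelResponse` signature appears in the diff above, so everything else is illustrative; imports are abbreviated):

```ts
// prompt-provider.ts (shared contract; PromptProviderResponse already exists in the current code)
import type { PromptProviderResponse } from './prompt-provider-response'

export interface PromptProvider {
    getModelResponse(model: string, apiKey: string, prompt: PromptProviderResponse): Promise<string>
}

// openai-prompt-provider.ts
import type { PromptProvider } from './prompt-provider'

export class OpenAIPromptProvider implements PromptProvider {
    async getModelResponse(model: string, apiKey: string, prompt: PromptProviderResponse): Promise<string> {
        // OpenAI-specific request building and response parsing goes here.
        throw new Error('sketch only')
    }
}

// deepseek-prompt-provider.ts would mirror this for DeepSeekPromptProvider.
```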
done
return new RecentEditsRetriever(10 * 60 * 1000)
return new RecentEditsRetriever({
    maxAgeMs: 10 * 60 * 1000,
})
💜
Just tested the latest changes and noticed a formatting issue where the code is inserted at an unexpected position.
It works as expected when triggered in-line, though, so this might be an edge case that can be looked into in follow-ups.
Not a blocker: it might be helpful if we could limit the change block to a smaller section, since sometimes it starts at the function level when the changes are only needed for a variable.
Since the feature is behind a feature flag and works as expected in most cases, approving to unblock after comments from @valerybugakov are addressed 😄
this.recentEditsRetriever = new RecentEditsRetriever(EDIT_HISTORY, workspace)
this.recentEditsRetriever = new RecentEditsRetriever(
    {
        maxAgeMs: EDIT_HISTORY,
maxAgeMs: EDIT_HISTORY,
maxAgeMs: EDIT_HISTORY_TIMEOUT,
nit: clearer naming that matches `SUPERCOMPLETION_TIMEOUT`
done
}

private logDiff(uri: vscode.Uri, codeToRewrite: string, predictedText: string, prediction: string) {
    const predictedCodeXML = `<code>\n${predictedText}\n</code>`
Not sure if this would be an issue, but previously when we used `<code>` as the XML tag, the LLM would sometimes misunderstand it as part of the code if the code is HTML.
This is just for logging to the console; we are not pushing it to the LLM in the prompt.
But this is a really nice insight, thanks for that :)
Thanks for the review @abeatrix @valerybugakov
Context
The PR enables a basic setup for `auto-edits`. The goal of the PR is for us to check the model quality by manually triggering the `auto-edits` and having a vibe check. The actual integration of next edits should be done with the existing autocomplete infra, so this PR is supposed to live behind a feature flag just for internal testing and model iterations. Model used: `ft:gpt-4o-mini-2024-07-18:sourcegraph-production::AFXNjNiC`.
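To try it locally, the feature has to be enabled in the VS Code settings. A minimal sketch, assuming the setting key follows the usual `cody.experimental.*` pattern that the `experimentalAutoedits` field in `configuration.test.ts` maps to:

```json
{
  "cody.experimental.autoedits": true
}
```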
Steps to run the autoedits (in debug mode):
1. Press `ctrl+shift+enter` to trigger the auto-edits.
2. `Cody by Sourcegraph` will show the diff in the console.
3. The suggestion is shown as `ghost text`, and we can press `tab` to apply the changes and `escape` to reject the changes.

Test plan
Updated CI checks and manual testing. Please see the demo below.
How.to.run.the.autoedits.mp4