Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement agent lifecycle management #303

Merged
merged 7 commits into from
Jan 17, 2024

Conversation

pkukielka
Copy link
Contributor

@pkukielka pkukielka commented Jan 15, 2024

Test plan

This PR will require extensive testing of all existing features.
We need to go through whole https://github.com/sourcegraph/jetbrains/blob/main/TESTING.md
Also ideally use this changes locally for some time to try cach all kind of timing/lifecycle bugs.

Changes

Agent lifecycle is now managed fully by CodyAgentService.

Classes using agent DO NOT try to validate it state, restart, wait few seconds before doing some action, etc.
They also not do any assumptions about agent state anymore.
All they know/get is a CompletableFuture which will contain the agent when it's ready.

If agent will loose connection or crash in some other way it should be restarted automatically by CodyAgentService.

Other than that lifetime management and accesability of several objects components (like CodyAgentCodebase or server object) were changed.

Also, maybe not a huge win but:
30 files changed, 576 insertions(+), 659 deletions(-)
:)

@@ -1,20 +0,0 @@
package com.sourcegraph.config;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not used in any place.

@pkukielka pkukielka force-pushed the pkukielka/reimplement-agent-lifecycle-management branch from 379d969 to 6a2ff15 Compare January 15, 2024 14:52
@exigow exigow requested a review from a team January 15, 2024 15:00
CodyAgent.getInitializedServer(source.getProject())
// The timeout has been increased from 3 to 12.
// This more like workaround than a fix to:
// https://github.com/sourcegraph/jetbrains/issues/169
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we can say that this PR actually fixes: #169

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a good chance for that.
Fas there any deterministic reproduction for the issue above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid there was not 😕 I could have been related to the network

if (ConfigUtil.isCodyEnabled() && agentServer != null) {
agentServer.configurationDidChange(ConfigUtil.getAgentConfiguration(project))
CodyAgentService.applyAgentOnBackgroundThread(project) { agent ->
if (ConfigUtil.isCodyEnabled()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so previously depending on this boolean we were starting or stopping the agent. Is the behavior different now? so is the agent always started?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is something I take liberty to remove but we have to test it. Agent is always started (or if not yet CodyAgentService.applyAgentOnBackgroundThread will wait for it to be).
I think starting/stoping should take place when cody settings change, as that is where you can enable/disable cody (and we have proper code for that in settings).

But if someone see a reason why that code was needed there I will be happy to hear it.

.thenCompose { agent ->
agent.server.evaluateFeatureFlag(GetFeatureFlag("CodyProJetBrains"))
}
// TODO We should avoid using getNow if possible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have an issue for this one? or does it make sense to just add timeout(1, SECONDS)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout does not change anything there, because there is no quarantee if it will take 0, 2 or 10 sec.
I will create an issue.

val tooltip = embeddingStatus.getTooltip(project)
if (tooltip.isNotEmpty()) {
embeddingStatusContent.setToolTipText(tooltip)
ApplicationManager.getApplication().invokeLater {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see all the logic is wrapped in invokeLater. Shouldn't be a caller's responsibility to call it on a proper thread? We could then mark it as @RequiresEdt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixing feelings about that, let's discuss on the next standup.
I think this can go as it is for now.

import javax.annotation.concurrent.GuardedBy

@Service(Service.Level.APP)
class CodyAgentService : Disposable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful 😍

// Agent cannot gracefully recover when connection is lost, we need to restart it
logger.warn("Failed to load new chat, restarting agent", e)
CodyAgentService.getInstance(project).restartAgent(project)
Thread.sleep(5000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it needed to sleep here? startAgent has it's 15s timeout already ⌚

Copy link
Contributor Author

@pkukielka pkukielka Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid constant heavier utilisation of the CPU in case of the lack of the network.
In fact it would be better to have some listener which would say if network is back, I will check if we can do that. I will also move that code to agent service.

if (server == null) {
setRecipesPanelError()
}
CodyAgentService.applyAgentOnBackgroundThread(project) { agent ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can simplify the code here even more. Let's stick to the rule that it is the caller's responsibility to use a proper thread, hence:

  1. remove CodyAgentService.applyAgentOnBackgroundThread(project) { ... } wrapping here
  2. remove ApplicationManager.getApplication().executeOnPooledThread { ... } wrapping above (line 225, in refreshRecipes)
  3. revamp loadCommands so it takes agent as a param
  4. wrap loadCommands(agent) into CodyAgentService.applyAgentOnBackgroundThread(project) { ... } in line 225, refeshRecipes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I agree that we can simplify this code a bit
  2. I do not agree with Let's stick to the rule that it is the caller's responsibility to use a proper thread as a general rule, it can lead to too many errors/omissions. That is, unless we explicitly wrap methods bory in something like CompletableFuture, but with a distinct subtypes for EDT and Background (which I think could work nicely but it's a bit additional work and I'm not sure if it's currently worth it).
  3. I'm happy to discuss this on the standup or a dedicated call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, feel free to do it your way

if (isFirstMessage.compareAndSet(false, true)) {
val contextMessages =
agentChatMessage.contextFiles
?.stream()
Copy link
Contributor

@exigow exigow Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove stream + collect

private constructor(
val client: CodyAgentClient,
val server: CodyAgentServer,
val launcher: Launcher<CodyAgentServer>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lanucher is unused - can we remove that?


fun initializeRepoName() {
onRepositoryNameChange(RepoUtil.findRepositoryName(project, null))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should be placed below class fields.

@@ -9,19 +11,18 @@ import com.sourcegraph.common.ProjectFileUtils
import com.sourcegraph.config.ConfigUtil
import com.sourcegraph.vcs.RepoUtil

class CodyAgentCodebase(private val underlying: CodyAgentServer, val project: Project) {
@Service(Service.Level.PROJECT)
class CodyAgentCodebase(val project: Project) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Project should be private.

Copy link
Contributor

@exigow exigow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍.

I've added some minor comments and I'm looking forward for Cody commit bump to start testing. I understand the idea of these changes, especially after your presentation. I like this CodyAgentService which really simplifies usages (as confirmed by the net number of lines) and and forces us to think more about threading.

Copy link
Contributor

@Sa1to Sa1to left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉 FYI rebased #183 on this branch

@pkukielka pkukielka merged commit 77e5c2f into main Jan 17, 2024
2 checks passed
@pkukielka pkukielka deleted the pkukielka/reimplement-agent-lifecycle-management branch January 17, 2024 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants