bug: IntelliJ plugin filled up disk with agent files #439

Closed
gabtorre opened this issue Jan 30, 2024 · 5 comments · Fixed by #566
Assignees
Labels
bug (Something isn't working), gtm-input

Comments

@gabtorre

Installation Information

Cody plugin version 5.2.20247

IntelliJ IDEA 2023.3.2 (Ultimate Edition)
Build #IU-233.13135.103, built on December 20, 2023
Runtime version: 17.0.9+7-b1087.9 aarch64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o.
macOS 14.3

Describe the bug

The plugin appeared to be stuck in a restart loop. On each iteration it extracted another large temporary copy of the agent binary, which led to excessive disk usage. The IDE log contained many pairs of entries like these:

2024-01-30 09:34:58,888 [86661021]   INFO - #CodyAgent - extracting Cody agent binary to /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent8534747379058237572
2024-01-30 09:34:58,920 [86661053]   INFO - #CodyAgent - starting Cody agent /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent8534747379058237572

To mitigate the issue, the user cleared all the files associated with the plugin, restarted macOS, and then re-enabled the plugin. After these actions, the plugin behaved correctly, with only two copies of the agent created.

It seems like the plugin may not clean up after itself properly, as indicated by the excessive number of temporary copies created during the loop.

Expected behavior

The plugin cleans up temporary files appropriately to prevent excessive disk usage.

Additional context (logs, images, etc)

Additional context: https://sourcegraph.slack.com/archives/C02842Z6YAK/p1706567669821759

2024-01-30 09:34:53,884 [86656017]   INFO - #CodyAgent - extracting Cody agent binary to /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent15487046631569367214
2024-01-30 09:34:53,918 [86656051]   INFO - #CodyAgent - starting Cody agent /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent15487046631569367214
2024-01-30 09:34:55,094 [86657227]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ getRgPath: Did not find bundled `rg`.
2024-01-30 09:34:55,094 [86657227]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ VSCodeSecretStorage:getAccessToken: failed 
2024-01-30 09:34:55,094 [86657227]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ CommandsController:fileWatcherInit: watchers created
2024-01-30 09:34:55,094 [86657227]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ AuthProvider:init:lastEndpoint: https://sourcegraph.com/
2024-01-30 09:34:55,103 [86657236]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ CodyIgnore:refresh:workspace: file:///Users/foo/bar/baz/.ijwb/
2024-01-30 09:34:55,604 [86657737]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ ChatManager:constructor: init no local embeddings
2024-01-30 09:34:55,604 [86657737]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ ChatPanelsManager:constructor: init
2024-01-30 09:34:55,604 [86657737]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ VSCodeSecretStorage:getAccessToken: failed 
2024-01-30 09:34:55,604 [86657737]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ vscode.commands.executeCommand: not found cody.chat.focus
2024-01-30 09:34:55,604 [86657737]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ CodyCompletionProvider:notSignedIn: You are not signed in.
2024-01-30 09:34:57,037 [86659170]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ ContextProvider:onConfigurationChange: using codebase github.com/foo/bar
2024-01-30 09:34:57,037 [86659170]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ CodyCompletionProvider:notSignedIn: You are not signed in.
2024-01-30 09:34:57,304 [86659437]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ Cody:publishConfig: configForWebview 
2024-01-30 09:34:57,304 [86659437]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ ContextProvider:onConfigurationChange: using codebase github.com/foo/bar
2024-01-30 09:34:57,305 [86659438]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ logEvent: CodyJetBrainsPlugin:Auth:failed JetBrains {"opts":{"agent":true}}
2024-01-30 09:34:57,305 [86659438]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ telemetry-v2: recordEvent: cody.auth/failed: {"parameters":{"version":0,"metadata":[{"key":"contextSelection","value":2},{"key":"guardrails","value":0}]},"timestamp":"2024-01-29T20:34:57.300Z"}
2024-01-30 09:34:57,305 [86659438]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ CodyCompletionProvider:notSignedIn: You are not signed in.
2024-01-30 09:34:57,327 [86659460]   INFO - #CodyAgent - Connected to Cody agent cody-agent
2024-01-30 09:34:57,565 [86659698]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ GraphQLTelemetryExporter: evaluated export mode: 5.2.2-5.2.3
2024-01-30 09:34:58,886 [86661019]   WARN - #com.sourcegraph.cody.agent.CodyAgentClient - Cody by Sourcegraph: █ ChatPanelsManager:createWebviewPanel: 0
2024-01-30 09:34:58,887 [86661020]   WARN - #CodyAgent - Cody Agent shut down
2024-01-30 09:34:58,888 [86661021]   INFO - #CodyAgent - extracting Cody agent binary to /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent8534747379058237572
2024-01-30 09:34:58,920 [86661053]   INFO - #CodyAgent - starting Cody agent /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent8534747379058237572
... repeats
2024-01-30 08:17:01,833 [81983966]   WARN - #CodyAgent - Unable to start Cody agent
com.sourcegraph.cody.agent.CodyAgentException: failed to create agent binary
Caused by: java.nio.file.FileSystemException: /Users/foo/Library/Application Support/JetBrains/IntelliJIdea2023.3/plugins/Sourcegraph/agent/agent-macos-arm64 -> /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T/cody-agent7707303340069661988: No space left on device
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
	at java.base/sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:256)
	at java.base/sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:603)
	at java.base/sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:257)
	at java.base/java.nio.file.Files.copy(Files.java:1305)
	at CodyAgent$Companion.agentBinary(CodyAgent.kt:180)
	at CodyAgent$Companion.startAgentProcess(CodyAgent.kt:86)
	at CodyAgent$Companion.create(CodyAgent.kt:57)
	at com.sourcegraph.cody.agent.CodyAgentService.startAgent$lambda$0(CodyAgentService.kt:37)
	at com.intellij.openapi.application.impl.ApplicationImpl$2.run(ApplicationImpl.java:249)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at com.intellij.util.concurrency.ContextCallable.call(ContextCallable.java:32)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at com.intellij.util.concurrency.ContextRunnable.run(ContextRunnable.java:27)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:702)
	at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:699)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
	at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(Executors.java:699)
	at java.base/java.lang.Thread.run(Thread.java:840)
gabtorre added the bug label Jan 30, 2024
@mkondratek
Contributor

Hello @gabtorre 👋 I'm sorry you ran into this issue. Thanks for the detailed report 💟

I understand that the plugin works correctly for you now (after the manual cleanup).

We've seen a similar issue recently: #315

Are you able to point out the most common message in the logs? Is it: WARN - #com.sourcegraph.cody.CodeFleEditorListener - server is null?

Hopefully, the newest version is not responsible for the verbose logging. We will revisit this issue and verify it.

@gabtorre
Author

Hey @mkondratek, in this case the problem was not log files. There were thousands of files named cody-agent7707303340069661988 or similar in /var/folders/z4/qq9brd3x7y9gxs52nc_3x6sm0000gn/T.
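
For anyone checking whether they are affected, the leftover copies can be counted with a short diagnostic. This is only a sketch in Java: `AgentTempUsage` and `measure` are names made up for illustration, and the `cody-agent` prefix is taken from the file names in the log above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical diagnostic helper; not part of the plugin. */
final class AgentTempUsage {
    /** Number of matching files and their combined size in bytes. */
    static final class Usage {
        final int files;
        final long bytes;

        Usage(int files, long bytes) {
            this.files = files;
            this.bytes = bytes;
        }
    }

    /** Counts regular files in tempDir whose names start with prefix. */
    static Usage measure(Path tempDir, String prefix) throws IOException {
        int files = 0;
        long bytes = 0;
        // The glob matches names such as cody-agent8534747379058237572.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tempDir, prefix + "*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    files++;
                    bytes += Files.size(p);
                }
            }
        }
        return new Usage(files, bytes);
    }
}
```

Calling `AgentTempUsage.measure(Path.of(System.getProperty("java.io.tmpdir")), "cody-agent")` should cover the directory the JVM uses for temporary files, which on macOS is typically the `/var/folders/.../T` location seen in the logs.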

@morgangauth

I just received another report of this happening at another customer. The Slack thread can be found here.

@olafurpg
Member

olafurpg commented Feb 8, 2024

Thank you for reporting! This is a serious issue that we should prioritize fixing ASAP. It is not a recent regression: this behavior has been latent in the plugin since September. The root problem is that we never delete the temporary file that's created here:

https://sourcegraph.com/github.com/sourcegraph/jetbrains@a5a1627c33cd4223ec28ffaeed393e87fd3de7e3/-/blob/src/main/kotlin/com/sourcegraph/cody/agent/CodyAgent.kt?L232

Adding logic to delete the file after the process finishes running should be an easy fix, so I think it's safe to estimate that this will be resolved before the February 15th launch.
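
A minimal sketch of the delete-after-exit idea described above, written in Java against JDK APIs (`Files.copy` appears in the stack trace; `Process.onExit()` is available since Java 9). It is not the code from the plugin or from the eventual fix: `AgentBinaryCleanup`, `extract`, and `deleteWhenDone` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper; illustrates cleanup tied to process exit. */
final class AgentBinaryCleanup {
    /** Copies the bundled binary to a fresh temp file and returns the copy. */
    static Path extract(Path bundled) throws IOException {
        Path copy = Files.createTempFile("cody-agent", "");
        try {
            Files.copy(bundled, copy, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial copy behind, e.g. when the disk is full.
            Files.deleteIfExists(copy);
            throw e;
        }
        return copy;
    }

    /** Deletes the temp copy once the process has exited, however it exits. */
    static CompletableFuture<Void> deleteWhenDone(Process process, Path copy) {
        return process.onExit().thenRun(() -> {
            try {
                Files.deleteIfExists(copy);
            } catch (IOException e) {
                // Assumed fallback: retry the delete when the JVM shuts down.
                copy.toFile().deleteOnExit();
            }
        });
    }
}
```

Returning the future lets a caller (or a test) wait until the file is actually gone. Because the hook is attached to process exit and not to IDE shutdown, a restart loop would leave at most one live copy per running process.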

olafurpg added a commit that referenced this issue Feb 8, 2024
Addresses #439

Previously, the JetBrains plugin created a new temporary file every time
we started a new agent process and only deleted this file when the IDE
shut down. This buggy behavior has existed since September and we
haven't received user complaints about it, likely because the plugin
was conservative about restarting the agent process. Recent changes
introduced retry logic where we could start a new agent process in a
loop, and never delete the underlying binary. Some users have reported
that this bug caused the plugin to take up unreasonable amounts of disk
space. I'm sorry for users who were impacted by this behavior.

This PR addresses the problem with a new callback hook that deletes the
agent binary after the agent process finishes running. This change means
that we never have more than one agent binary sitting on the disk per
IntelliJ project. This should prevent the catastrophic situation some
users have reported.

I manually tested this change by adding a println where the file gets
deleted, starting IntelliJ, triggering the "Restart Cody Agent" action, and
confirming that the file got deleted.
@olafurpg
Member

olafurpg commented Feb 9, 2024

PR #566 should fix the root issue, where the plugin restarts a new (crashing) process in a loop. It's 10pm here right now, so we will wait until Monday morning to make this fix available in a pre-release (a.k.a. the "nightly channel"). We will cut a new release (previously referred to as the "stable channel") no later than Tuesday.

pkukielka added a commit that referenced this issue Feb 12, 2024
Fixes #439

## Changes
* Removed the agent recovery loop; recovery is now on demand
* Added some defensive code in case of unexpected exceptions
* Added a few timeouts in places where we shouldn't wait forever
* Improved robustness of the shutdown
* Added an additional safeguard for deleting old agent binary files
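
The commit message does not say how the safeguard for old agent binaries works. One possible shape, sketched here in Java purely as an assumption, is a sweep that removes leftover `cody-agent*` files older than some threshold while sparing the copy currently in use. `StaleAgentSweeper`, `sweep`, and all of its parameters are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sweep of leftover agent copies; not the plugin's actual code. */
final class StaleAgentSweeper {
    /**
     * Deletes regular files in tempDir whose names start with prefix and whose
     * last-modified time is at least minAge in the past. The file given as
     * keep (may be null) is never deleted. Returns the number of files removed.
     */
    static int sweep(Path tempDir, String prefix, Path keep, Duration minAge) throws IOException {
        Instant cutoff = Instant.now().minus(minAge);
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tempDir, prefix + "*")) {
            for (Path p : stream) {
                if (p.equals(keep) || !Files.isRegularFile(p)) {
                    continue;
                }
                // Skip recent files: they may belong to another running IDE instance.
                if (Files.getLastModifiedTime(p).toInstant().isAfter(cutoff)) {
                    continue;
                }
                if (Files.deleteIfExists(p)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

The age threshold is a design choice for this sketch: without it, a sweep in one IDE window could delete a binary that another window extracted moments earlier.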

## Test plan
I added a thread that destroyed the agent process a fixed time after
startup.
I tested delays from 100ms to 10s.
Crashes after 200ms (which simulated complete agent failure) did not
spam the logs with too much info and did not leave trash on the
disk.
A delay of around 4-5s was enough for the chat to recover and give a really
short answer. It was able to recover an unlimited number of times, which
means we could survive even repeated crashes or hangs of the agent (if
that ever happens).
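
The crash injection used in this test plan can be approximated as follows. This is a sketch only: the test plan mentions a plain thread, whereas this version swaps in a `ScheduledExecutorService` for brevity, and `CrashInjector` and `killAfter` are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical test helper that simulates an agent crash. */
final class CrashInjector {
    /** Schedules a forced kill of the process after the given delay. */
    static ScheduledFuture<?> killAfter(Process process, Duration delay,
                                        ScheduledExecutorService scheduler) {
        // destroyForcibly() has no effect if the process has already exited.
        Runnable kill = process::destroyForcibly;
        return scheduler.schedule(kill, delay.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

Running it with delays between `Duration.ofMillis(100)` and `Duration.ofSeconds(10)` would mirror the range described above; checking the temp directory after each run shows whether any agent copies were left behind.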