-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Running heavy filesystem operations in separate processes #899
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like it would be a good idea in general to run watching processes in separated threads. Could you generalize the forwarding so that we can reuse it more easily for the git watcher, too?
this.proxy.onDidOpenConnection(() => this.reconnect()); | ||
const onInitialized = this.proxy.onDidOpenConnection(() => { | ||
onInitialized.dispose(); | ||
this.proxy.onDidOpenConnection(() => this.reconnect()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the backend is restarted then the frontend should reconnect to it.
If the chokidar watcher process is restarted then the backend should reconnect to it.
Here it skips the first connection and on following connections does reconnection. We used to have duplicate Started watching
because the first connection was not skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks. A comment in code could help me next time I need to understand it :)
84cfb48
to
6e36d97
Compare
Watching for git is cheap, locating of repositories is expensive. I've extracted the common ipc json-rpc connection provider and used it for filesystem watching and running repositories look up in separate processes. Also, repositories look up now consists of 2 phases:
|
Please review. @hexa00 Could you test that the app is now responsive even with a user home as a workspace root? |
I'll take care of the Windows verification. |
@kittaakos if you can have a look that there are no dangling processes on the server shutdown on windows, it would be nice. |
6e36d97
to
8b9d5be
Compare
silent: true, | ||
env: { | ||
...process.env | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just asking to squeeze some additional knowledge from you: it is the same as env: process.env
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it copies properties to a new object, later 2 more properties added to the new object but not to process.env
29e17e5
to
a8389fa
Compare
if (repository && repository2) { | ||
return repository.localUri === repository2.localUri; | ||
} | ||
return repository === repository2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is undefined === undefined
. Maybe the first arg doesn't need to be optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a client which can pass both undefined
: https://github.com/theia-ide/theia/blob/a8389fa95a6132ef6af13c0c0f7de9f5846ca05b/packages/git/src/browser/git-repository-provider.ts#L73
* - If no repositories are available, leaves the selected one as `undefined`. | ||
* - If the previously selected one, does not exist among the most recently discovered one, selects the first one. | ||
* - This method blocks, if the workspace root is not yet set. | ||
*/ | ||
async refresh(): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it a bit simpler to have a limit
option. Which we set to 1
only when initially called from constructor and later always fetch all repos (replacing everything). I think we wouldn't need all the logic below nor the notion of a containingRepository
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should be returned if there is no containing repo, the first contained repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I think the logic will be needed anyway, otherwise it will flicker from one repository to several
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should just return the first n results and then stop. Why would it flicker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shows several repositories, you press refresh, it shows one repository, then it shows several repositories again.
The UI is responsive !:) I get these errors in the browser console however:
Strangly I get much more errors like this in the backend but these find their way to the frontend |
However once the nodejs memory limit is reached the other process is at 100% cpu for minutes now... But that's another problem I guess.. still it's a very good step forward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll continue to review the code tomorrow, I'm out of time today
} | ||
|
||
protected readonly restarts: number[] = []; | ||
shouldRestart(): string | undefined { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this should be shouldRestart ( { reason: string }): boolean ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I'm thinking in C too much and it should be shouldResart() : { success: boolean, reason: string }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed with logging in the error handler.
@hexa00 I think they are from our filesystem, not from the watcher. Could you file an issue that the filesystem should handle EACCES please? |
Ideas what we can do about it? Should we have timeout and stop the operation? Like if watching cannot be started during 5mins? |
a8389fa
to
a29b806
Compare
We've discussed to limit deep of git repositories look up and time out the startup of file watching. |
I think best would be a max number of watches, if we can determine the memory taken by a watch we could limit it so that we don't use >70% of the available node memory for that? @simark since you've checked this before would you have an idea on those numbres (memory used per watch ? / max number of watch to allow) |
Actually I think watching a directory can be fast it's just when the memory limit is reached everything is slow in nodejs as it tries to free memory to allocate memory all the time. Feels like the equivalent of a system using it's swap. We should guard against that in general in theia I think otherwise just opening a big file in the editor may have the same result. |
I don't get it, there is only one watcher for the workspace root per a window.
It takes 10s to start up for Theia, i meant if it cannot start up for example for 1 min then timeout the process.
It is different contexts, Monaco can handle big files, e.g. average of the file in TS repo 20 LOC. I would like to go with the simple approach now and if it does not work well, reconsider it later. |
There are also watchers for preferences and package.json files, but they are cheap. We can optimize by watching single files by one watcher or watching the same repo by one watcher for all clients. But I would do it in separate PRs as we experience issues. |
I've removed a constraint by depth on git and used https://github.com/implausible/find-git-repositories. It does not burn CPU time and I need about 3 mins to discover all git repositories on my machine (250G of data). |
e29262b
to
1330067
Compare
Signed-off-by: Anton Kosiakov <anton.kosyakov@typefox.io>
Signed-off-by: Anton Kosiakov <anton.kosyakov@typefox.io>
separate process Signed-off-by: Anton Kosiakov <anton.kosyakov@typefox.io>
b42dfaa
to
2bd7958
Compare
I am trying it out on Windows now. Please note, I have only two cores for the image, so that will be a real stress-test. |
|
I have tried out Update: it does not happen if I use |
import { ChokidarFileSystemWatcherServer } from './chokidar-filesystem-watcher'; | ||
import { NsfwFileSystemWatcherServer } from './nsfw-watcher/nsfw-filesystem-watcher'; | ||
|
||
// tslint:disable:no-unused-expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you have already cleaned up the test setup, perhaps, you could get rid of chai-as-promised
too, and use async
, can you? (First, let's have green tests on the CI.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up: #928
2bd7958
to
5f7a52f
Compare
There are some quirks with nsfw, like it cannot watch symlinked directories, but it is going to be fixed and pushed by vscode developers. I would like to go with nsfw anyway, it is quite an improvement compared to chokidar. |
Signed-off-by: Anton Kosiakov <anton.kosyakov@typefox.io>
…nig by depth Signed-off-by: Anton Kosiakov <anton.kosyakov@typefox.io>
5f7a52f
to
50274d5
Compare
Builds are green! Objections to merge? @svenefftinge I've changed quite a lot since your last review. |
Good with me. Only detail could we document somewhere that the symlinks don't work ? |
Or report it to the user ? |
I will wait till someone hits it. There are also other issues, you can look at https://github.com/Axosoft/nsfw/issues. It is big effort trying to prevent all of them while Theia is not used extensively. if we will often stumble on them I am fine to start thinking about it. Also, I am not sure that we can detect it since all important code is native, not js. |
OK fine with me with luck it will be fixed by the time it's really an issue. |
No description provided.