Measure performance of indexer(s) #1056

Open
9 tasks
radeksimko opened this issue Aug 30, 2022 · 0 comments
Labels
enhancement (New feature or request), performance (Gotta go fast)

Background

The LS (re)indexes files at various stages, as necessary per LSP:

  • initialize
  • textDocument/didOpen
  • textDocument/didChange
  • textDocument/didChangeWatchedFiles
  • workspace/didChangeWorkspaceFolders

The first one in particular is likely the most impactful on resource usage, as it is likely to index a deeper hierarchy with many modules. The others are more likely to work within a much more limited scope of one or a few modules.

We currently run benchmarks of various modules in CI:

- name: Run benchmarks
  id: bench
  run: |
    go test ./internal/langserver/handlers \
      -bench=InitializeFolder_basic \
      -run=^# \
      -benchtime=60s \
      -timeout=60m | tee ${{ runner.temp }}/benchmarks.txt

This, however, only tells us how some particular publicly available modules perform. Users may or may not use these modules, and they may use any number and combination of other modules. To understand the real impact on end users, we need some way of measuring performance at runtime (as opposed to in isolation).

Measuring any aggregate performance impact (e.g. by logging the total time spent indexing via initialize) is currently not possible, as indexing happens asynchronously. This includes both the filesystem walking and the actual indexing jobs on discovered modules.
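To illustrate the general shape of the problem, here is a minimal sketch of timing asynchronous work without blocking the caller. measureAsync is a hypothetical helper, not part of the language server; it only shows that the elapsed time becomes known once every background job has finished, which is exactly what a handler that returns early cannot observe on its own.

```go
package main

import (
	"sync"
	"time"
)

// measureAsync is a hypothetical helper (not part of the language server).
// It starts every job in its own goroutine and returns immediately.
// The returned channel receives the total elapsed time once all jobs finish.
func measureAsync(jobs []func()) <-chan time.Duration {
	start := time.Now()

	var wg sync.WaitGroup
	for _, job := range jobs {
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			run()
		}(job)
	}

	// Buffered so the waiting goroutine never blocks if nobody reads the result.
	result := make(chan time.Duration, 1)
	go func() {
		wg.Wait()
		result <- time.Since(start)
	}()
	return result
}
```

A caller would keep the returned channel and log the value it receives, rather than logging a duration at the point where the jobs were merely scheduled.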

We have a mechanism for tracking job IDs in the walker:

ids, err := w.walkFunc(ctx, dir)
if err != nil {
	w.collectError(fmt.Errorf("walkFunc: %w", err))
}
w.collectJobIds(ids)

but this isn't suitable for runtime (yet) - it's only used in tests. In order to use it at runtime we'd need to track those IDs per walked path.
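As a rough sketch of what "tracking those IDs per walked path" could look like, assuming a simple mutex-protected map: JobID and jobTracker below are hypothetical stand-ins for illustration and do not reflect the actual job.IDs or walker types.

```go
package main

import "sync"

// JobID is a hypothetical stand-in for the language server's job ID type.
type JobID string

// jobTracker is a hypothetical helper that records job IDs per walked path.
type jobTracker struct {
	mu     sync.Mutex
	byPath map[string][]JobID
}

func newJobTracker() *jobTracker {
	return &jobTracker{byPath: make(map[string][]JobID)}
}

// collect appends the IDs produced by walking a single path.
func (t *jobTracker) collect(path string, ids []JobID) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.byPath[path] = append(t.byPath[path], ids...)
}

// idsFor returns a copy of the IDs recorded for the given path,
// so callers cannot mutate the tracker's internal state.
func (t *jobTracker) idsFor(path string) []JobID {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make([]JobID, len(t.byPath[path]))
	copy(out, t.byPath[path])
	return out
}
```

Keying by path (rather than keeping one flat list, as the test-only collector does) is what would let a caller later ask only for the jobs that belong to a particular directory.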

Proposal

  • state: Introduce new PathState: PathStateWalked for WalkerPath and reflect it in all methods
    const (
    	PathStateQueued PathState = iota
    	PathStateWalking
    )
  • state: Introduce new field to WalkerPath - JobIds job.IDs to track job IDs
  • state: Introduce new field to WalkerPath - RootDir document.Dir to track the original root dir
  • state: Introduce (*WalkerPathStore).WaitForRootDirWalked(dir) (job.IDs, error) to wait until all paths with the given RootDir are walked and return job IDs to wait for
  • state: Introduce new argument to (*WalkerPathStore).EnqueueDir(dir, cleanupMode) - cleanupMode CleanupMode - to allow delaying removal of a walked dir until it has been processed by WaitForDir() or WaitForRootDirWalked()
    • state: Reflect cleanup mode in (*WalkerPathStore).waitForDir()
    • state: Reflect cleanup mode in (*WalkerPathStore).WaitForRootDirWalked()
    • walker: Reflect cleanup mode - i.e. avoid removing paths if cleanup mode is CleanupAfterWait
  • langserver/handlers: Launch a goroutine as part of the initialize handler to call WaitForDir(), wait for the returned jobs, and then log the time elapsed between enqueuing and the end of the wait
type CleanupMode uint

const (
	CleanupAfterWalk CleanupMode = iota
	CleanupAfterWait
)