Skip to content

Multithread processing source files #243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,9 @@ as well as the following command line options:
* `-i/--in-place`: Overwrites the input files when formatting instead of
printing the results to standard output.

* `-p/--parallel`: Process files in parallel, simultaneously across
multiple cores.

* `-r/--recursive`: If specified, then the tool will process `.swift` source
files in any directories listed on the command line and their descendants.
Without this flag, it is an error to list a directory on the command line.
Expand Down
43 changes: 28 additions & 15 deletions Sources/swift-format/Frontend/Frontend.swift
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ class Frontend {
if paths.isEmpty {
processStandardInput()
} else {
processPaths(paths)
processPaths(paths, parallel: lintFormatOptions.parallel)
}
}

Expand Down Expand Up @@ -124,28 +124,41 @@ class Frontend {
}

/// Processes source content from a list of files and/or directories provided as paths.
private func processPaths(_ paths: [String]) {
private func processPaths(_ paths: [String], parallel: Bool) {
precondition(
!paths.isEmpty,
"processPaths(_:) should only be called when paths is non-empty.")

for path in FileIterator(paths: paths) {
guard let sourceFile = FileHandle(forReadingAtPath: path) else {
diagnosticEngine.diagnose(Diagnostic.Message(.error, "Unable to open \(path)"))
continue
if parallel {
let allFilePaths = Array(FileIterator(paths: paths))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rambling: I still have some concerns about the upfront flattening of the FileIterator being slow if the formatter is run in parallel on something like a networked file system or in recursive mode on the root of a large monorepo (who would do such a thing), but I should probably test my concerns against real data before going to the trouble of adding more work.

If we need to change this in the future, I was concerned about having to do our own thread pooling, but maybe there's an easy alternative here: instead of flattening the whole iterator at once, just pull off a both sufficiently large and sufficiently small number of files at a time (say, 200, or a nice round number like 256) and loop over the batches while processing each batch with concurrentPerform. That would probably still get us good parallelism in the batches while avoiding eagerly flattening the whole array.

You don't need to make this change now (I might be worried about the performance for nothing), but I just wanted to write it out here so I don't forget it later if we do need it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good thoughts, I would definitely be interested to hear if you see issues with this given the large number of files.

DispatchQueue.concurrentPerform(iterations: allFilePaths.count) { index in
let path = allFilePaths[index]
openAndProcess(path)
}

guard let configuration = configuration(
atPath: lintFormatOptions.configurationPath, orInferredFromSwiftFileAtPath: path)
else {
// Already diagnosed in the called method.
continue
} else {
for path in FileIterator(paths: paths) {
openAndProcess(path)
}
}
}

/// Read and process the given path, optionally synchronizing diagnostic output.
private func openAndProcess(_ path: String) -> Void {
guard let sourceFile = FileHandle(forReadingAtPath: path) else {
diagnosticEngine.diagnose(Diagnostic.Message(.error, "Unable to open \(path)"))
return
}

let fileToProcess = FileToProcess(
fileHandle: sourceFile, path: path, configuration: configuration)
processFile(fileToProcess)
guard let configuration = configuration(
atPath: lintFormatOptions.configurationPath, orInferredFromSwiftFileAtPath: path)
else {
// Already diagnosed in the called method.
return
}

let fileToProcess = FileToProcess(
fileHandle: sourceFile, path: path, configuration: configuration)
processFile(fileToProcess)
}

/// Returns the configuration that applies to the given `.swift` source file, when an explicit
Expand Down
6 changes: 6 additions & 0 deletions Sources/swift-format/Subcommands/LintFormatOptions.swift
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,12 @@ struct LintFormatOptions: ParsableArguments {
""")
var ignoreUnparsableFiles: Bool = false

/// Whether or not to run the formatter/linter in parallel.
@Flag(
name: .shortAndLong,
help: "Process files in parallel, simultaneously across multiple cores.")
var parallel: Bool = false

/// The list of paths to Swift source files that should be formatted or linted.
@Argument(help: "Zero or more input filenames.")
var paths: [String] = []
Expand Down