Faster tokens resolving in syntaxmap #2916

Merged
7 commits merged into realm:master from feature/faster-tokens-resolving on Nov 6, 2019

Conversation

@PaulTaykalo (Collaborator) commented Oct 23, 2019

When the syntax map is asked which tokens intersect a given byte range, the previous solution searched linearly for the first matching index and then filtered everything after that index with an intersection check.

The updated solution finds the first token index using binary search.

Speedup

Although only one function was updated, the following options were considered (a sketch of the [lin+filter] and [bin+prefix] variants follows this list):

  • [lin+filter] old solution
  • [lin+prefix] old solution with prefix(while:) instead of filtering
  • [bin+filter] binary search with filter
  • [bin+prefix] binary search with prefix(while:) instead of filtering
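
For illustration, here is a minimal sketch of the [lin+filter] and [bin+prefix] variants, using a simplified stand-in token type rather than SwiftLint's actual SyntaxToken (names and shapes below are illustrative, not the PR's code):

```swift
import Foundation

// Simplified stand-in for a syntax token, for this sketch only.
struct Token {
    let offset: Int
    let length: Int
}

// True when the token's byte range overlaps the requested range.
private func intersects(_ token: Token, _ range: NSRange) -> Bool {
    return NSIntersectionRange(NSRange(location: token.offset, length: token.length), range).length > 0
}

// [lin+filter]: linear scan for the first intersecting token, then filter the rest.
func tokensLinearFilter(_ tokens: [Token], in byteRange: NSRange) -> [Token] {
    guard let first = tokens.firstIndex(where: { intersects($0, byteRange) }) else { return [] }
    return tokens[first...].filter { intersects($0, byteRange) }
}

// [bin+prefix]: tokens are sorted by offset, so binary-search for the first token
// that does not end before the range, then take tokens only while they still start
// before the end of the range.
func tokensBinaryPrefix(_ tokens: [Token], in byteRange: NSRange) -> [Token] {
    var low = tokens.startIndex
    var high = tokens.endIndex
    while low < high {
        let mid = (low + high) / 2
        if tokens[mid].offset + tokens[mid].length <= byteRange.location {
            low = mid + 1
        } else {
            high = mid
        }
    }
    return tokens[low...]
        .prefix(while: { $0.offset < byteRange.location + byteRange.length })
        .filter { intersects($0, byteRange) }
}
```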

The speedup depends heavily on file size: the bigger and longer the files, the bigger the win.

Benchmark

Kickstarter

| lin+filter | lin+prefix | bin+filter | bin+prefix | speedup |
| --- | --- | --- | --- | --- |
| 0.494 | 0.243 | 0.390 | *0.117* | ~4x |

Swift

| lin+filter | lin+prefix | bin+filter | bin+prefix | speedup |
| --- | --- | --- | --- | --- |
| 1.739 | 0.740 | 1.273 | *0.103* | ~16x |

WordPress

| lin+filter | lin+prefix | bin+filter | bin+prefix | speedup |
| --- | --- | --- | --- | --- |
| 1.270 | 0.526 | 0.918 | 0.148 | ~8x |

Testing code

The timings were collected with the following code (in a Release build):

import Foundation

fileprivate var counter = 0
fileprivate var times: [String: Double] = [:]
fileprivate let timesQueue = DispatchQueue(label: "benchmarks")

// Accumulates the wall-clock time spent in `block` under `name` and
// periodically prints the running totals.
fileprivate func timeLog<T>(_ name: String, block: () -> T) -> T {
    let start = DispatchTime.now()
    let result = block()
    let end = DispatchTime.now()
    timesQueue.async {
        let oldValue = times[name, default: 0.0]
        let diff = TimeInterval(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000
        times[name] = oldValue + diff
        counter += 1
        if counter % 1000 * times.count == 0 {
            print("!!!!: \(times)")
        }
    }
    return result
}

internal func tokens(inByteRange byteRange: NSRange) -> [SyntaxToken] {
    // Time both implementations on every call and return one at random,
    // so neither call can be optimized away.
    let new = timeLog("new") { tokensFast(inByteRange: byteRange) }
    let old = timeLog("old") { tokensOld(inByteRange: byteRange) }
    return arc4random() % 2 == 1 ? new : old
}

@PaulTaykalo (Collaborator, Author) commented:

Wow. It seems that the lazy prefix/suffix combination behaves strangely in Swift 5.0/Xcode 10.
I'll drill into it tomorrow.

@PaulTaykalo force-pushed the feature/faster-tokens-resolving branch from 3162937 to d36994b on October 23, 2019 22:11
@PaulTaykalo (Collaborator, Author) commented Oct 23, 2019

@jpsim any ideas why this could fail on Swift 5.0? :(
I'll check this later.

@PaulTaykalo force-pushed the feature/faster-tokens-resolving branch from d36994b to d40b7d9 on October 23, 2019 22:34
@SwiftLintBot commented Oct 23, 2019

1 Warning
⚠️ This PR may need tests.
12 Messages
📖 Linting Aerial with this PR took 2.24s vs 2.28s on master (1% faster)
📖 Linting Alamofire with this PR took 3.98s vs 3.95s on master (0% slower)
📖 Linting Firefox with this PR took 12.74s vs 12.73s on master (0% slower)
📖 Linting Kickstarter with this PR took 29.47s vs 29.61s on master (0% faster)
📖 Linting Moya with this PR took 2.69s vs 2.59s on master (3% slower)
📖 Linting Nimble with this PR took 2.63s vs 2.68s on master (1% faster)
📖 Linting Quick with this PR took 1.09s vs 1.09s on master (0% slower)
📖 Linting Realm with this PR took 3.91s vs 3.89s on master (0% slower)
📖 Linting SourceKitten with this PR took 1.76s vs 1.74s on master (1% slower)
📖 Linting Sourcery with this PR took 4.65s vs 4.6s on master (1% slower)
📖 Linting Swift with this PR took 20.75s vs 21.04s on master (1% faster)
📖 Linting WordPress with this PR took 32.6s vs 32.91s on master (0% faster)

Generated by 🚫 Danger

@PaulTaykalo force-pushed the feature/faster-tokens-resolving branch from 364cbdf to 743abea on October 24, 2019 20:26
@PaulTaykalo (Collaborator, Author) commented Oct 24, 2019

@marcelofabri updated. It works without even checking the Swift version.

WordPress

["new": 0.5954486219999982, "old": 2.635693253000027]

Swift

["new": 0.475975328999993, "old": 3.679700348999993]

@jpsim (Collaborator) commented Oct 24, 2019

Thanks again for your continued investments in SwiftLint performance @PaulTaykalo.

However, I want to caution against over-optimizing parts of SwiftLint's runtime that already have a negligible impact on total lint time.

In this case we're significantly increasing the complexity of this code, but because finding token indices is such a small part of the total work spent during linting, there ends up being no measurable improvement on linting overall.

Here's what I measured before & after this change on a large project (~450k lines of Swift):

$ hyperfine -w 1 'swiftlint-faster-tokens-resolving lint --no-cache --quiet' 'swiftlint-master lint --no-cache --quiet'
Benchmark #1: swiftlint-faster-tokens-resolving lint --no-cache --quiet
  Time (mean ± σ):     99.046 s ±  3.707 s    [User: 361.074 s, System: 17.021 s]
  Range (min … max):   93.193 s … 103.159 s    10 runs
 
Benchmark #2: swiftlint-master lint --no-cache --quiet
  Time (mean ± σ):     93.638 s ±  3.545 s    [User: 342.293 s, System: 15.224 s]
  Range (min … max):   90.103 s … 101.004 s    10 runs
 
Summary
  'swiftlint-master lint --no-cache --quiet' ran
    1.06 ± 0.06 times faster than 'swiftlint-faster-tokens-resolving lint --no-cache --quiet'

Maybe if we extracted the binary search logic into an extension on Collection somewhere else, it could simplify the resulting code while also making it more performant? I'd be more open to accepting the change then, even if the impact on performance would be the same.


I don't want this to discourage your interest in making SwiftLint faster though! I'd recommend profiling complete lint runs to get a better sense of performance hotspots and optimize those for the best value.

@PaulTaykalo (Collaborator, Author) commented:

It's not that I'm not doing that :)
I'll check where it can be used in the project.

Technically, this part looks the same as the previous solution, but it uses firstIndexUsingBinarySearch instead of firstIndex. It's almost the same as func indexAssumingSorted.

Would it work if I put the binary search part in an extension, with comments, and made it more abstract?


Most of the time is spent in the regexes.
The second-largest share of the time is spent on Dictionary access.
I'm fully aware of the bottlenecks. While this speedup isn't visible in the overall lint time, it's there.
The variance of the file accesses and SourceKit requests is huge.
P.S. I'd suggest measuring this function's running time in both versions, as in the PR description, and comparing them (like I did :). I'm just curious what the timings are in your case.


There's an IdenticalOperandsRule on the way, without too many regexes. Not sure if it will be faster though.

@jpsim (Collaborator) commented Oct 24, 2019

Thanks for the reply.

Will it work if I put the binary search part in the extension with comments and make it more abstract?

Yes, specifically I think a reusable binary search function on Collection would be pretty helpful.
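
For reference, a reusable binary search on Collection could look roughly like the sketch below; the function name and exact shape are illustrative assumptions, not necessarily the code that was merged in this PR:

```swift
extension Collection {
    /// Returns the first index whose element satisfies `predicate`, assuming the
    /// collection is partitioned: every element failing the predicate comes before
    /// every element satisfying it. Uses O(log n) comparisons.
    /// (Illustrative name, not necessarily what landed in the PR.)
    func firstIndexAssumingSorted(where predicate: (Element) throws -> Bool) rethrows -> Index? {
        var low = startIndex
        var high = endIndex
        while low < high {
            let mid = index(low, offsetBy: distance(from: low, to: high) / 2)
            if try predicate(self[mid]) {
                high = mid
            } else {
                low = index(after: mid)
            }
        }
        return low == endIndex ? nil : low
    }
}

// Hypothetical usage for this PR: find the first token that is not entirely
// before the requested byte range.
// let firstIndex = tokens.firstIndexAssumingSorted { $0.offset + $0.length > byteRange.location }
```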

@PaulTaykalo force-pushed the feature/faster-tokens-resolving branch from 4896721 to 73d26bf on October 26, 2019 00:31
@PaulTaykalo (Collaborator, Author) commented:

@jpsim So, how about this one?

@PaulTaykalo (Collaborator, Author) commented:

This is how I measured that this solution is a bit faster than the previous one.


On Swift Repository

master branch: 3.1% of total time
this branch: 0.0%
[profiler screenshots]


On Kickstarter Repository

master branch: 0.7% of total time
this branch: 0.0%
[profiler screenshots]


On WordPress Repository

master branch: 1.4% of total time
this branch: 0.0%
[profiler screenshots]


In general, the bigger the files in the project, the more speedup can be achieved.
At least this solution is not slower than the previous one.

You probably don't see any improvement mostly because of the average file sizes in the project; that's totally explainable. Technically there's no improvement in the project I'm working on either.

@jpsim force-pushed the feature/faster-tokens-resolving branch from 82b224f to f343f84 on November 6, 2019 21:39
@jpsim (Collaborator) commented Nov 6, 2019

Thanks @PaulTaykalo, I've made a few small edits and will merge when CI passes.

@jpsim (Collaborator) commented Nov 6, 2019

There's an issue with OSSCheck: it didn't run properly. I think it's fallout from running on MacStadium. I'm fixing it now.

@jpsim (Collaborator) commented Nov 6, 2019

Validating OSSCheck fixes here: #2945

@jpsim force-pushed the feature/faster-tokens-resolving branch from f343f84 to ea777be on November 6, 2019 22:53
@jpsim merged commit 6175c00 into realm:master on Nov 6, 2019
@PaulTaykalo deleted the feature/faster-tokens-resolving branch on November 6, 2019 23:21