Optimize NSRange conversion #120

Merged
merged 5 commits into from
Dec 14, 2015
Changes from 4 commits
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,11 @@
+## Master
+
+##### Enhancements
+
+* Optimize `NSRange` operation.
+  [Norio Nomura](https://github.com/norio-nomura)
+  [#120](https://github.com/jpsim/SourceKitten/pull/120)
Owner

Please refer to the issues this addresses (#119) rather than this PR number (as explained in https://github.com/jpsim/SourceKitten/blob/master/CONTRIBUTING.md).
+
 ## 0.7.0
 
 ##### Breaking
50 changes: 20 additions & 30 deletions Source/SourceKittenFramework/String+SourceKitten.swift
@@ -80,11 +80,17 @@ extension NSString {
     */
     public func byteRangeToNSRange(start start: Int, length: Int) -> NSRange? {
         let string = self as String
-        return string.indexOfByteOffset(start).flatMap { stringStart in
-            return string.indexOfByteOffset(start + length).map { stringEnd in
-                return NSRange(location: stringStart, length: stringEnd - stringStart)
-            }
+        let startUTF8Index = string.utf8.startIndex.advancedBy(start)
+        let endUTF8Index = startUTF8Index.advancedBy(length)
+
+        guard let startUTF16Index = startUTF8Index.samePositionIn(string.utf16),
Owner

Are utf16 views cached in the standard library? If not, we should cache the view here (i.e. just calculate it once).

Collaborator Author

When I first filed #119, I tested caching the full results of the offset conversions, but that was not efficient.
With realm/SwiftLint#263, the calls to byteRangeToNSRange() are avoided in the bottleneck case, so I did not include that cache.

Owner

That's not what I mean. I mean: are utf16 views on Strings cached by the Swift standard library?

e.g. is it more efficient to do this:

let utf16View = string.utf16
guard let startUTF16Index = startUTF8Index.samePositionIn(utf16View),
      let endUTF16Index = endUTF8Index.samePositionIn(utf16View) else {
        return nil
}

or is that equivalent to this (current code):

guard let startUTF16Index = startUTF8Index.samePositionIn(string.utf16),
      let endUTF16Index = endUTF8Index.samePositionIn(string.utf16) else {
        return nil
}

Collaborator Author

Oh, I understand.

Collaborator Author

I tested with that change and got the same performance.
It seems the Swift optimizer is clever enough. 😃
But I will apply it anyway.

+              let endUTF16Index = endUTF8Index.samePositionIn(string.utf16) else {
+                return nil
         }
+
+        let location = string.utf16.startIndex.distanceTo(startUTF16Index)
+        let length = startUTF16Index.distanceTo(endUTF16Index)
+        return NSRange(location: location, length: length)
     }

/**
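The hunk above uses Swift 2 spellings (`advancedBy`, `samePositionIn`, `distanceTo`). For readers on a current toolchain, a minimal sketch of the same byte-range to `NSRange` conversion in Swift 5 could look like the following. The function name `utf16Range(in:byteStart:byteLength:)` and the bounds checks are assumptions made for illustration and are not part of SourceKitten's API; the two views are stored in locals once, in the spirit of the review suggestion.

```swift
import Foundation

/// Hypothetical helper (not SourceKitten's API): converts a UTF-8 byte range
/// into an NSRange measured in UTF-16 code units.
func utf16Range(in string: String, byteStart: Int, byteLength: Int) -> NSRange? {
    // Take each view once and reuse it below.
    let utf8View = string.utf8
    let utf16View = string.utf16

    // index(_:offsetBy:limitedBy:) returns nil instead of trapping when the
    // offset runs past the end; negative inputs are rejected up front.
    guard byteStart >= 0, byteLength >= 0,
          let startUTF8 = utf8View.index(utf8View.startIndex, offsetBy: byteStart,
                                         limitedBy: utf8View.endIndex),
          let endUTF8 = utf8View.index(startUTF8, offsetBy: byteLength,
                                       limitedBy: utf8View.endIndex),
          // samePosition(in:) is nil when the byte offset falls inside a
          // multi-byte scalar and so has no exact UTF-16 position.
          let startUTF16 = startUTF8.samePosition(in: utf16View),
          let endUTF16 = endUTF8.samePosition(in: utf16View) else {
        return nil
    }

    let location = utf16View.distance(from: utf16View.startIndex, to: startUTF16)
    let length = utf16View.distance(from: startUTF16, to: endUTF16)
    return NSRange(location: location, length: length)
}

// "a", U+00E9, U+1F600, "b": 1 + 2 + 4 + 1 bytes in UTF-8,
// 1 + 1 + 2 + 1 code units in UTF-16.
let byteSample = "a\u{E9}\u{1F600}b"
if let range = utf16Range(in: byteSample, byteStart: 3, byteLength: 4) {
    print(range.location, range.length) // prints "2 2"
}
```

A start offset of 2 lands in the middle of the two-byte U+00E9, so the sketch returns `nil` there rather than rounding to a neighbouring character.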
@@ -98,11 +104,17 @@ extension NSString {
     */
     public func NSRangeToByteRange(start start: Int, length: Int) -> NSRange? {
         let string = self as String
-        return string.byteOffsetAtIndex(start).flatMap { stringStart in
-            return string.byteOffsetAtIndex(start + length).map { stringEnd in
-                return NSRange(location: stringStart, length: stringEnd - stringStart)
-            }
+        let startUTF16Index = string.utf16.startIndex.advancedBy(start)
+        let endUTF16Index = startUTF16Index.advancedBy(length)
+
+        guard let startUTF8Index = startUTF16Index.samePositionIn(string.utf8),
Owner

same thing about caching utf8 views.

Collaborator Author

Caching exact results probably will not be efficient for most use cases (I don't know of any other than SwiftLint), because the inputs are match results and those are almost always unique.

But caching the latest index might be efficient for SwiftLint.
I will test that.

Owner

I don't mean caching the results of this method, but rather storing utf8 in a variable to reuse it here, rather than calling string.utf8 twice.

+              let endUTF8Index = endUTF16Index.samePositionIn(string.utf8) else {
+                return nil
         }
+
+        let location = string.utf8.startIndex.distanceTo(startUTF8Index)
+        let length = startUTF8Index.distanceTo(endUTF8Index)
+        return NSRange(location: location, length: length)
     }

/**
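The reverse direction, from a UTF-16 `NSRange` to a UTF-8 byte range, can be sketched the same way in Swift 5. As before, the name `byteRange(in:utf16Start:utf16Length:)` and the bounds checks are hypothetical and only illustrate the approach taken in the hunk above; they are not SourceKitten's API.

```swift
import Foundation

/// Hypothetical helper (not SourceKitten's API): converts a range measured in
/// UTF-16 code units into an NSRange measured in UTF-8 bytes.
func byteRange(in string: String, utf16Start: Int, utf16Length: Int) -> NSRange? {
    // Take each view once and reuse it below.
    let utf16View = string.utf16
    let utf8View = string.utf8

    guard utf16Start >= 0, utf16Length >= 0,
          let startUTF16 = utf16View.index(utf16View.startIndex, offsetBy: utf16Start,
                                           limitedBy: utf16View.endIndex),
          let endUTF16 = utf16View.index(startUTF16, offsetBy: utf16Length,
                                         limitedBy: utf16View.endIndex),
          // nil when the UTF-16 offset points at the trailing half of a
          // surrogate pair, which has no exact UTF-8 position.
          let startUTF8 = startUTF16.samePosition(in: utf8View),
          let endUTF8 = endUTF16.samePosition(in: utf8View) else {
        return nil
    }

    let location = utf8View.distance(from: utf8View.startIndex, to: startUTF8)
    let length = utf8View.distance(from: startUTF8, to: endUTF8)
    return NSRange(location: location, length: length)
}

// Same sample as a UTF-16 range: U+1F600 occupies code units 2..<4.
let utf16Sample = "a\u{E9}\u{1F600}b"
if let range = byteRange(in: utf16Sample, utf16Start: 2, utf16Length: 2) {
    print(range.location, range.length) // prints "3 4"
}
```

Returning an optional mirrors the `NSRange?` signature in the diff: callers get `nil` for offsets that split a surrogate pair instead of a silently shifted range.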
@@ -188,28 +200,6 @@ extension NSString {
 }
 
 extension String {
-    /**
-    UTF16 index equivalent to byte offset.
-
-    - parameter offset: Byte offset.
-
-    - returns: UTF16 index, if any.
-    */
-    private func indexOfByteOffset(offset: Int) -> Int? {
-        return utf8.startIndex.advancedBy(offset).samePositionIn(utf16).map(utf16.startIndex.distanceTo)
-    }
-
-    /**
-    Byte offset equivalent to UTF16 index.
-
-    - parameter index: UTF16 index.
-
-    - returns: Byte offset, if any.
-    */
-    private func byteOffsetAtIndex(index: Int) -> Int? {
-        return utf16.startIndex.advancedBy(index).samePositionIn(utf8).map(utf8.startIndex.distanceTo)
-    }
-
     /// Returns the `#pragma mark`s in the string.
     /// Just the content; no leading dashes or leading `#pragma mark`.
     public func pragmaMarks(filename: String, excludeRanges: [NSRange], limitRange: NSRange?) -> [SourceDeclaration] {