-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[5.0] DataProtocol ContiguousCollection and new inline Data #21292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.0] DataProtocol ContiguousCollection and new inline Data #21292
Conversation
Please test with following pull request: @swift-ci Please test |
856b5c8
to
4bbf44d
Compare
I've gone ahead and incorporated the changes from #21290 as well here. |
Please test with following pull request: @swift-ci Please test |
Build failed |
Build failed |
(cherry picked from commit d030354)
4bbf44d
to
482e87d
Compare
Please test with following pull request: @swift-ci Please test |
Build failed |
Build failed |
@itaiferber Did someone review this already? |
@eeckstein Not in a formal capacity yet, no. |
@airspeedswift Can you please review this? |
@@ -90,7 +78,7 @@ internal final class _DataStorage { | |||
} | |||
} | |||
|
|||
@usableFromInline | |||
@inlinable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this needs to be inlinable? It's not a generic function.
@inline(__always) | ||
public init<S: Sequence>(_ elements: S) where S.Element == UInt8 { | ||
// If the sequence is already contiguous, access the underlying raw memory directly. | ||
if let contiguous = elements as? ContiguousBytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: why is this needed? Are there sequences of UInt8
s which cannot implement withContiguousStorageIfAvailable
but can conform to ContiguousBytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes: any sequence of UInt8
's which does not represent typed memory cannot implement withContiguousStorageIfAvailable
without handing over a copy, which defeats what we need to do here. Data
and UnsafeRawBufferPointer
are both examples of such sequences — both have an element type of UInt8
, but represent raw memory, and thus cannot implement withContiguousStorageIfAvailable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments mostly are for future clean-up, though I would suggest inlinability could do with another pass to be certain it's justified everywhere. In some places the internals seem to be unnecessarily exposed in functions that don't seem critical to inline (apologies if experimentation has shown otherwise).
block(UnsafeBufferPointer(start: bytePtr, count: effectiveLength - regionAdjustment), region.location + regionAdjustment - _offset, &stopv) | ||
if stopv { | ||
stop.pointee = true | ||
@inlinable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is pretty huge, and not generic, so seems like there's not a strong reason to inline it. Did you see a noticeable perf difference from marking it inlined? Also it seems like it's the only reason some other functions need to be inlinable/usableFromInline.
guard otherData.count > 0 else { return } | ||
otherData.withUnsafeBytes { | ||
append($0.baseAddress!, length: $0.count) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indenting nit
} | ||
|
||
@_fixed_layout | ||
public struct Data : ReferenceConvertible, Equatable, Hashable, RandomAccessCollection, MutableCollection, RangeReplaceableCollection, MutableDataProtocol, ContiguousBytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend breaking this up some time into individual extensions for each conformance. We've found this helps readability of the code a fair bit in the std lib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree — I didn't want to take on the task right now, but we should do this in the future.
@inlinable | ||
var count: Int { | ||
get { | ||
return numericCast(length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No real need to use numericCast
when you know both types in various places here. This should be Int(length)
. Being explicit about what you're converting to is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change this too
return numericCast(length) | ||
} | ||
set(newValue) { | ||
precondition(newValue <= MemoryLayout<Buffer>.size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is internal, should it be an assert?
} | ||
|
||
@inlinable | ||
var hashValue: Int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the goal here having Data's hashValue match NSData's? If not, you should pass the value through Swift's hashing mechanism to ensure it gets randomly seeded.
//===--- DataProtocol Extensions ------------------------------------------===// | ||
|
||
extension DataProtocol { | ||
public func firstRange<D: DataProtocol>(of data: D) -> Range<Index>? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are generic and entirely protocol-based helpers so should be inlinable.
if range.upperBound == 0 { | ||
self = .empty | ||
} else if InlineData.canStore(count: range.upperBound) { | ||
precondition(range.lowerBound <= endIndex, "index \(range.lowerBound) is out of bounds of \(startIndex)..<\(endIndex)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I missing why the preconditions apply here but not below? Also could they be hoisted, they all seem the same.
#endif | ||
|
||
#if DEPLOYMENT_RUNTIME_SWIFT | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment to other hashValue, ideally this would use Swift's randomization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why GitHub decided to make my comment here about a deleted line instead of the code on the rhs...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird, it looks ok on the code tab
var upperBound: Int { return range.upperBound } | ||
|
||
@inlinable | ||
var count: Int { return range.upperBound - range.lowerBound } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to replacerange.upperBound - range.lowerBound
with range.count
in a number of places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: this and range.isEmpty
— we found during testing that because range
does not implement count
itself (but instead gets the implementation from a default implementation on Collection
), calculating this ourselves was orders of magnitude faster than calling range.count
(dispatching through to the protocol implementation added a lot of overhead), and this was making a big difference in our unit tests.
It does appear that Range
implements isEmpty
directly, so we should be able to fix .isEmpty
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you file a bug for the optimizer for that? We should try and ensure they're equivalent if possible. It might also be something that inlining too much is causing as the inliner might be giving up on a huge function that needs a specialized count
the middle.
} | ||
|
||
public func lastRange<D: DataProtocol>(of data: D) -> Range<Index>? { | ||
return self.firstRange(of: data, in: self.startIndex ..< self.endIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be lastRange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
count = range.upperBound | ||
} | ||
Swift.withUnsafeMutableBytes(of: &bytes) { rawBuffer in | ||
bzero(rawBuffer.baseAddress?.advanced(by: range.lowerBound), range.upperBound - range.lowerBound) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bzero
is deprecated and doesn't exist on some platforms (e.g. windows). Use memset(3)
instead.
http://pubs.opengroup.org/onlinepubs/009696699/functions/bzero.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation is only for Darwin platforms so bzero
is fine, but I'll make this change so things are easier to copy over to swift-corelibs-foundation wholesale.
This is an interesting design and the performance improvements are welcome, but I'd nonetheless like to register my disappointment that this major change to the Swift API never went through swift-evolution. |
@karwa The changes here are Foundation API, and the corelibs process is separate from swift-evolution. If we decide at any point to integrate this work into the standard library directly, we'll certainly go through swift-evolution, just like everyone else. |
@airspeedswift Thanks for the review! I'm going to make some of the functional changes here that actually affect the implementation — as you mention, we can do the inlinability fixes in another pass very soon. We'd like to get the implementation landed first for testing and we have a little wiggle room to re-evaluate inlinability. In general, we are aggressively inlining to give the compiler as much room as possible to make performance optimizations, which we've seen are largely necessary to make this work compelling. We can evaluate on a case-by-case basis if we see this having negative effects. |
A word of warning – this can backfire. Marking things inlinable sometimes slows code down unexpectedly too because it limits what assumptions the compiler can make about internal uses of the inlinable function. It also bloats user binary size. |
Please test with following pull request: @swift-ci Please test |
Build failed |
Build failed |
@airspeedswift Do you have any further blocking concerns at this time? If not, we'll need to merge this and apple/swift-lldb#1138 in concert. (I can merge the lldb change but don't have permission to merge this.) |
I'm OK to merge this. Can you add some tests for the parts that had bugs that didn't get test failures in a follow-up PR? |
|
||
// ... and append the rest byte-wise. | ||
while let element = iter.next() { | ||
Swift.withUnsafeBytes(of: element) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a terrible performance pitfall to me... No benchmark currently covers this, but in my local test, this is 32x!!! slower than hitting the happy _copyContents
path.
I see no API on _Representation to work with individual bytes, so this probably needs to be buffered locally here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@palimondo Sorry for the delay in responding — just got back from vacation.
Just to clarify, do you mean by this comment that we should slurp all of the elements out of the iterator into an array, then append to _representation
, or just that hitting this code path is slow? (Because the latter is somewhat expected — we've got a sequence here which is not ContiguousBytes
, doesn't have an implementation of withContiguousStorageIfAvailable
, and doesn't implement the _copyContents
optimization, AKA, the worst-case scenario.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first. I don't think such slowdown is acceptable. Here's benchmark I got:
…
BenchmarkInfo(name: "Data.append.Sequence.ExactCount",
runFunction: { append($0*100,
sequence: repeatElement(UInt8(0xA0), count: 809), to: medium) }, tags: d),
BenchmarkInfo(name: "Data.append.Sequence.UnderestimatedCount",
runFunction: { append($0*100,
sequence: repeatElementSeq(809), to: medium) }, tags: d),
…
let repeatElementSeq = { count in
return sequence(state: count) { (i: inout Int) -> UInt8? in
defer { i = i &- 1 }; return i > 0 ? UInt8(0xA0) : nil
}
}
@inline(never)
func append<S: Sequence>(_ N: Int, sequence: S, to data: Data)
where S.Element == UInt8 {
for _ in 1...N {
var copy = data
copy.append(contentsOf: sequence)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and just to clarify, we probably don't need to slurp all of the elements out of the iterator in one go, just amortize the costs of going over the UnsafeBufferPointer
by slurping more elements in batches periodically append to the _representation
. We could try reusing InlineData
as the buffer, or test some moderately sized ContiguousArray
…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@palimondo Sure, this is a reasonable change we can and should make. Are you interested in taking this on?
(We should make a similar change in append<S : Sequence>(_: S)
as well.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at the moment. I want to first finish my cleanup pass of the Swift Benchmark Suite, applying the legacyFactor
.
Also, when I have played with this locally (I was println
debugging Data
implementation to see which branch get's exercised), I had trouble getting the build to work properly. Maybe it was just that most of the impl was inlined and the DataBenchmarks were't getting rebuilt? It was a mess.
So I cannot promise I could get back to this sooner than in two weeks...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@palimondo No worries, I'll take care of this, possibly in #21754
Cherry-pick of #20225
(cherry picked from commit d030354)