-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make enums containing ByteBuffer fit in three words #349
Conversation
Sources/NIO/ByteBuffer-core.swift
Outdated
@@ -29,6 +29,86 @@ let sysFree: @convention(c) (UnsafeMutableRawPointer?) -> Void = free | |||
} | |||
#endif | |||
|
|||
struct SadUInt24: ExpressibleByIntegerLiteral { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha... Sad...
;)
Sources/NIO/ByteBuffer-core.swift
Outdated
} | ||
|
||
init?(_ value: UInt32) { | ||
guard value & 0xff_00_00_00 == 0 else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL that you can use _
to make it more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great so far... Just some random comments
Sources/NIOHTTP1/HTTPTypes.swift
Outdated
@@ -19,11 +19,52 @@ let headerSeparator: StaticString = ": " | |||
|
|||
/// A representation of the request line and header fields of a HTTP request. | |||
public struct HTTPRequestHead: Equatable { | |||
private class _Storage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it not really makes a difference but use final
(for all internal classes) ?
Sources/NIOHTTP1/HTTPTypes.swift
Outdated
} | ||
|
||
func copy() -> _Storage { | ||
return _Storage(method: self.method, rawURI: self.rawURI, version: self.version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you either always use .init
or always _Storage
just to make it consistent ?
b35aa27
to
25fae62
Compare
Sources/NIO/ByteBuffer-core.swift
Outdated
|
||
guard sliceBeginIndex <= ByteBuffer.Slice.maxSupportedLowerBound else { | ||
// the slice's begin is past the maximum supported slice begin value (16 MiB) so the only option we have | ||
// is copy the slice into a fresh buffer and the slice begin will then be at index 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/and the/. The/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pull the changes to the HTTP types to a separate PR? It'll make reviewing the changes quite a bit easier. 😄
Sources/NIO/ByteBuffer-core.swift
Outdated
init(_ value: UInt32) { | ||
precondition(value & 0xff_00_00_00 == 0, "value \(value) too large for _UInt24") | ||
self.b12 = UInt16(truncatingIfNeeded: (value & 0xff_ff) >> 0 ) | ||
self.b3 = UInt8(truncatingIfNeeded: (value & 0xff_ff_00) >> 16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can save yourself a masking operation here. You already precondition that the top byte is zero, so with that knowledge in place you can just shift right by 16 bits and store into UInt8
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also as noted in the chat, the mask isn't really right anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops! thanks
Sources/NIO/ByteBuffer-core.swift
Outdated
@@ -29,6 +29,97 @@ let sysFree: @convention(c) (UnsafeMutableRawPointer?) -> Void = free | |||
} | |||
#endif | |||
|
|||
@_versioned | |||
struct _UInt24: ExpressibleByIntegerLiteral { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get a doc comment here please?
Sources/NIO/ByteBuffer-core.swift
Outdated
} | ||
|
||
@_versioned | ||
struct _ByteBufferSlice { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, can we get a doc comment here please?
Sources/NIO/ByteBuffer-core.swift
Outdated
@@ -486,8 +577,19 @@ public struct ByteBuffer { | |||
} | |||
let index = _toIndex(index) | |||
let length = _toCapacity(length) | |||
let sliceBeginIndex = self._slice.lowerBound + index | |||
|
|||
guard sliceBeginIndex <= ByteBuffer.Slice.maxSupportedLowerBound else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the word "begin" here is odd. Start might be better.
Sources/NIO/FileRegion.swift
Outdated
@@ -12,6 +12,68 @@ | |||
// | |||
//===----------------------------------------------------------------------===// | |||
|
|||
struct _UInt56: ExpressibleByIntegerLiteral { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to consolidate our helper integers into a single source file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough, will do that
Sources/NIO/FileRegion.swift
Outdated
precondition(value & 0xff_00_00_00_00_00_00_00 == 0) | ||
self.init(b1234: UInt32(truncatingIfNeeded: (value & 0xff_ff_ff_ff) >> 0 ), | ||
b56: UInt16(truncatingIfNeeded: (value & 0xff_ff_00_00_00_00) >> 32), | ||
b7: UInt8(truncatingIfNeeded: (value & 0xff_00_00_00_00_00_00) >> 48)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given your earlier precondition, this mask is unnecessary.
d7f91d4
to
1f36f2d
Compare
Sources/NIO/FileRegion.swift
Outdated
|
||
extension _UInt56 { | ||
init(_ value: UInt64) { | ||
precondition(value & 0xff_00_00_00_00_00_00_00 == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should all of these precondition
be assert
as this is not really something the user can trigger but only ourselves if our code has a bug.
Sources/NIO/IntegerTypes.swift
Outdated
|
||
extension _UInt24 { | ||
init(_ value: UInt32) { | ||
precondition(value & 0xff_00_00_00 == 0, "value \(value) too large for _UInt24") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weissi should this be an assert as there is no way for the user to actual construct this ? I think we usually use assert
for things that would only happen because of a bug in NIO itself
Sources/NIO/IntegerTypes.swift
Outdated
|
||
extension _UInt56 { | ||
init(_ value: UInt64) { | ||
precondition(value & 0xff_00_00_00_00_00_00_00 == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weissi should this be an assert as there is no way for the user to actual construct this ? I think we usually use assert
for things that would only happen because of a bug in NIO itself
Also consider adding a message as well
a445186
to
6c91e27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor nit: feel free to either address it or not depending on how you feel about it.
Tests/NIOTests/ByteBufferTest.swift
Outdated
|
||
let offset = Int(_UInt24.max) + 1 | ||
let expectedReadablebytes = totalBufferSize - offset | ||
let slice = buf.getSlice(at: offset, length: expectedReadablebytes)! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit: it might be better if we sliced the same size as the test above, to isolate the test differences to only being the slice start index, rather than the slice sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed and improved test (by marking some 'important' positions with different byte values)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after @Lukasa's nit is addressed... Great work!
Motivation: NIO's dynamic pipeline comes with one performance caveat: If your type isn't specialised in NIOAny (only types in the NIO module can be) and it's over 3 words (24 bytes on 64-bit platforms), you'll suffer an allocation for every time you box it in NIOAny. Very unfortunately, both ByteBuffer and FileRegion were exactly 3 words wide which means that any enum containing those would be wider than 3 words. Worse, both HTTP{Request,Response}Head contained types that were wider than 3 words. The best solution to this problem is to shrink ByteBuffer and FileRegion to just under 3 words and that's exactly what this PR is doing. That is slightly tricky as ByteBuffer was already bit packed fairly well (reader/writer indices & slice begin/end were stored as a UInt32). The trick we employ now is to store the slice beginning in a UInt24 and the file region reader index in a UInt56. That saves one byte for both ByteBuffer and FileRegion with very moderate tradeoffs: The reader index in a file now needs to be within 64 PiB (peta bytes) and a byte buffer slice beginning must start within 16 MiB (mega bytes). Note: The reader/writer indices as well as slice ends are _not_ affected and can still be within 4 GiB. Clearly no one would care about the restrictions for FileRegions in the real world but we might hit the ByteBuffer slice beginning limit in which case the slice would be copied out. But given that slices are mostly used to slice off headers in network protocols, 16 MiB should be _plenty_. Norman was kind enough to measure the perf differences: master (before this PR): ``` $ wrk -c 256 -d 10s -t 4 -s ~/Downloads/pipeline-many.lua -H "X-Host: SomeValue" -H "Host: swiftnio.io" -H "ThereAreEvenMoreHeaders: AndMoreValues" http://localhost:8888 Running 10s test @ http://localhost:8888 4 threads and 256 connections Thread Stats Avg Stdev Max +/- Stdev Latency 93.02ms 158.13ms 1.97s 93.07% Req/Sec 112.79k 26.05k 258.75k 84.50% 4501098 requests in 10.04s, 214.63MB read Socket errors: connect 0, read 111, write 0, timeout 89 Requests/sec: 448485.61 Transfer/sec: 21.39MB ``` after this PR: ``` $ wrk -c 256 -d 10s -t 4 -s ~/Downloads/pipeline-many.lua -H "X-Host: SomeValue" -H "Host: swiftnio.io" -H "ThereAreEvenMoreHeaders: AndMoreValues" http://localhost:8888 Running 10s test @ http://localhost:8888 4 threads and 256 connections Thread Stats Avg Stdev Max +/- Stdev Latency 107.53ms 206.56ms 1.99s 90.55% Req/Sec 124.15k 26.56k 290.41k 89.25% 4952904 requests in 10.03s, 236.17MB read Socket errors: connect 0, read 161, write 0, timeout 22 Requests/sec: 493852.65 Transfer/sec: 23.55MB ``` so we see a nice 10% improvement Modifications: - shrank ByteBuffer by 1 byte, making it 23 bytes in total - shrank FileRegion by 1 byte, making it 23 bytes in total - added @_inlineable to NIOAny where appropriate to not suffer boxed existentials in different places - added tests for the new edge cases Result: - more speed - fewer allocations
@weissi please merge once CI completes |
CC @tanner0101 & @helje5
only really makes sense together with #351
Motivation:
NIO's dynamic pipeline comes with one performance caveat: If your type
isn't specialised in NIOAny (only types in the NIO module can be) and
it's over 3 words (24 bytes on 64-bit platforms), you'll suffer an
allocation for every time you box it in NIOAny. Very unfortunately, both
ByteBuffer and FileRegion were exactly 3 words wide which means that any
enum containing those would be wider than 3 words. Worse, both
HTTP{Request,Response}Head contained types that were wider than 3 words.
The best solution to this problem is to shrink ByteBuffer and FileRegion
to just under 3 words and that's exactly what this PR is doing. That is
slightly tricky as ByteBuffer was already bit packed fairly well
(reader/writer indices & slice begin/end were stored as a UInt32). The
trick we employ now is to store the slice beginning in a UInt24 and the
file region reader index in a UInt56. That saves one byte for both
ByteBuffer and FileRegion with very moderate tradeoffs: The reader index
in a file now needs to be within 64 PiB (peta bytes) and a byte buffer
slice beginning must start within 16 MiB (mega bytes). Note: The
reader/writer indices as well as slice ends are not affected and can
still be within 4 GiB. Clearly no one would care about the restrictions
for FileRegions in the real world but we might hit the ByteBuffer slice
beginning limit in which case the slice would be copied out. But
given that slices are mostly used to slice off headers in network
protocols, 16 MiB should be plenty.
Norman was kind enough to measure the perf differences:
master (before this PR):
after this PR:
so we see a nice 10% improvement and allocations go down by 2, too:
from
Modifications:
Result: