-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch DirectoryControl to use AsnWriter, AsnDecoder #101512
base: main
Are you sure you want to change the base?
Conversation
Replaced this with the managed AsnDecoder, removing PInvoke from a potential hot path.
Also removed the manual API calls to ldap_create_sort_control - this is now built in managed code. This then has knock-on effects to eliminate the SortKeyInterop classes.
Most of the Control tests were hardcoded to the output of BerConverter, which uses four-byte lengths in all cases. This behaviour is now different: the same output is returned across all platforms for .NET, and remains unchanged for .NET Framework. This should also close issue 34679.
Reduce number of copies required in TransformControls, and enable these copies to take advantage of newer intrinsics where available.
Windows domain controllers may return a distinguished name starting with OU=, rather than ou=.
Out of curiosity, any benchmarks for perf. numbers before/after switching to |
I've not got benchmarks right now, but will write some in the next few days. In advance of these, I expect there'll be a modest reduction in managed and unmanaged memory usage, and that execution time will reduce (while remaining within the margin of error for the network request itself.) |
Preallocating space for AsnWriter buffers to reduce memory usage. Correctly handling attribute names in SortControls.
Benchmarks are below. To summarize:
Performance header
AsqRequestControl.GetValue: -90% execution time, -35% Gen0 memory allocation
CrossDomainMoveControl.GetValue: -54% execution time, -62% Gen0 memory allocation
DirSyncRequestControl.GetValue: -89% execution time, -35% Gen0 memory allocation
ExtendedDNControl.GetValue: -89% execution time, -30% Gen0 memory allocation
PageResultRequestControl.GetValue: -90% execution time, -40% Gen0 memory allocation
QuotaControl.GetValue: -92% execution time, -28% Gen0 memory allocation
SearchOptionsControl.GetValue: -90% execution time, -30% Gen0 memory allocation
SecurityDescriptorFlagControl.GetValue: -88% execution time, -30% Gen0 memory allocation
SortRequestControl.GetValue: -79% execution time, +4% Gen0 memory allocation
DirectoryControl.TransformControls: -75% execution time, -76% Gen0 memory allocation
VerifyNameControl.GetValue: -87% execution time, -46% Gen0 memory allocation
VlvRequestControl.GetValue: -84% execution time, -69% Gen0 memory allocation
I've made a performance adjustment by specifying the initial size of AsnWriter, since this is trivial to calculate (or always static.) AsnWriter grows in 1KB increments, which is much larger than the size of a normal directory control and causes memory usage to balloon. One inefficiency which I couldn't eliminate is that when writing strings as ASN.1 octet strings, I want to manually select the encoding to use and encode directly into the AsnWriter buffer. This isn't possible, (probably to keep AsnWriter specification-compliant) so I have to reserve/allocate a byte array, encode into that and write that out as an octet string. An example of this behaviour is in Edit: the updated build has completed and the test failures are unrelated, so I'm now happy that the benchmarks are valid @PaulusParssinen |
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
I've run the usual LDAP tests against a Windows domain controller, and all have passed as expected (including I'm happy with the changes made to pool AsnWriters. With the inlining hint on |
{ | ||
// The control is a PageControl. | ||
object[] result = BerConverter.Decode("{iO}", value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking to see if there are any interesting edge cases popping up from this, I found a case where we're increasing the range of supported values, in a surprising way.
The DER sequence 30 05 02 01 03 04 00
is a SEQUENCE(INTEGER(3), OCTET_STRING(''))
. Under BER (and CER) it can (or, for CER, must) also be encoded with an indefinite length: 30 80 02 01 03 04 00 00 00
(change the length of the sequence from 0x05 to 0x80 and add two 0x00 values on the end)
I was incredibly surprised to learn that BerConverter.Decode
doesn't seem to support indefinite-length encoding.
If it's important to continue NOT supporting indefinite-length encoding, that will require special handling.
The main thing I was trying to test for is "does the existing support trailing data", since the new code seems to. And it looks like yes (as measured from the perspective of BerDecoder).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's caught me by surprise too, thanks for pointing it out. BerConverter.Decode
's lack of support for indefinite-length encoding is consistent between Windows and Linux - they may have interpreted section 5.1 of RFC4511, which states that LDAP shall be encoded for exchange using BER and only use the definite form of length encoding, in order to ease the encoding/decoding overhead. Since their ber_scanf
interpretation knows it'll never need to deal with indefinite length encoding, it's just not supported.
I don't think it precludes us from using an ASN.1 parser which supports it. Anyone who sends this is breaching the protocol specification. Section 4.1.1 notes:
If the server receives an LDAPMessage from the client in which the
LDAPMessage SEQUENCE tag cannot be recognized, the messageID cannot
be parsed, the tag of the protocolOp is not recognized as a request,
or the encoding structures or lengths of data fields are found to be
incorrect, then the server SHOULD return the Notice of Disconnection
described in Section 4.4.1, with the resultCode set to protocolError,
and MUST immediately terminate the LDAP session as described in
Section 5.3.In other cases where the client or server cannot parse an LDAP PDU,
it SHOULD abruptly terminate the LDAP session (Section 5.3) where
further communication (including providing notice) would be
pernicious. Otherwise, server implementations MUST return an
appropriate response to the request, with the resultCode set to
protocolError.
As I read it: if the client receives an LDAP PDU containing an indefinite-length BER-encoded value from the server but can still parse this PDU without any errors, the LDAP specification doesn't explicitly define the expected client-side action.
I've read the intent of the RFC's restriction as being solely to reduce encoding/decoding overhead, so unless you'd prefer to flag the server's breach of RFC4511 with an exception, I'm comfortable with DirectoryControl.TransformControls
silently parsing any indefinite-length BER value properly and proceeding as it currently does.
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
cookie = AsnDecoder.ReadOctetString(asnSpan, AsnEncodingRules.BER, out bytesConsumed); | ||
asnSpan = asnSpan.Slice(bytesConsumed); | ||
} | ||
ThrowUnless(asnSpan.IsEmpty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a fan of this check (it matches what the RFC says), but it's not something that BerConverter.Decode does, so it's introducing an exception where there wasn't one previously.
That's not forbidden, but there should be a test for it (maybe there already is) and it needs to be part of a breaking change doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There doesn't seem to be any self-contained test coverage of the controls decoded in this method; the closest it could be said to come is the processing which takes place when somebody tests against a real LDAP server. I'll add this coverage, then broaden it to explicitly test trailing data in SEQUENCEs.
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Show resolved
Hide resolved
* Ensured that all references to DirectoryControl names in comments are accurate. * Specified the name of the ASN.1 structure which is being parsed in the comments. * Added explanatory comment stating that BER INTEGER values are being interpreted as 32-bit signed integers.
} | ||
asnSpan = asnSpan.Slice(bytesConsumed); | ||
|
||
attribute = s_utf8Encoding.GetString(attributeNameBuffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is using the throw-on-invalid form of the UTF-8 decoder there should be a catch for DecoderFallbackException (probably turning it into a BerConversionException)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is also a behavioral change which is virtuous but probably belongs in a breaking change doc.
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
...irectoryServices.Protocols/src/System/DirectoryServices/Protocols/common/DirectoryControl.cs
Outdated
Show resolved
Hide resolved
{ | ||
UTF8Encoding encoder = new UTF8Encoding(); | ||
_target = encoder.GetBytes(target); | ||
_target = s_utf8Encoding.GetBytes(target); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes invalid inputs (such as "\uDD74\uD800"
) throw an EncoderFallbackException
, which it used to not do.
I'm always a fan of rejecting nonsense inputs; but it's an observable change so it needs to have a test, and documented as a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@buyaa-n At this point I have completed my review of the technical impacts (that I noticed) of changing from BerConverter to AsnDecoder; with one outstanding issue regarding the encoding of SortResult/SortResponseControl's attribute value (https://github.com/dotnet/runtime/pull/101512/files?w=1#r1690564822).
I strongly oppose the current PR's efforts to do manual bookkeeping while trying to maintain multiple AsnWriter instances tiered by the size of their internal buffers; but it's really your call, not mine.
The test changes where patterns like 84 00 00 00 12
turn into 12
are all correct, those are both valid encodings (under BER) for the length 18 (0x12).
I am not personally confident that there's sufficient test coverage for this library to call this a "low risk" change. I haven't done a thorough investigation of the coverage, only looked at what is in this diff.
The DirectoryControl.cs
file really ought to be broken up into one-class-per-file... but definitely not concurrent with changes.
I am neither granting a checkmark nor mandating changes required. I defer those decisions to the area owners.
* Refactored LDAP string writing logic into a separate WriteLdapString method in AsnWriterExtensions. * Replaced separate AsnWriter instances with a single one (reduced memory usage when compared to multiple instances being active, even with varying buffer sizes.) * Resolved backwards-compatibility issue when attempting to process an attribute name which contains invalid UTF-8 characters. * Correcting expected ASN tag when parsing an attribute name in a sort result.
I've resolved the outstanding issue with SortResponseControl, and addressed bartonjs' feedback by removing the tiered pooled AsnWriters.
This comes at roughly the same point as I write the tests for the five DirectoryControls being modified, and their behavioural changes. Instead of including all of these tests in this PR, I'll extract the majority of those tests (including cases where BerConverter's parsing is lax/nonconformant to standard) into their own PR as a baseline against the existing method. I think that having this baseline should handle the direct risk, and we can come back to the current PR after the tests are verified passing on main. |
@edwardneal this is what we wanted to have before merging/validating this PR. Glad to hear that you are planning to do it, now the main branch opened for |
Relates to #97540.
This PR replaces all references to
BerConverter
in LDAP directory control generation/parsing to use AsnWriter and AsnDecoder. SortRequestControl didn't use BerConverter directly - it called the OpenLDAP and WLDAPldap_create_sort_control
APIs instead. This class was the only thing in S.DS.P which referenced theldap_create_sort_control
andSortKeyInterop
struct, so I deleted them both.SortRequestControl
The change to SortRequestControl's generation mechanism might also resolve #34679, since there shouldn't be any mechanism for the heap corruption to occur.
Most of the SortRequestControl's new ASN.1 encoding is pretty uncontroversial, but there was a bit of discussion in PR #65548 around the encoding of the sort key's attribute name, and this was marshalled (as part of SortKeyInterop) with different encodings between Windows and Linux. In the RFC, this is defined (indirectly) as an LdapString; this is described as ISO10646 characters, encoded as a UTF-8 string and represented as an OCTET STRING. I'm fairly sure that UTF8Encoding.GetBytes fulfils this, and running the associated test case against a real AD domain controller passes.
Test changes
There are also test changes, but these are largely to change the special-casing of expected byte values between OpenLDAP and WLDAP - .NET now generates these values in a consistent format (the OpenLDAP format) regardless of platform. The .NET Framework tests continued to use the version of S.DS.P from the GAC in my environment, so I've special-cased by the framework version rather than by the platform.
Misc. optimizations
There were a handful of byte-by-byte array copies, which I've switched over to using span-based copies in hopes that they'll benefit slightly from vectorisation. TransformControls and GetValue have a related change: where they used to reference properties returning byte arrays (which took defensive copies) they now reference the property values directly. These should both reduce GC traffic slightly.