Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LdapException messages on Unix do not include detail #46021

Closed
danmoseley opened this issue Dec 13, 2020 · 11 comments · Fixed by #47703
Closed

LdapException messages on Unix do not include detail #46021

danmoseley opened this issue Dec 13, 2020 · 11 comments · Fixed by #47703

Comments

@danmoseley
Copy link
Member

danmoseley commented Dec 13, 2020

Right now, many LdapException messages on Unix look like:

System.DirectoryServices.Protocols.LdapException: The LDAP server returned an unknown error.
    at System.DirectoryServices.Protocols.LdapConnection.BeginSendRequest(DirectoryRequest request, TimeSpan requestTimeout, PartialResultProcessing partialMode, AsyncCallback callback, Object state)

There is no way to tell from this message what the error was. For example in the case above say the ErrorCode was -7, then the exception message should include text like The search filter is invalid. . As it is you need to run under the debugger to get it, then look up in ldap.h to translate it. Examples of bugs where the specific message would have helped: #43621 #41617 #36947

Generally these codes are dictated by an RFC that all platforms follow. A subset are represented in our code, across two enums:

  • ResultCode matches LDAP_SUCCESS through LDAP_OTHER in LDAP_RETCODE in Winldap.h (excepting LDAP_IS_LEAF and LDAP_INVALID_CREDENTIALS)
  • LdapError matches LDAP_IS_LEAF, LDAP_INVALID_CREDENTIALS, and LDAP_SERVER_DOWN through LDAP_REFERRAL_LIMIT_EXCEEDED plus LDAP_OPT_REFERRAL_CALLBACK in LDAP_RETCODE in Winldap.h

I went through every one and they all match in name and value between Winldap.h and the Unix ones in ldap.h, except most of ResultCode have different values: LDAP_SERVER_DOWN through LDAP_REFERRAL_LIMIT_EXCEEDED have the values -1 through -17, whereas in Windows they have values 0x51-0x61

(There are various other codes on both sides that aren't represented in the enumerations, but presumably we'll spot those if they ever happen and extend the enum/add messages.)

The fix is simply to extract the LdapError enum into LdapError.Windows.cs and LdapError.Linux.cs with the appropriate values for the platform. We should append the error code in the case that there is no mapping (ie The LDAP server returned an unknown error '{0}'.)

mappings required

For reference, here's the Unix values

#define LDAP_SERVER_DOWN				(-1)
#define LDAP_LOCAL_ERROR				(-2)
#define LDAP_ENCODING_ERROR				(-3)
#define LDAP_DECODING_ERROR				(-4)
#define LDAP_TIMEOUT					(-5)
#define LDAP_AUTH_UNKNOWN				(-6)
#define LDAP_FILTER_ERROR				(-7)
#define LDAP_USER_CANCELLED				(-8)
#define LDAP_PARAM_ERROR				(-9)
#define LDAP_NO_MEMORY					(-10)
#define LDAP_CONNECT_ERROR				(-11)
#define LDAP_NOT_SUPPORTED				(-12)
#define LDAP_CONTROL_NOT_FOUND			(-13)
#define LDAP_NO_RESULTS_RETURNED		(-14)
#define LDAP_MORE_RESULTS_TO_RETURN		(-15)	/* Obsolete */
#define LDAP_CLIENT_LOOP				(-16)
#define LDAP_REFERRAL_LIMIT_EXCEEDED	(-17)

and here's the Windows values

    LDAP_OTHER                      =   0x50,
    LDAP_SERVER_DOWN                =   0x51,
    LDAP_LOCAL_ERROR                =   0x52,
    LDAP_ENCODING_ERROR             =   0x53,
    LDAP_DECODING_ERROR             =   0x54,
    LDAP_TIMEOUT                    =   0x55,
    LDAP_AUTH_UNKNOWN               =   0x56,
    LDAP_FILTER_ERROR               =   0x57,
    LDAP_USER_CANCELLED             =   0x58,
    LDAP_PARAM_ERROR                =   0x59,
    LDAP_NO_MEMORY                  =   0x5a,
    LDAP_CONNECT_ERROR              =   0x5b,
    LDAP_NOT_SUPPORTED              =   0x5c,
    LDAP_NO_RESULTS_RETURNED        =   0x5e,
    LDAP_CONTROL_NOT_FOUND          =   0x5d,
    LDAP_MORE_RESULTS_TO_RETURN     =   0x5f,

    LDAP_CLIENT_LOOP                =   0x60,
    LDAP_REFERRAL_LIMIT_EXCEEDED    =   0x61
@danmoseley
Copy link
Member Author

cc @joperezr

@joperezr
Copy link
Member

Totally agree that our error messages in Linux could and should be much better. Thanks @danmosemsft for the detailed investigation. Without going into too much detail, the issue here is that all search requests in Linux always are handled in two steps by the native library: 1) you send the request to libldap which gives you back an id to later query for the result, and 2) you query the library using that id to see the results and if there was an error with the request, that second operation returns -1 which is were we get the error message from. That said, every time that happens we also call libldap again to get the reason of the failure (which not only gives you the error code from your table above, but actually gives you the error message itself to) we are just not returning this error code and message back to the end user which is what we have to fix. I'll add a pointer here to the line that has the required info to bubble up in case someone wants to try to fix this before I get to it.

@joperezr
Copy link
Member

This is the important line

resultError = ConstructParsedResult(ldapResult, ref serverError, ref responseDn, ref responseMessage, ref responseReferral, ref responseControl);

servererror and responseMessage there have the info we need.

@danmoseley danmoseley added the help wanted [up-for-grabs] Good issue for external contributors label Jan 17, 2021
@danmoseley
Copy link
Member Author

@joperezr hmm, just based on code inspection, I'm not sure I see what work there is beyond fixing the mapping.

  1. After the line you pointed to, serverError seems properly treated, eg turned into an exception here if necessary


    or looks like in System.DirectoryServices.Protocols - Cannot use objectCategory filter on linux #43621 it may be in the original BeginSendRequest here

    or in the case of System.DirectoryServices.Protocols - FastConcurrentBind does not work on linux #41617 in

    ending up in something like
    else if (Utility.IsLdapError((LdapError)error))
    {
    string errorMessage = LdapErrorMappings.MapResultCode(error);
    throw new LdapException(error, errorMessage);

    so I'm fairly confident everything is wired up correctly, it's just that the mapping in LdapError is Windows specific and needs to include the Linux codes which for some reason are different.

  2. There is one other potential issue though, sometimes the ServerErrorMessage is put in the exception too eg

    else if (Utility.IsLdapError((LdapError)error))
    {
    string errorMessage = LdapErrorMappings.MapResultCode(error);
    string serverErrorMessage = SessionOptions.ServerErrorMessage;
    if (!string.IsNullOrEmpty(serverErrorMessage))
    {
    throw new LdapException(error, errorMessage, serverErrorMessage);
    }
    throw new LdapException(error, errorMessage);

    which gets it via
    internal string ServerErrorMessage => GetStringValueHelper(LdapOption.LDAP_OPT_SERVER_ERROR, true);

    and there may be a problem there as we use the Windows defines as follows
    LDAP_OPT_ERROR_NUMBER = 0x31,
    LDAP_OPT_ERROR_STRING = 0x32,
    LDAP_OPT_SERVER_ERROR = 0x33,
    LDAP_OPT_SERVER_EXT_ERROR = 0x34, // Not Supported in Linux

    whereas the Linux headers have

#define LDAP_OPT_RESULT_CODE			0x0031
#define LDAP_OPT_ERROR_NUMBER			LDAP_OPT_RESULT_CODE
#define LDAP_OPT_DIAGNOSTIC_MESSAGE		0x0032
#define LDAP_OPT_ERROR_STRING			LDAP_OPT_DIAGNOSTIC_MESSAGE
#define LDAP_OPT_MATCHED_DN			0x0033
/* 0x0034 - 0x3fff not defined */

so when we request LDAP_OPT_SERVER_ERROR = 0x33 on Unix, it looks like we're actually getting LDAP_OPT_MATCHED_DN ("Sets/gets a string containing the matched DN associated to the LDAP handle") which is probably wrong. On Unix, it looks like we should be requesting LDAP_OPT_DIAGNOSTIC_MESSAGE = 0x32.

@danmoseley
Copy link
Member Author

If so then I'm guessing 90% of fixing this is just setting up the environment...

@danmoseley danmoseley self-assigned this Jan 21, 2021
@danmoseley danmoseley removed untriaged New issue has not been triaged by the area owner help wanted [up-for-grabs] Good issue for external contributors labels Jan 21, 2021
@danmoseley danmoseley added this to the 6.0.0 milestone Jan 21, 2021
@danmoseley
Copy link
Member Author

danmoseley commented Jan 21, 2021

I checked all the other LDAP_OPT_ in the code against the ldap.h and all the others are either present with the same code or we have marked //NotSupportedinLinux. It is just LDAP_OPT_SERVER_ERROR which isn't on Linux but isn't marked as such.

I did find this ancient mention of the discrepancy
https://microsoft.public.adsi.general.narkive.com/Dm3hdaOD/adam-extended-server-error-and-winldap
which suggests that (at least in this case) LDAP_OPT_SERVER_ERROR is the most useful on Windows, not LDAP_OPT_ERROR_STRING/LDAP_OPT_DIAGNOSTIC_MESSAGE. So I think Windows is right, but Unix needs to use the latter.

On Windows we're also not trying LDAP_OPT_SERVER_EXT_ERROR. No idea whether that is useful or not - no plans to try.

@danmoseley
Copy link
Member Author

danmoseley commented Jan 30, 2021

@joperezr @tarekgh the tests for S.DS.P contain instructions for using the OpenDJ server, but I wonder whether that is legacy from the original port to .NET Core, since the tests currently seem to assume an Active Directory LDAP server because they append the domain to the LDAP URL.

It also looks like only 9 of 526 tests hit a server, which will skip unless one is set up, do you know when we last ran those?

It took a while (since I am not familiar with LDAP) but with some test fixes all but one of the tests pass against an OpenDJ docker container. Going forward we should probably test both, and maybe the slapd/Open LDAP server if there is an image for it.

Anyway once I get those all passing, I can look at this and maybe some of the other bugs that may be specific to non-AD LDAP servers.

@joperezr
Copy link
Member

joperezr commented Jan 30, 2021

It is correct that we only have 9 tests that actually hit a server, all the rest are just unit tests that test the interop with the native library, as well as the c# layer we have.

Those 9 tests should theoretically work against any Ldap server (even though they have only ran against Active Directory), and in order to run them, you only need to manually edit the LDAP.Configuration.xml file on the unit tests project. We opted to not run them by default since that would require for us to checkin credentials on that xml file (with the way that things are setup currently), and given the fact that this code is not usually touched, we instead have manually ran them locally every time we do make some change on the code to make sure it works. When we do it, we run against an Active Directory server that we have setup, I can share details with you for connecting with it offline.

As to when was the last time we ran them, I ran them every time I made a change when writing the Linux implementation (~6 months ago) and the code hasn't been touched since, so I would imagine they should be passing just fine today. We should definitely figure out a better way of running these regularly without having to checkin credentials as part of our outer-loop tests, one idea I had but haven't had the chance to implement yet was to set up tests against an actual Ldap server using a docker farm of clients/servers so that we don't have a network dependency and everything is just running inside different docker containers that emulate a network. This is the way that our enterprise Http tests are setup, so we should look into doing something similar for DirectoryServices.

@danmoseley
Copy link
Member Author

Sounds good. If we have a way to run a docker container somehow then automated testing is easy and I would imagine as you say that might be useful for networking tests too. Running manually is ok for a while more but it probably sounds be AD + another. I found your AD instructions and I'll try those.

@joperezr
Copy link
Member

Here are the enterprise setup instructions if you are interested in checking out how the docker farm works, if you follow those steps you should be able to run the tests and we know they are passing since we currently have a build pipeline that runs them. That is what I want to do with DirectoryServices in the future.

@danmoseley
Copy link
Member Author

With respect to the discussion about LDAP_OPT_SERVER_ERROR (which is actually LDAP_OPT_MATCHED_DN for OpenLDAP) and LDAP_OPT_ERROR_STRING above, I wrote various tests to trigger error conditions and ran them on both a Windows client and on a Linux client against OpenDJ, SlapD, and AD servers.

The results - zero output from the former, and almost zero from the latter (only occasionally a string "FilterError"). So the fact that we're calling the wrong thing in some cases - not really an issue.

The place where the good messages are is on directoryResponse.ErrorMessage and that may be worth including in the exception messages.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 1, 2021
@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Feb 5, 2021
@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Feb 12, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 22, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants