Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with auth bearer token validation (unreleased) #177

Closed
gregd33 opened this issue Dec 13, 2022 · 9 comments · Fixed by #178
Closed

Issue with auth bearer token validation (unreleased) #177

gregd33 opened this issue Dec 13, 2022 · 9 comments · Fixed by #178

Comments

@gregd33
Copy link

gregd33 commented Dec 13, 2022

I was looking into oauth-related things and noticed there were some recent commits that haven't been released that break our workflow.

In our use case, we are using an oauth2-proxy that sits outside of weaviate. That is, weaviate itself knows nothing about authentication and it happens at the nginx ingress layer.

We had previously would get a token and use it like this:

client = weaviate.Client(URL, additional_headers= {"Authorization": f"Bearer {token['access_token']}"})

It seems that this now breaks. I think it is due to this line:
https://github.com/semi-technologies/weaviate-python-client/blob/main/weaviate/connect/connection.py#L94

Since we aren't using weaviate's auth, the validation here will fail.

Is it possible to remove this validation or make it optional?

@dirkkul
Copy link
Collaborator

dirkkul commented Dec 13, 2022

Hi,
thanks for your input, can you double check that this fixes your setup?

#178

Thanks!

@gregd33
Copy link
Author

gregd33 commented Dec 13, 2022

Thank you for the quick response.! It did not fix the issue but it did lead me to figure out the core issue and a workaround.

The real issue is here:
https://github.com/semi-technologies/weaviate-python-client/blob/main/weaviate/connect/connection.py#L119-L131

In our deployment, we have nginx.ingress.kubernetes.io/auth-signin enabled so that when a user goes to any URL they are redirected to the oauth login page. The problem is that this means a requests.get gives a 200 response as it just returns the login page. So the code above thinks we have weaviate open-id configured - both in the current main code and with your branch.

I think in order for it to work, the check would have to go a bit further before assuming we are in the weaviate/openid state. A working, if inelegant, example is below.

That said, removing the nginx.ingress.kubernetes.io/auth-signin annotation solves the problem so that is a workaround at least (at least for us as we only need programmatic access). This workaround works with the code currently in main so this branch wouldn't be needed.

That said, I believe something like this would be a potential solution by validating the response is JSON. There is probably a better way to validate the response though.

        resp = None
        if response.status_code == 200:
            import simplejson
            try:
                resp = response.json() #resp is not None iff we get a 200 response with valid json back
            except simplejson.JSONDecodeError:
                pass

        if resp is not None and auth_client_secret is None:
                raise AuthenticationFailedException(
                f""""No login credentials provided. The weaviate instance at {self.url} requires login credential,
                use argument 'auth_client_secret'."""
            )
        elif resp is not None:
            _auth = _Auth(resp, auth_client_secret, self)
            self._session = _auth.get_auth_session()

            if isinstance(auth_client_secret, AuthClientCredentials):
                # credentials should only be saved for client credentials, otherwise use refresh token
                self._create_background_token_refresh(_auth)
            else:
                self._create_background_token_refresh()
        elif response.status_code == 404 and auth_client_secret is not None:
            _Warnings.auth_with_anon_weaviate()
            self._session = requests.Session()
        else:
            self._session = requests.Session()

@dirkkul
Copy link
Collaborator

dirkkul commented Dec 13, 2022

alright interesting :)
I updated the branch and the client now uses a session without authentication if it cannot parse the weaviate oidc response. Can you give it another try?

@gregd33
Copy link
Author

gregd33 commented Dec 13, 2022

Yes, that works thanks!

However, one complication: there is a known issue with json.JSONDecodeError failing when simplejson is also in the environment- that's why I used simplejson in my example. Your solution works fine when simplejson isn't present but breaks when it is.

psf/requests#4842

Maybe something like this is needed?

try:
    from simplejson import JSONDecodeError
except:
    from json import JSONDecodeError

@gregd33
Copy link
Author

gregd33 commented Dec 13, 2022

I will mention a possible regression though: if I try to give a URL without the headers, I previously got:

weaviate.Client("https://xxxx/")

ValueError: No login credentials provided. The weaviate instance at https://xxxx/requires login credential, use argument 'auth_client_secret'.

I now just get a JSON decode error as this isn't caught. I'm fine with that but I thought I would mention it

@dirkkul
Copy link
Collaborator

dirkkul commented Dec 14, 2022

I will mention a possible regression though: if I try to give a URL without the headers, I previously got:

I think that was an artifact of your proxy. Before it would just check if the OIDC config returns a 200 status code to determine if you need authentication, now it also checks if it can decode it. So I think this is working as intended

@dirkkul
Copy link
Collaborator

dirkkul commented Dec 14, 2022

However, one complication: there is a known issue with json.JSONDecodeError failing when simplejson is also in the environment- that's why I used simplejson in my example. Your solution works fine when simplejson isn't present but breaks when it is.

Thanks for letting me know, what an annoying design of the requests library. I added your codesnippet

@gregd33
Copy link
Author

gregd33 commented Dec 14, 2022

Yes I agree that it was an artifact of our proxy.

Thanks so much for responding to this quickly!

@dirkkul
Copy link
Collaborator

dirkkul commented Dec 15, 2022

You're welcome! There was a tiny change during review with the Json_Exception, would be great if you could double check, but I tested with simplejson installed and without

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants