Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] update token retrieval logic #5541

Merged
merged 23 commits into from
Oct 1, 2024

Conversation

not-lain
Copy link
Contributor

@not-lain not-lain commented Sep 25, 2024

Description

support token retrieval from colab secrets
this PR is inspired by huggingface_hub's get_token function
if token exists in secrets it will prompt the user if they want to access it or not
image

#Fixes : #5543

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (change restructuring the codebase without changing functionality)
  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@not-lain not-lain marked this pull request as ready for review September 25, 2024 15:52
@not-lain
Copy link
Contributor Author

here's a minimalistic colab code to try this PR before merging:

!pip install -e "git+https://github.com/not-lain/argilla@update-token-retrieval#egg=argilla&subdirectory=argilla"

restart session then try this :

from argilla._api._token import get_token
get_token()

@not-lain
Copy link
Contributor Author

cc @frascuchon for review

Copy link
Contributor

@burtenshaw burtenshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. This looks really handy. I added some small comments. Two general things:

  • On naming: Currently, we're referring to our server api credentials as 'credentials' and 'api_key', rather than 'token'. I think this code should use the same naming. There's been confusion between argilla api_key and hf token before.
  • The logic at the moment is to get both or no values (key and url) from userdata and fall back to environ. Do you think there's a use case where the user just wants the key from the userdata?

argilla/src/argilla/_api/_token.py Outdated Show resolved Hide resolved
argilla/src/argilla/_api/_token.py Outdated Show resolved Hide resolved
@burtenshaw burtenshaw changed the title update token retrieval logic [FEATURE] update token retrieval logic Sep 26, 2024
@frascuchon
Copy link
Member

Thanks @not-lain for the contribution. Yesterday I was thinking about this. Since argilla already has the huggingface_hub dependency, why not use the get_token function to configure the token? With this, we keep aligned with other HF packages. What do you think?

@burtenshaw
Copy link
Contributor

not-lain and others added 3 commits September 26, 2024 09:16
Co-authored-by: burtenshaw <ben.burtenshaw@gmail.com>
Co-authored-by: burtenshaw <ben.burtenshaw@gmail.com>
@not-lain
Copy link
Contributor Author

not-lain commented Sep 26, 2024

thanks a lot for the review

On naming: Currently, we're referring to our server api credentials as 'credentials' and 'api_key', rather than 'token'. I think this code should use the same naming. There's been confusion between argilla api_key and hf token before.

thanks for letting me know, will fix this in the next commit.

The logic at the moment is to get both or no values (key and url) from userdata and fall back to environ. Do you think there's a use case where the user just wants the key from the userdata?

I can adapt my implementation to handle this case, I was a bit too skeptical because of the collab thread hanging forever issue (see huggingface/huggingface_hub#1952 ) and I wanted to make something that is robust instead of flexible.
Will try to make it work and commit to this pr once it's ready.

@burtenshaw
Copy link
Contributor

I can adapt my implementation to handle this case, I was a bit too skeptical because of the collab thread hanging forever issue (see huggingface/huggingface_hub#1952 )

@not-lain Understood. I think it's a nice to have anyway, so if it compromises the functionality feel free to ignore the suggestion.

@burtenshaw
Copy link
Contributor

@not-lain @frascuchon We should put this is 2.3.0!

return {"ARGILLA_API_URL": ARGILLA_API_URL, "ARGILLA_API_KEY": ARGILLA_API_KEY}


def _get_token_from_google_colab() -> Optional[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can generalize this function to retrieve a secret from colab, an use it multiple times, instead of hardcoding the expected variables:

_get_secret_from_google_colab("ARGILLA_AP_KEY")
_get_secret_from_google_colab("ARGILLA_API_URL")

We could do something similar to the get_token and the _get_token_from_environment functions:

ARGILLA_API_KEY = get_secret("ARGILLA_API_KEY")
ARGILLA_API_URL = get_secret("ARGILLA_API_URL")

@not-lain
Copy link
Contributor Author

hi @frascuchon I have tried your changes which can be tested via the following code (different branch from this one)

!pip install -e "git+https://github.com/not-lain/argilla@test-colab-lock#egg=argilla&subdirectory=argilla"

which produced some weird behavior when running the following code

from argilla._api._token import get_secret
key = get_secret("ARGILLA_API_KEY")
url = get_secret("ARGILLA_API_URL")

initialize your secrets and turn them off
image
then click [ cancel, grant, cancel ]
as you might have noticed 3 tabs popped up which is quite strange and I might attribute this to the shared global lock

        global _IS_GOOGLE_COLAB_CHECKED

not entirely sure of this but this is what I have discovered so far and I wanted to report my findings

@not-lain
Copy link
Contributor Author

Apologies I have been looking at this from a different angle, colab turns out to handle things a slightly bit differently fromthe the normal way I was looking at this.
after careful testing, it turns out that

from argilla._api._token import get_secret
ARGILLA_API_URL = get_secret("ARGILLA_API_URL") 
ARGILLA_API_KEY = get_secret("ARGILLA_API_KEY")

is DIFFERENT from

import argilla as rg
rg.client._api.APIClient() 

I have no clue how and why this happened, but I will report my findings to colab and let them figure this out.

I will now proceed to merge the changes that @frascuchon has recommended

not-lain and others added 11 commits September 29, 2024 04:39
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
Co-authored-by: Paco Aranda <francisco.aranda@huggingface.co>
@not-lain
Copy link
Contributor Author

This PR should be ready for review, you can test this out by using the following code

  1. install
!pip install -e "git+https://github.com/not-lain/argilla@update-token-retrieval#egg=argilla&subdirectory=argilla"
  1. restart session
  2. run the following code
import argilla as rg
rg.client._api.APIClient()

@frascuchon
Copy link
Member

Thanks @not-lain for taking into account my comments, and apologize for the inconvenience. So, finally, did you manage the errors applying my changes? Does the env loading work as expected? The current code looks nice!

@frascuchon frascuchon added this to the v2.3.0 milestone Sep 30, 2024
@not-lain
Copy link
Contributor Author

Hi @frascuchon, yes I can indeed confirm that everything is working perfectly.

Copy link
Member

@frascuchon frascuchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @not-lain !

@frascuchon
Copy link
Member

@burtenshaw Does it make sense to add it somewhere in the docs?

@burtenshaw
Copy link
Contributor

@burtenshaw Does it make sense to add it somewhere in the docs?

@frascuchon Yeah. I would just add a section to client API reference that mentions support for colab userdata. But let's merge this to develop, and I can open a docs PR.

@frascuchon frascuchon merged commit 3032d3a into argilla-io:develop Oct 1, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

automatically load token from collab secrets if it exists
3 participants