[Core] handling auth_mode=token in python ray.init() calls #57835
Signed-off-by: sampan <sampan@anyscale.com>
- Created RayAuthTokenLoader singleton class with thread-safe token caching
- Loads tokens from RAY_AUTH_TOKEN env, RAY_AUTH_TOKEN_PATH, or ~/.ray/auth_token
- Support for token generation with UUID (cross-platform)
- Modified GrpcServer to store and pass auth token to ServerCallImpl
- Updated RPC_SERVICE_HANDLER macros to pass auth token
- GCS server now loads token using RayAuthTokenLoader
- Removed auth_token from RayConfig (now loaded via loader)
- Token precedence: env var -> path env var -> default file path
Signed-off-by: sampan <sampan@anyscale.com>
- Created Python auth_token_loader module with thread-safe token caching
- Loads tokens from same precedence as C++: RAY_AUTH_TOKEN, RAY_AUTH_TOKEN_PATH, ~/.ray/auth_token
- Added enable_token_auth parameter to ray.init() with auto-generation support
- Added --enable-token-auth flag to ray start CLI (fails if no token found)
- Only pass enable_token_auth flag via system_config, not the token
- Each side (C++/Python) loads tokens independently using their own loaders
- ray.init() auto-generates token if not found, ray start fails with helpful error
Signed-off-by: sampan <sampan@anyscale.com>
- Test token loading from RAY_AUTH_TOKEN environment variable
- Test token loading from RAY_AUTH_TOKEN_PATH file
- Test token loading from default ~/.ray/auth_token path
- Test precedence order (env var > path env var > default file)
- Test token generation with GetToken(true)
- Test token caching behavior
- Test thread safety with concurrent GetToken calls
- Test whitespace trimming from token files
- Test behavior when no token is found
Signed-off-by: sampan <sampan@anyscale.com>
- Test token loading from RAY_AUTH_TOKEN environment variable
- Test token loading from RAY_AUTH_TOKEN_PATH file
- Test token loading from default ~/.ray/auth_token path
- Test precedence order (env var > path env var > default file)
- Test token generation with generate_if_not_found=True
- Test token caching behavior across multiple calls
- Test has_auth_token() function
- Test thread safety with concurrent loads and generation
- Test whitespace handling and empty values
- Test file permissions on Unix systems (0600)
- Test error handling for permission errors
- Test integration with fixtures and cleanup
Signed-off-by: sampan <sampan@anyscale.com>
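The precedence and caching behavior described in the commit messages above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the PR: the names `load_auth_token`, `reset_token_cache`, and the private helpers are hypothetical, and falling through to the default file when `RAY_AUTH_TOKEN_PATH` points at a missing file is an assumption. Only the environment variable names, the default path, the UUID-based generation, and the 0600 file mode come from the commit messages.

```python
import os
import threading
import uuid
from pathlib import Path
from typing import Optional

# Hypothetical module-level cache guarded by a lock (names are illustrative).
_lock = threading.Lock()
_cached_token: Optional[str] = None


def _default_token_path() -> Path:
    # Default location named in the commit messages: ~/.ray/auth_token
    return Path.home() / ".ray" / "auth_token"


def _read_token_file(path: Path) -> Optional[str]:
    # Trim surrounding whitespace; treat a missing or empty file as "no token".
    try:
        token = path.read_text().strip()
    except OSError:
        return None
    return token or None


def reset_token_cache() -> None:
    # Test helper: forget the cached token.
    global _cached_token
    with _lock:
        _cached_token = None


def load_auth_token(generate_if_not_found: bool = False) -> Optional[str]:
    """Resolve a token: env var > path env var > default file > (optional) generate."""
    global _cached_token
    with _lock:
        if _cached_token is not None:
            return _cached_token

        # 1. Token passed directly through the environment.
        token = os.environ.get("RAY_AUTH_TOKEN", "").strip() or None

        # 2. Token file whose location is given through the environment.
        if token is None:
            path_env = os.environ.get("RAY_AUTH_TOKEN_PATH", "").strip()
            if path_env:
                token = _read_token_file(Path(path_env))

        # 3. Default token file (assumption: also tried if step 2 found nothing).
        if token is None:
            token = _read_token_file(_default_token_path())

        # 4. Optionally generate a token and store it in the default path.
        if token is None and generate_if_not_found:
            token = uuid.uuid4().hex
            path = _default_token_path()
            path.parent.mkdir(parents=True, exist_ok=True)
            # Request owner-only permissions (0600) when the file is created.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            with os.fdopen(fd, "w") as f:
                f.write(token)

        _cached_token = token
        return token
```

Because the result is cached, a change to the environment after the first successful load is not picked up until the cache is reset, which is why the tests listed above exercise caching explicitly.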
c804ed5 to 7bde811
@edoakes addressed comments
    def _get_default_token_path() -> Path:
that works for me
    is_new_cluster: Set to True if you're starting a new local cluster, or False if you're connecting
        to an existing cluster.
This is an abstraction leak, and it's also not quite correct: you might be calling `ray start --head`, which starts a new cluster but should still error. Let's just call it what it is: `generate_token_if_not_found`.
In that case, let me name this `validate_token_exists(bool generate_if_not_exists=False)`. Depending on the case, the caller can then set `generate_if_not_exists` to True or False.
ChatGPT suggested `ensure_token_if_auth_enabled`, which I feel is even clearer.
The main purpose is to verify that a token is present (generating a new one in certain cases) and to fail early if it is not. We also validate that the auth mode Ray config is enabled through an env var and not through system config.
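The second check mentioned in this comment, that auth mode must come from the environment rather than from system config, might look roughly like the sketch below. The function name `check_auth_mode_source` and the `"auth_mode"` system-config key are assumptions made for illustration; only the `RAY_auth_mode` environment variable name appears in this PR's test code.

```python
import os
from typing import Any, Dict, Optional


def check_auth_mode_source(system_config: Optional[Dict[str, Any]]) -> bool:
    """Return True if token auth is enabled through the environment.

    Raises ValueError if auth mode is set through system config instead.
    Hypothetical helper: the "auth_mode" key is an assumed name.
    """
    if system_config and "auth_mode" in system_config:
        raise ValueError(
            "auth_mode must be set through the RAY_auth_mode environment "
            "variable, not through system config."
        )
    # Compare case-insensitively after trimming whitespace (an assumption).
    return os.environ.get("RAY_auth_mode", "").strip().lower() == "token"
```

Rejecting the system-config route up front keeps a single source of truth for the setting, so the early failure points at the misconfiguration rather than at a later authentication error.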
    env_vars_to_clean = [
        "RAY_AUTH_TOKEN",
        "RAY_AUTH_TOKEN_PATH",
        "RAY_auth_mode",
    ]
    original_values = {}
    for var in env_vars_to_clean:
        original_values[var] = os.environ.get(var)
        if var in os.environ:
            del os.environ[var]
Use monkeypatch for the env vars here instead of rewriting it.
This causes issues, as the env vars are also used by the Ray cluster.
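For context, the hunk under discussion shows only the save-and-clear half of the pattern. A standard-library sketch of the whole thing, including the restore step, is given below. The `cleared_env` context manager is a hypothetical helper, not code from the PR; pytest's `monkeypatch.delenv(name, raising=False)` is the fixture-based equivalent the reviewer is pointing at.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator, Optional, Sequence

# Variable names taken from the test hunk above.
AUTH_ENV_VARS = ("RAY_AUTH_TOKEN", "RAY_AUTH_TOKEN_PATH", "RAY_auth_mode")


@contextmanager
def cleared_env(names: Sequence[str] = AUTH_ENV_VARS) -> Iterator[None]:
    """Temporarily remove the given env vars, restoring them on exit."""
    # Remember the original value (or None if unset) while removing each var.
    original: Dict[str, Optional[str]] = {
        name: os.environ.pop(name, None) for name in names
    }
    try:
        yield
    finally:
        for name, value in original.items():
            if value is None:
                # Was unset before: drop anything the body may have set.
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Either approach changes only the current process's environment; processes that were already started keep whatever values they inherited.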
test failures
…ct#57835)

## Description

Builds atop ray-project#58047. This PR ensures the following when `auth_mode` is `token`:

- Calling `ray.init()` (without passing an existing cluster address): check if a token is present; generate one and store it in the default path if not.
- Calling `ray.init(address="xyz")` (connecting to an existing cluster): check if a token is present; raise an exception if not.

---------

Signed-off-by: sampan <sampan@anyscale.com>
Signed-off-by: Sampan S Nayak <sampansnayak2@gmail.com>
Co-authored-by: sampan <sampan@anyscale.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
…ct#57835)

## Description

Builds atop ray-project#58047. This PR ensures the following when `auth_mode` is `token`:

- Calling `ray.init()` (without passing an existing cluster address): check if a token is present; generate one and store it in the default path if not.
- Calling `ray.init(address="xyz")` (connecting to an existing cluster): check if a token is present; raise an exception if not.

---------

Signed-off-by: sampan <sampan@anyscale.com>
Signed-off-by: Sampan S Nayak <sampansnayak2@gmail.com>
Co-authored-by: sampan <sampan@anyscale.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Description

Builds atop #58047. This PR ensures the following when `auth_mode` is `token`:

- Calling `ray.init()` (without passing an existing cluster address): check if a token is present; generate one and store it in the default path if not.
- Calling `ray.init(address="xyz")` (connecting to an existing cluster): check if a token is present; raise an exception if not.
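The two cases in the description reduce to one decision: whether a missing token may be generated. A minimal sketch of that decision follows. The name `ensure_token_if_auth_enabled` comes from the review discussion, but the signature, the injected `load_token` callable, and the error type are assumptions for illustration, not the PR's actual implementation.

```python
from typing import Callable, Optional


def ensure_token_if_auth_enabled(
    auth_mode: str,
    address: Optional[str],
    load_token: Callable[[bool], Optional[str]],
) -> Optional[str]:
    """Verify a token exists when auth_mode is "token"; fail early otherwise.

    `load_token(generate_if_not_found)` is a hypothetical loader that returns
    the token, or None if no token was found and generation was not allowed.
    """
    if auth_mode != "token":
        return None  # Token auth is off; nothing to check.

    # No address means a new local cluster, so a token may be generated.
    # With an address we are connecting to an existing cluster, whose token
    # must already be available locally.
    generate_if_not_found = address is None
    token = load_token(generate_if_not_found)
    if token is None:
        raise RuntimeError(
            "auth_mode is 'token' but no auth token was found; provide one "
            "before connecting to an existing cluster."
        )
    return token
```

Passing the loader in as a parameter keeps the sketch free of file-system side effects. As the review thread notes, the caller decides the flag: `ray start --head` also starts a new cluster but is expected to error rather than generate.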