sampan-s-nayak (Contributor) commented Oct 17, 2025

Description

Builds on top of #58047. This PR ensures the following when auth_mode is token:
- Calling ray.init() (without passing an existing cluster address): check whether a token is present; if not, generate one and store it at the default path.
- Calling ray.init(address="xyz") (connecting to an existing cluster): check whether a token is present; if not, raise an exception.
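
A minimal sketch of the two paths described above. The helper name `ensure_token`, the caller-supplied `token_path`, and the `RuntimeError` are assumptions made for illustration and are not taken from this PR's code.

```python
import uuid
from pathlib import Path
from typing import Optional


def ensure_token(address: Optional[str], token_path: Path) -> str:
    """Hypothetical helper: return the token, generating it only for a new local cluster."""
    # An existing, non-empty token is reused in both cases.
    if token_path.is_file():
        token = token_path.read_text().strip()
        if token:
            return token

    # Connecting to an existing cluster: a missing token is an error.
    if address is not None:
        raise RuntimeError(
            f"Token auth is enabled but no token was found for cluster {address!r}."
        )

    # Starting a new local cluster: generate a token and store it.
    token = uuid.uuid4().hex
    token_path.parent.mkdir(parents=True, exist_ok=True)
    token_path.write_text(token)
    return token
```

With this shape, the caller only has to pass along whatever `address` it received; the helper decides between generating a token and failing early.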

sampan added 13 commits October 16, 2025 08:35
Signed-off-by: sampan <sampan@anyscale.com>
- Created RayAuthTokenLoader singleton class with thread-safe token caching
- Loads tokens from RAY_AUTH_TOKEN env, RAY_AUTH_TOKEN_PATH, or ~/.ray/auth_token
- Support for token generation with UUID (cross-platform)
- Modified GrpcServer to store and pass auth token to ServerCallImpl
- Updated RPC_SERVICE_HANDLER macros to pass auth token
- GCS server now loads token using RayAuthTokenLoader
- Removed auth_token from RayConfig (now loaded via loader)
- Token precedence: env var -> path env var -> default file path

Signed-off-by: sampan <sampan@anyscale.com>
- Created Python auth_token_loader module with thread-safe token caching
- Loads tokens using the same precedence as C++: RAY_AUTH_TOKEN, RAY_AUTH_TOKEN_PATH, ~/.ray/auth_token
- Added enable_token_auth parameter to ray.init() with auto-generation support
- Added --enable-token-auth flag to ray start CLI (fails if no token found)
- Only pass enable_token_auth flag via system_config, not the token
- Each side (C++/Python) loads tokens independently using their own loaders
- ray.init() auto-generates token if not found, ray start fails with helpful error
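
The precedence and caching described in this commit message could look roughly like the sketch below. The function names (`load_token`, `reset_cache`) and the `default_path` parameter are hypothetical, and falling through to the next source when a file is missing or empty is an assumption, not confirmed behavior of the actual loader.

```python
import os
import threading
from pathlib import Path
from typing import Optional

_lock = threading.Lock()
_cached_token: Optional[str] = None


def _read_token_file(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if unreadable or empty."""
    try:
        token = path.read_text().strip()
    except OSError:
        return None
    return token or None


def load_token(default_path: Optional[Path] = None) -> Optional[str]:
    """Hypothetical loader: env var -> path env var -> default file path."""
    global _cached_token
    with _lock:  # serialize concurrent callers so the lookup runs once
        if _cached_token is not None:
            return _cached_token
        # 1. Token passed directly in the environment.
        token = os.environ.get("RAY_AUTH_TOKEN", "").strip() or None
        # 2. Token file named by the path environment variable.
        if token is None:
            path_env = os.environ.get("RAY_AUTH_TOKEN_PATH", "").strip()
            if path_env:
                token = _read_token_file(Path(path_env))
        # 3. Default location.
        if token is None:
            fallback = default_path or Path.home() / ".ray" / "auth_token"
            token = _read_token_file(fallback)
        _cached_token = token
        return token


def reset_cache() -> None:
    """Drop the cached token (useful in tests)."""
    global _cached_token
    with _lock:
        _cached_token = None
```

Only a found token is cached here, so a later call can still pick up a token that was created after the first lookup.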

Signed-off-by: sampan <sampan@anyscale.com>
- Test token loading from RAY_AUTH_TOKEN environment variable
- Test token loading from RAY_AUTH_TOKEN_PATH file
- Test token loading from default ~/.ray/auth_token path
- Test precedence order (env var > path env var > default file)
- Test token generation with GetToken(true)
- Test token caching behavior
- Test thread safety with concurrent GetToken calls
- Test whitespace trimming from token files
- Test behavior when no token is found

Signed-off-by: sampan <sampan@anyscale.com>
- Test token loading from RAY_AUTH_TOKEN environment variable
- Test token loading from RAY_AUTH_TOKEN_PATH file
- Test token loading from default ~/.ray/auth_token path
- Test precedence order (env var > path env var > default file)
- Test token generation with generate_if_not_found=True
- Test token caching behavior across multiple calls
- Test has_auth_token() function
- Test thread safety with concurrent loads and generation
- Test whitespace handling and empty values
- Test file permissions on Unix systems (0600)
- Test error handling for permission errors
- Test integration with fixtures and cleanup
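
As an illustration of what the 0600 permission test would exercise, here is a sketch of writing a generated token so that only the owner can read it. `write_token_file` is a hypothetical name, and the real module may create the file differently.

```python
import os
import uuid
from pathlib import Path


def write_token_file(path: Path) -> str:
    """Hypothetical helper: generate a UUID-based token and store it with mode 0600."""
    token = uuid.uuid4().hex
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with restrictive permissions from the start rather
    # than writing it world-readable and tightening it afterwards.
    fd = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
    # If the file already existed, os.open keeps its old mode, so reset it.
    os.chmod(path, 0o600)
    return token
```

On Windows, `os.chmod` only toggles the read-only flag, which is presumably why the test above is limited to Unix systems.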

Signed-off-by: sampan <sampan@anyscale.com>
@sampan-s-nayak sampan-s-nayak changed the title from "[Core] Authentication for ray core rpc calls - part 1" to "[Core] Authentication for ray core rpc calls - part 2" Oct 17, 2025
sampan and others added 15 commits October 17, 2025 07:51
Signed-off-by: sampan <sampan@anyscale.com>
sampan and others added 5 commits October 27, 2025 02:59
Signed-off-by: sampan <sampan@anyscale.com>
@sampan-s-nayak sampan-s-nayak marked this pull request as ready for review October 27, 2025 06:29
@sampan-s-nayak sampan-s-nayak requested a review from a team as a code owner October 27, 2025 06:29
@ray-gardener ray-gardener bot added the core Issues that should be addressed in Ray Core label Oct 27, 2025
@sampan-s-nayak (Contributor Author)

@edoakes addressed comments

raise


def _get_default_token_path() -> Path:
Collaborator

that works for me

Comment on lines 66 to 67
is_new_cluster: Set to True if you're starting a new local cluster, or False if you're connecting
to an existing cluster.
Collaborator

This is an abstraction leak and also not quite correct, because you might be calling ray start --head, which starts a new cluster but should still error. Let's just call it what it is: generate_token_if_not_found

Contributor Author

In that case, let me name this validate_token_exists(bool generate_if_not_exists=False). Depending on the case, the caller can then set generate_if_not_exists to True or False.

Contributor Author

ChatGPT suggested ensure_token_if_auth_enabled, which I feel is even clearer.

Contributor Author

The main purpose is to verify that a token is present (generating a new one in certain cases) and to fail early if it is not. We also validate that auth mode is enabled through the environment variable and not through the system config.
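
The second check mentioned here, that auth mode must come from the environment rather than the system config, might be sketched as follows. The function name `token_auth_enabled`, the `auth_mode` system-config key, and the `ValueError` are assumptions for illustration; only the `RAY_auth_mode` variable name and the `token` value appear in this PR.

```python
import os
from typing import Any, Dict, Optional


def token_auth_enabled(system_config: Optional[Dict[str, Any]] = None) -> bool:
    """Hypothetical check: True if token auth is enabled via the environment.

    Raises ValueError if the caller tried to set auth_mode through the
    system config instead.
    """
    if system_config and "auth_mode" in system_config:
        raise ValueError(
            "auth_mode must be set through the RAY_auth_mode environment "
            "variable, not through the system config."
        )
    return os.environ.get("RAY_auth_mode") == "token"
```

Running this before any token lookup keeps the fail-early behavior: a misconfigured caller gets an error before a token is generated or a connection is attempted.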

Comment on lines +22 to +31
env_vars_to_clean = [
    "RAY_AUTH_TOKEN",
    "RAY_AUTH_TOKEN_PATH",
    "RAY_auth_mode",
]
original_values = {}
for var in env_vars_to_clean:
    original_values[var] = os.environ.get(var)
    if var in os.environ:
        del os.environ[var]
Collaborator

Use monkeypatch for the env vars here instead of reimplementing the save-and-restore logic.

Contributor Author

This causes issues because the env vars are also used by the Ray cluster.
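
If the manual approach is kept, one way to make the save-and-restore from the snippet under review harder to get wrong is a small context manager. This is only a sketch: `clean_auth_env` is a hypothetical name and is not part of the PR.

```python
import os
from contextlib import contextmanager
from typing import Iterator

ENV_VARS_TO_CLEAN = [
    "RAY_AUTH_TOKEN",
    "RAY_AUTH_TOKEN_PATH",
    "RAY_auth_mode",
]


@contextmanager
def clean_auth_env() -> Iterator[None]:
    """Hypothetical helper: clear the auth env vars, then restore them on exit."""
    # Remove each variable and remember its previous value (None if unset).
    original = {var: os.environ.pop(var, None) for var in ENV_VARS_TO_CLEAN}
    try:
        yield
    finally:
        for var, value in original.items():
            if value is None:
                # The variable was unset before; drop anything the test set.
                os.environ.pop(var, None)
            else:
                os.environ[var] = value
```

A test body would run inside `with clean_auth_env():`, and the `finally` block restores the environment even if the test raises.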

Base automatically changed from token_auth_2 to master October 28, 2025 00:33
sampan and others added 2 commits October 28, 2025 08:50
Signed-off-by: sampan <sampan@anyscale.com>
edoakes (Collaborator) commented Oct 28, 2025

test failures

sampan added 3 commits October 28, 2025 15:54
Signed-off-by: sampan <sampan@anyscale.com>
@edoakes edoakes merged commit c0f3ee6 into master Oct 30, 2025
6 checks passed
@edoakes edoakes deleted the grpc_auth_2 branch October 30, 2025 13:54
YoussefEssDS pushed a commit to YoussefEssDS/ray that referenced this pull request Nov 8, 2025
…ct#57835)

## Description
Builds on top of ray-project#58047. This PR ensures the following when `auth_mode` is `token`:
- Calling `ray.init()` (without passing an existing cluster address): check whether a token is present; if not, generate one and store it at the default path.
- Calling `ray.init(address="xyz")` (connecting to an existing cluster): check whether a token is present; if not, raise an exception.

---------

Signed-off-by: sampan <sampan@anyscale.com>
Signed-off-by: Sampan S Nayak <sampansnayak2@gmail.com>
Co-authored-by: sampan <sampan@anyscale.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Signed-off-by: Aydin Abiar <aydin@anyscale.com>