@richardliaw
Contributor

Shows users how to use `download` to download from URI tables.

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
@richardliaw richardliaw requested a review from a team as a code owner December 13, 2025 00:30
@richardliaw richardliaw added the docs (An issue or change related to documentation) and go (add ONLY when ready to merge, run all tests) labels Dec 13, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces documentation for the download expression in Ray Data, which is a valuable addition. The examples are clear and helpful for users looking to download data from URIs within their datasets. I've identified a couple of minor areas for improvement in the code examples to enhance clarity by removing unused imports. The other changes in this PR, which correct file paths, are accurate and necessary.

richardliaw and others added 2 commits December 15, 2025 10:18
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
@bveeramani bveeramani enabled auto-merge (squash) December 15, 2025 18:52
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
@github-actions github-actions bot disabled auto-merge December 15, 2025 21:04
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

  NUM_GPU_NODES = 8
- INPUT_PATH = "s3://anonymous@ray-example-data/imagenet/metadata_file"
+ INPUT_PATH = "s3://anonymous@ray-example-data/imagenet/metadata_file.parquet"
Bug: Inconsistent path update between benchmark comparison scripts

The INPUT_PATH in ray_data_main.py was updated to include .parquet, but the companion benchmark file daft_main.py in the same directory still uses the old path s3://anonymous@ray-example-data/imagenet/metadata_file without the extension. These two files are meant to compare Ray Data vs Daft performance on the same image classification workload, so they need to read from the same data path. This inconsistency will cause either one benchmark to fail (if only one path exists) or the benchmarks to read from different datasets, making comparisons invalid.
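
One way to catch this kind of drift is a small check that both scripts assign the same `INPUT_PATH`. The sketch below is only illustrative: the helper names `read_input_path` and `paths_match` are hypothetical, the script contents are inlined as strings rather than read from the repository, and it assumes `daft_main.py` also stores its path in a module-level variable named `INPUT_PATH`.

```python
import ast
from typing import Optional


def read_input_path(source: str, name: str = "INPUT_PATH") -> Optional[str]:
    """Return the last string literal assigned to `name` at module level."""
    value = None
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            is_match = isinstance(target, ast.Name) and target.id == name
            is_str = isinstance(node.value, ast.Constant) and isinstance(
                node.value.value, str
            )
            if is_match and is_str:
                value = node.value.value
    return value


def paths_match(*sources: str) -> bool:
    """True only if every source defines the same, non-missing path."""
    paths = {read_input_path(src) for src in sources}
    return len(paths) == 1 and None not in paths


# Stand-ins for the two benchmark scripts (contents assumed for illustration).
ray_src = (
    "NUM_GPU_NODES = 8\n"
    'INPUT_PATH = "s3://anonymous@ray-example-data/imagenet/metadata_file.parquet"\n'
)
daft_src = 'INPUT_PATH = "s3://anonymous@ray-example-data/imagenet/metadata_file"\n'

if not paths_match(ray_src, daft_src):
    print("INPUT_PATH differs between the benchmark scripts")
```

In a real check the two strings would come from reading `ray_data_main.py` and `daft_main.py`. Importing the path from a single shared module would avoid the duplication altogether.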

@richardliaw richardliaw merged commit e0049dc into ray-project:master Dec 16, 2025
6 checks passed
kriyanshii pushed a commit to kriyanshii/ray that referenced this pull request Dec 16, 2025
…-project#59417)

Shows users how to use `download` to download from URI tables.

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: kriyanshii <kriyanshishah06@gmail.com>
cszhu pushed a commit that referenced this pull request Dec 17, 2025
)

Shows users how to use `download` to download from URI tables.

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Yicheng-Lu-llll pushed a commit to Yicheng-Lu-llll/ray that referenced this pull request Dec 22, 2025
…-project#59417)

Shows users how to use `download` to download from URI tables.

---------

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

2 participants