Closed
Labels
P1 (issue that should be fixed within a few weeks), bug (something that is supposed to be working, but isn't), data (Ray Data-related issues)
Description
What happened + What you expected to happen
When I specify partition_cols in write_parquet, the values of the partition column get URL-encoded in the output paths, which makes sense.
I expected ray.data.read_parquet to decode those values when reading the data back, but it returns them still encoded.
Versions / Dependencies
Ray version: 2.49.2
Reproduction script
import ray
from pathlib import Path
# Two rows whose partition column values contain slashes.
data = [
    {"column_a": "string/with/slashes", "column_b": "hi", "column_c": 1},
    {"column_a": "string/with/slashes2", "column_b": "hi2", "column_c": 2},
]
ds = ray.data.from_items(data)
out_dir = Path.cwd() / "test_directory"
# Write the dataset partitioned by column_a.
ds.write_parquet("file://" + str(out_dir), partition_cols=["column_a"])
# Read it back; column_a should come back with its original value.
data = ray.data.read_parquet("file://" + str(out_dir))
print("this is data: ", data.take(2))
# take() returns a list of rows, so index the first row before the column.
assert (
    data.take(1)[0]["column_a"] == "string/with/slashes"
), "String value not decoded properly"
Stdout
this is data: [{'column_b': 'hi', 'column_c': 1, 'column_a': 'string%2Fwith%2Fslashes'}, {'column_b': 'hi', 'column_c': 1, 'column_a': 'string%2Fwith%2Fslashes'}]
...
AssertionError: String value not decoded properly
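Until the read path decodes partition values, one possible stopgap is to decode them after reading. The sketch below uses only the Python standard library and assumes the values were percent-encoded in the usual URL style, as the `%2F` in the output above suggests. `decode_partition_values` is a hypothetical helper written for this report, not part of Ray.

```python
from urllib.parse import quote, unquote


def decode_partition_values(row, partition_cols):
    """Return a copy of `row` with percent-encoded string values decoded.

    Hypothetical helper, not part of Ray. Only the listed columns are
    touched, and non-string values are left unchanged.
    """
    decoded = dict(row)
    for col in partition_cols:
        value = decoded.get(col)
        if isinstance(value, str):
            decoded[col] = unquote(value)
    return decoded


# Percent-encoding with no "safe" characters turns "/" into "%2F",
# which matches the value seen in the output above.
encoded = quote("string/with/slashes", safe="")

# A row shaped like the ones returned by read_parquet in the reproduction.
row = {"column_b": "hi", "column_c": 1, "column_a": encoded}
fixed = decode_partition_values(row, ["column_a"])
print(fixed["column_a"])  # string/with/slashes
```

With Ray Data, the same function could presumably be applied per row, for example `data.map(lambda r: decode_partition_values(r, ["column_a"]))`. I have not verified that against 2.49.2, and it only works around the problem rather than fixing the reader.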
Issue Severity
Low: It annoys or frustrates me.