[Data] Deprecate read_parquet_bulk (ray-project#48691)
Users (including Ray Data developers!) are often confused about how to
choose between `read_parquet` and `read_parquet_bulk`. To avoid
confusion, this PR deprecates `read_parquet_bulk`.

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: hjiang <dentinyhao@gmail.com>
bveeramani authored and dentiny committed Dec 7, 2024
1 parent 97152e1 commit 8023878
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions python/ray/data/read_api.py
@@ -83,7 +83,7 @@
 from ray.data.datasource.parquet_meta_provider import ParquetMetadataProvider
 from ray.data.datasource.partitioning import Partitioning
 from ray.types import ObjectRef
-from ray.util.annotations import DeveloperAPI, PublicAPI
+from ray.util.annotations import Deprecated, DeveloperAPI, PublicAPI
 from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy
 
 if TYPE_CHECKING:
@@ -923,7 +923,7 @@ class string
     )
 
 
-@PublicAPI
+@Deprecated
 def read_parquet_bulk(
     paths: Union[str, List[str]],
     *,
@@ -1023,6 +1023,12 @@ def read_parquet_bulk(
     Returns:
         :class:`~ray.data.Dataset` producing records read from the specified paths.
     """
+    warnings.warn(
+        "`read_parquet_bulk` is deprecated and will be removed after May 2025. Use "
+        "`read_parquet` instead.",
+        DeprecationWarning,
+    )
+
     if meta_provider is None:
         meta_provider = FastFileMetadataProvider()
     read_table_args = _resolve_parquet_args(
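
The change above makes `read_parquet_bulk` emit a `DeprecationWarning` on every call. As a rough illustration of how that pattern behaves and how a caller might detect remaining uses, here is a minimal standard-library sketch. The names `read_bulk_legacy`, `read_new`, and `find_deprecated_calls` are hypothetical stand-ins invented for this example; they are not part of Ray, and the bodies do not read any files.

```python
import warnings


def read_new(paths):
    """Hypothetical replacement reader: only normalizes paths to a list."""
    return [paths] if isinstance(paths, str) else list(paths)


def read_bulk_legacy(paths):
    """Hypothetical deprecated reader that mirrors the warning pattern in the diff."""
    warnings.warn(
        "`read_bulk_legacy` is deprecated. Use `read_new` instead.",
        DeprecationWarning,
    )
    # Delegating to the replacement is an assumption of this sketch,
    # not a claim about how Ray implements read_parquet_bulk.
    return read_new(paths)


def find_deprecated_calls(fn, *args):
    """Call fn and return (result, messages of any DeprecationWarning raised)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including categories hidden by default filters.
        warnings.simplefilter("always")
        result = fn(*args)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages
```

Python's default warning filters hide `DeprecationWarning` unless it is attributed to code in `__main__`, so a test suite that wants to catch lingering calls may need to enable the category explicitly, for example with `warnings.simplefilter("error", DeprecationWarning)` or the `-W error::DeprecationWarning` interpreter flag.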
