From 09cf5bba9e8f2e3af4336f0b2b76fb4727b09214 Mon Sep 17 00:00:00 2001
From: Scott Lee
Date: Thu, 17 Oct 2024 18:27:38 -0700
Subject: [PATCH] [Data] Fix logging output from `write_xxx` APIs (#48096)

## Why are these changes needed?

Followup to https://github.com/ray-project/ray/pull/47942. Fixes the formatting of the output logged after a write operation finishes, which incorrectly printed the entire `dataclasses.Field` object instead of the field name.

Previous logging:
```
2024-10-17 17:26:48,396 INFO datasink.py:103 -- Write operation succeeded. Aggregated write results:
	Field(name='num_rows',type=,default=0,default_factory=,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD): 100
	Field(name='size_bytes',type=,default=0,default_factory=,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD): 800
```

Fixed logging:
```
2024-10-17 17:30:52,491 INFO datasink.py:103 -- Write operation succeeded. Aggregated write results:
	- num_rows: 100
	- size_bytes: 800
```

## Related issue number

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

Signed-off-by: Scott Lee
---
 python/ray/data/datasource/datasink.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/ray/data/datasource/datasink.py b/python/ray/data/datasource/datasink.py
index 0832e0539fd1..fe4d4cf4ef9a 100644
--- a/python/ray/data/datasource/datasink.py
+++ b/python/ray/data/datasource/datasink.py
@@ -98,7 +98,7 @@ def on_write_complete(self, write_result_blocks: List[Block]) -> WriteResult:
         aggregated_results_str = ""
         for k in fields(aggregated_write_results.__class__):
             v = getattr(aggregated_write_results, k.name)
-            aggregated_results_str += f"\t{k}: {v}\n"
+            aggregated_results_str += f"\t- {k.name}: {v}\n"
         logger.info(
             f"Write operation succeeded. Aggregated write results:\n"
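
The one-line change in this patch can be reproduced outside Ray. The sketch below is a minimal illustration, not the Ray implementation: `ExampleWriteResult`, `format_results`, and `format_results_buggy` are hypothetical names, and the dataclass only mirrors the two fields shown in the log output. It shows that `dataclasses.fields()` yields `Field` objects, so interpolating the loop variable directly prints the full `Field(...)` repr, while `.name` gives just the field name.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleWriteResult:
    # Hypothetical stand-in for the aggregated write result; the field
    # names are taken from the log output in the PR description.
    num_rows: int = 0
    size_bytes: int = 0


def format_results_buggy(result) -> str:
    """Pre-patch behavior: interpolates the Field object itself."""
    out = ""
    for k in fields(result.__class__):
        v = getattr(result, k.name)
        out += f"\t{k}: {v}\n"  # k is a dataclasses.Field, so its repr is printed
    return out


def format_results(result) -> str:
    """Post-patch behavior: interpolates only the field's name."""
    out = ""
    for k in fields(result.__class__):
        v = getattr(result, k.name)
        out += f"\t- {k.name}: {v}\n"
    return out


if __name__ == "__main__":
    result = ExampleWriteResult(num_rows=100, size_bytes=800)
    print(format_results_buggy(result))
    print(format_results(result))
```

Running the script prints the verbose `Field(...)` lines first and the short `- num_rows: 100` style lines second, which matches the before and after logs quoted in the description.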