Move convert-string into component setup method (#871)
Dask config needs to be set before starting the `LocalCluster`. I first
put it into the `DaskDataIO` since it's central to the working of
Fondant that this is set correctly, and I didn't want this to be
overwritten. That's too late in our flow though. The other place we
could put it is in the executor, but we currently try to keep that
Dask-agnostic.
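
The ordering constraint described above can be sketched as follows. This is a minimal illustration, not Fondant's actual code: `configure_dask` is a hypothetical helper name, and the two config keys are the ones touched in this commit. The cluster start is left commented out so the sketch runs with only `dask` installed.

```python
import dask


def configure_dask() -> None:
    """Hypothetical helper: apply the two settings from this commit's setup()."""
    # Don't let Dask convert object columns to Arrow-backed strings.
    dask.config.set({"dataframe.convert-string": False})
    # Run worker processes as non-daemon processes.
    dask.config.set({"distributed.worker.daemon": False})


# Configure first ...
configure_dask()

# ... and only afterwards start the cluster (assumes `distributed` is installed):
# from dask.distributed import Client, LocalCluster
# client = Client(LocalCluster())

print(dask.config.get("dataframe.convert-string"))
print(dask.config.get("distributed.worker.daemon"))
```

Setting the config at the start of `setup()` keeps the executor free of Dask-specific code while still ensuring the values are in place before any cluster is created.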
RobbeSneyders authored Feb 21, 2024
1 parent 5a4a64c commit e74e765
Showing 2 changed files with 3 additions and 5 deletions.
3 changes: 3 additions & 0 deletions src/fondant/component/component.py
@@ -47,6 +47,9 @@ def __init__(self, **kwargs):
         super().__init__()
 
     def setup(self) -> t.Any:
+        # Don't assume every object is a string
+        # https://docs.dask.org/en/stable/changelog.html#v2023-7-1
+        dask.config.set({"dataframe.convert-string": False})
         # worker.daemon is set to false because creating a worker process in daemon
         # mode is not possible in our docker container setup.
         dask.config.set({"distributed.worker.daemon": False})
5 changes: 0 additions & 5 deletions src/fondant/component/data_io.py
@@ -3,7 +3,6 @@
 import typing as t
 from collections import defaultdict
 
-import dask
 import dask.dataframe as dd
 from dask.diagnostics import ProgressBar
 from dask.distributed import Client
@@ -31,10 +30,6 @@ def __init__(
         input_partition_rows: t.Optional[int] = None,
     ):
         super().__init__(manifest=manifest, operation_spec=operation_spec)
-        # Don't assume every object is a string
-        # https://docs.dask.org/en/stable/changelog.html#v2023-7-1
-        dask.config.set({"dataframe.convert-string": False})
-
         self.input_partition_rows = input_partition_rows
 
     def partition_loaded_dataframe(self, dataframe: dd.DataFrame) -> dd.DataFrame:
