Optimize remote commands through StreamFlowPath
This commit relies on the new `StreamFlowPath` abstraction to redirect
file-based commands to the lowest possible `ExecutionLocation` in the
wrapping hierarchy, in order to meet a `local` location whenever
possible. The main benefit of this strategy is that `local` locations
support Python-based commands, which are much faster than shell-based
remote processes.
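
The redirection described above can be sketched as a walk down the wrapping chain. This is a minimal illustration under stated assumptions, not StreamFlow's implementation: `Location`, `resolve`, and `exists` are hypothetical names, and the sketch ignores any path translation between a wrapper and the location it wraps.

```python
from __future__ import annotations

import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical stand-in for an execution location, not StreamFlow's class."""

    name: str
    local: bool = False
    wraps: Location | None = None  # the location this one is layered on top of


def resolve(location: Location) -> Location:
    """Walk down the wrapping chain and return the first local location.

    Falls back to the original location when no local one is found.
    """
    current: Location | None = location
    while current is not None:
        if current.local:
            return current
        current = current.wraps
    return location


def exists(path: str, location: Location) -> str:
    """Describe how an existence check would be dispatched (assumed behavior)."""
    target = resolve(location)
    if target.local:
        # Local target: an in-process Python call, no shell round trip.
        return f"python:{os.path.exists(path)}"
    # Remote target: fall back to a shell command on that location.
    return f"shell:test -e {path} @ {target.name}"
```

With `host = Location("host", local=True)` and `container = Location("container", wraps=host)`, `resolve(container)` returns `host`, so the check runs in-process. A location with no local ancestor is returned unchanged and keeps the shell-based path.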
GlassOfWhiskey committed Dec 15, 2024
1 parent f83c0cb commit be41a10
Showing 3 changed files with 278 additions and 201 deletions.
13 changes: 1 addition & 12 deletions streamflow/data/manager.py
@@ -137,7 +137,7 @@ def get(
[
loc
for loc in locations
- if not (data_type and loc.data_type != data_type)
+ if not (data_type is not None and loc.data_type != data_type)
]
)
return result
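
The switch from a truthiness test to an explicit `is not None` test in this hunk matters whenever a valid filter value can be falsy. The sketch below illustrates the difference with a hypothetical `Kind` enum whose first member is `0`. Whether StreamFlow's `DataType` can actually be falsy is not shown in this diff, so treat this as an illustration of the general pattern only.

```python
from enum import IntEnum


class Kind(IntEnum):
    """Hypothetical filter type whose first member has the falsy value 0."""

    PRIMARY = 0
    SYMBOLIC_LINK = 1


def keep_truthy(items, kind=None):
    # Old-style check: a falsy filter (such as Kind.PRIMARY == 0) is ignored,
    # so every item passes through.
    return [i for i in items if not (kind and i != kind)]


def keep_not_none(items, kind=None):
    # New-style check: only None disables the filter.
    return [i for i in items if not (kind is not None and i != kind)]


items = [Kind.PRIMARY, Kind.SYMBOLIC_LINK]
print(keep_truthy(items, Kind.PRIMARY))    # filter silently skipped
print(keep_not_none(items, Kind.PRIMARY))  # filter applied
```

Both functions behave identically when the filter is `None` or a truthy member. They differ only for a falsy member, which the explicit check handles correctly.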
@@ -337,17 +337,6 @@ async def transfer_data(
)
)
# Follow symlink for source path
- await asyncio.gather(
- *(
- asyncio.create_task(src_data_loc.available.wait())
- for src_data_loc in self.get_data_locations(
- path=src_path,
- deployment=src_connector.deployment_name,
- location_name=src_location.name,
- data_type=DataType.PRIMARY,
- )
- )
- )
if (
src_realpath := await StreamFlowPath(
src_path, context=self.context, location=src_location