-
Notifications
You must be signed in to change notification settings - Fork 276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Raw Container Task Local Execution #2258
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2258 +/- ##
==========================================
+ Coverage 83.04% 85.69% +2.65%
==========================================
Files 324 20 -304
Lines 24861 1279 -23582
Branches 3547 0 -3547
==========================================
- Hits 20645 1096 -19549
+ Misses 3591 183 -3408
+ Partials 625 0 -625 ☔ View full report in Codecov by Sentry. |
Can we check if docker is installed and then use it please do not add docker dependency |
No problem, will do that. |
cc @eapolinario , @pingsutw Could you please help review? |
cc @wild-endeavor If possible, please take a look too, thank you! |
Question:
|
55549f9
to
e9f9600
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thanks for you contribution! This will be a massive improvement to the local flytekit use cases.
This is shaping up to be a great PR. I left some comments along the way. But can you make sure to write tests where container tasks are involved in workflows where a combination of regular tasks and container tasks?
assert metadata == "[from python rawcontainer]" | ||
except Exception as e: | ||
# Currently, Ubuntu will pass the test, but MacOS and Windows will not | ||
print(f"Skipping test due to Docker environment setup failure: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skip the tests using pytest.mark.skipif
, e.g. . You can combine the expression like @pytest.mark.skipif(sys.platform in ["darwin", "win32"], reason="Skip if running on windows or macos")
.
flytekit/core/container_task.py
Outdated
elif cmd.startswith("{{.inputs.") and cmd.endswith("}}"): | ||
return cmd[len("{{.inputs.") : -len("}}")] | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's use a regex to handle this. It's a little weird that we're using go templates in python, but since we're not going to do any crazy expressions (it'll be mostly used to parse inputs), something like this will suffice:
regex = r"^\{\{\s*\.inputs\.(.*?)\s*\}\}$"
match = re.match(regex, cmd)
if match:
return match.group(1)
Notice that since those are go templates, there might be whitespaces before and after the argument inside {{
and }}
.
flytekit/core/container_task.py
Outdated
if output_type == bool: | ||
output_dict[k] = output_val == "True" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you write a test case to cover this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, will do it
flytekit/core/container_task.py
Outdated
"sh", | ||
"-c", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not all images have sh
installed, right? Why can't we pass the command directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice advice, we should pass the command directly.
flytekit/core/container_task.py
Outdated
} | ||
|
||
# Build the command string | ||
commands = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make commands
an array and use a join (i.e. " ".join(commands)
) at the end to build the actual command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice advice, thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update: will use an array as inputs, this could fix cases we need to split strings.
flytekit/core/container_task.py
Outdated
# Wait for the container to finish the task | ||
container.wait() | ||
container.stop() | ||
container.remove() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use remove=True
in the invocation of run
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice advice, thank you
flytekit/core/container_task.py
Outdated
) | ||
# Wait for the container to finish the task | ||
container.wait() | ||
container.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not neede as the previous call (i.e. the call to wait
) only returns after the container stops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch, thank you
Really great advice! |
These are my updates:
Thank you very much! |
Hi, @eapolinario |
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
3bc96d2
to
32546b4
Compare
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Hi, @eapolinario , I've implemented your suggestions:
Note: Despite If it looks good to you, can you help approve it? |
flytekit/core/container_task.py
Outdated
cmd_and_args += self._args | ||
|
||
for cmd in cmd_and_args: | ||
k = self._get_key_from_cmd(cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we remove _get_key_from_cmd
here, and add another function render_command
?
def render_command(cmd: str):
for k, v in self.interface.inputs.items():
if type(native_inputs[k]) in [FlyteFile, FlyteDirectory]:
...
else:
cmd = cmd.replace("{{.inputs." + k + "}}", str(native_inputs[k]))
command = [render_command(cmd) for cmd in cmd_and_args]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing advice, thank you.
Signed-off-by: Future-Outlier <eric901201@gmail.com> Co-authored-by: Kevin Su <pingsutw@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So clean. Thank you!
Thank you both too😭😭😭 |
* init Signed-off-by: Future-Outlier <eric901201@gmail.com> * v1 Signed-off-by: Future-Outlier <eric901201@gmail.com> * argurments bug fixed and add log when pulling image Signed-off-by: Future-Outlier <eric901201@gmail.com> * change v to k and handle boolean special case Signed-off-by: Future-Outlier <eric901201@gmail.com> * support blob type and datetime Signed-off-by: Future-Outlier <eric901201@gmail.com> * add unit tests Signed-off-by: Future-Outlier <eric901201@gmail.com> * add exception Signed-off-by: Future-Outlier <eric901201@gmail.com> * nit Signed-off-by: Future-Outlier <eric901201@gmail.com> * fix test Signed-off-by: Future-Outlier <eric901201@gmail.com> * update for flytefile and flytedirectory Signed-off-by: Future-Outlier <eric901201@gmail.com> * support both file paths and template inputs Signed-off-by: Future-Outlier <eric901201@gmail.com> * pytest use sys platform to handle macos and windows case and support regex to parse the input Signed-off-by: Future-Outlier <eric901201@gmail.com> * support datetime.timedelta Signed-off-by: Future-Outlier <eric901201@gmail.com> * lint Signed-off-by: Future-Outlier <eric901201@gmail.com> * add tests and change boolean logic Signed-off-by: Future-Outlier <eric901201@gmail.com> * support Signed-off-by: Future-Outlier <eric901201@gmail.com> * change annotations Signed-off-by: Future-Outlier <eric901201@gmail.com> * add flytefile and flytedir tests Signed-off-by: Future-Outlier <eric901201@gmail.com> * lint Signed-off-by: Future-Outlier <eric901201@gmail.com> * add more tests Signed-off-by: Future-Outlier <eric901201@gmail.com> * lint Signed-off-by: Future-Outlier <eric901201@gmail.com> * change image name Signed-off-by: Future-Outlier <eric901201@gmail.com> * Update pingsu's advice Signed-off-by: Future-Outlier <eric901201@gmail.com> Co-authored-by: Kevin Su <pingsutw@gmail.com> * add docker in dev-requirement Signed-off-by: Future-Outlier <eric901201@gmail.com> * refactor execution Signed-off-by: Future-Outlier <eric901201@gmail.com> * use render pattern Signed-off-by: Future-Outlier <eric901201@gmail.com> * add back container task object in test Signed-off-by: Future-Outlier <eric901201@gmail.com> * refactor output in container task execution Signed-off-by: Future-Outlier <eric901201@gmail.com> * update pingsu's render input advice Signed-off-by: Future-Outlier <eric901201@gmail.com> * update tests Signed-off-by: Future-Outlier <eric901201@gmail.com> * add LiteralMap TypeHints Signed-off-by: Future-Outlier <eric901201@gmail.com> * update dev-req Signed-off-by: Future-Outlier <eric901201@gmail.com> --------- Signed-off-by: Future-Outlier <eric901201@gmail.com> Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Jan Fiedler <jan@union.ai>
Tracking issue
flyteorg/flyte#3876
Why are the changes needed?
We want to support users to run
raw-container
task locally.Note: I didn't support
,datetime.timedelta
List
andDict
.Since I found that copilot has bug to deal with that, so I will support them until those bugs are fixed.
(This implies that we have to fix flytecopilot for nested types and collections types first)
What changes were proposed in this pull request?
local_execute
function inclass ContainerTask(PythonTask)
docker
python library to execute docker command4. I will strip white space because it will be too complicated to support string with white space.(datetime.datetime will be affected)however, in remote cases, flytecopilot will fail, so I think I can support it locally first, and create an issue to support remote cases.direct file paths
andtemplate-style references
as inputs.For example,
{{.inputs.infile}}
andis valid inputs./var/inputs/infile
both areHow was this patch tested?
Reference: https://docs.flyte.org/en/latest/user_guide/customizing_dependencies/raw_containers.html#raw-container
Note:
julia
image has error when using arm64, I've tested it bydocker run -it
into the container and found this error.FlyteFile
task with input and outputFlyteFile
task with output onlyFlyteDirectory
task with input and outputFlyteDirectory
task with output onlyNote: I've provided an image on docker hub, you can switch to this branch and test it directly.
image:
futureoutlier/rawcontainer:0320
Setup process
git clone https://github.com/flyteorg/flytekit gh pr checkout 2258 make setup pip install -e .
python example:
Dockerfile
write_flytefile.py
write_flytedir.py
return_same_value.py
return_flytefile.py
return_flytedir.py
Screenshots
Support documentation example
All examples above
Check all the applicable boxes
Related PRs
#1745