Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operation output to S3 based on env var config #103

Merged
merged 2 commits into from
Jan 23, 2025

Conversation

caponetto
Copy link

@caponetto caponetto commented Jan 20, 2025

Move towards a more similar codebase between this fork and upstream.

Discarding these changes made on ODH only:

In favor of these changes made on upstream:

Please note that the upstream version does not include log boundaries like

logger.info("----------------------Python logs start----------------------")
logger.info("----------------------Python logs ends----------------------")

Validation

Run a pipeline with the following Python script:

a = 10
b = 0
c = a/b

The pipeline will fail and the "divide by zero" error will be logged on the pod logs:

Screenshot 2025-01-20 at 13 03 52

On the other hand, if you run a pipeline with the following Python script:

try:
    a = 10
    b = 0
    c = a/b
except ZeroDivisionError:
    print("Cannot divide by zero!")

The pipeline will succeed and the catch message will be logged on the pod logs:

[I 16:14:51.351] Processing file: untitled.py
[I 16:14:51.362] Output: Cannot divide by zero!
[I 16:14:51.363] Return code: 0

RFileOp and PythonFileOp log stdout and stderr output to stdout
so it appears in system logs, catch non zero return/exit code in exception,
no logging of output to file in S3 COS if env var
ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3 is set to false

Signed-off-by: shalberd <21118431+shalberd@users.noreply.github.com>
Signed-off-by: shalberd <21118431+shalberd@users.noreply.github.com>
@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2025

Codecov Report

Attention: Patch coverage is 26.08696% with 34 lines in your changes missing coverage. Please review.

Project coverage is 80.52%. Comparing base (95a1a6f) to head (886d92d).

Files with missing lines Patch % Lines
elyra/kfp/bootstrapper.py 20.93% 34 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #103      +/-   ##
==========================================
- Coverage   80.60%   80.52%   -0.09%     
==========================================
  Files         151      151              
  Lines       19401    19424      +23     
  Branches      477      482       +5     
==========================================
+ Hits        15639    15641       +2     
- Misses       3584     3601      +17     
- Partials      178      182       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@caponetto caponetto requested a review from harshad16 January 20, 2025 16:43
@shalberd
Copy link

shalberd commented Jan 22, 2025

Set default value for ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3 to false to keep the current behavior. In a future step, we can set this env var on the notebook level.

Yeah, what I did in our organization: I set this env var to false when building the runtime container image, in the Dockerfile / Containerfile

@harshad16
Copy link
Member

harshad16 commented Jan 22, 2025

/lgtm

Tested on cluster as well:

  1. with exception handling in the code:
    Screenshot from 2025-01-22 16-14-16
  2. Without exception handling in the code:
    image

Let's mark the following, so we can note this in some documentation when we upgrade our notebooks with these upgrade, :

Please note that the upstream version does not include log boundaries like

logger.info("----------------------Python logs start----------------------")
logger.info("----------------------Python logs ends----------------------")

as previously, we included this,
let's document that we removed it, so users if any have designed any automation can adjust it.

Additionally, the env vars: ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3 , we can keep it as upstream, and explicitly mark them as false in notebook images.
https://github.com/opendatahub-io/notebooks/blob/4839f12bafd57ee0c81d1e7e402f758c010f297b/jupyter/datascience/ubi9-python-3.11/setup-elyra.sh#L20

@caponetto caponetto force-pushed the cp-upstream-issue3222 branch from 886d92d to f2f16b3 Compare January 23, 2025 11:25
@caponetto
Copy link
Author

caponetto commented Jan 23, 2025

Thanks for the review @harshad16! I removed the last commit that set the env var to false. Please take a final look.
I'll create a JIRA for post-ODH-Elyra release so we don't forget to do the remaining tasks you've mentioned.

EDIT: Tracking here https://issues.redhat.com/browse/RHOAIENG-18582

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

thanks 💯

@caponetto caponetto merged commit 56832ee into opendatahub-io:main Jan 23, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants