Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve/fix glue job logs printing #30886

Merged
merged 6 commits into from
Apr 28, 2023
Merged

improve/fix glue job logs printing #30886

merged 6 commits into from
Apr 28, 2023

Conversation

vandonr-amz
Copy link
Contributor

there was several problems with the current implementation:

  • it was only fetching 1 response from the logs, so if the job was producing a lot of logs, it'd lag further and further back in time (and possibly never catch up)
  • if the job was marked as failed, it was jumping from one stream to an other without warning, and started filtering on exceptions and errors, providing a very weird experience to someone who'd use those to debug
  • upon changing log streams, the continuation token was not reset, so that behavior was not working anyway (you cannot use a continuation token from a stream when reading from another)
  • upon checking the different log streams available (/output and /error), I realized that it was not clear cut at all. A stack trace could go in /output, and some INFO logs were in /error, so I think it has value to have both streams if one wants to understand what's happening.

Regarding the fact that we now display both streams, I hesitated between interleaving messages from both, sorting by timestamp, or leaving them separated. I ended up choosing to have them separated to keep the experience consistent with what the user would see in cloudwatch, but I'd be happy to change that to chronological order if people think it's better.

Also: added it to the system test (for better visibility for users) + added utest

cc @ferruzzi

Copy link
Contributor

@ferruzzi ferruzzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glue logging is a mess, I like this solution. Left a couple comments.

airflow/providers/amazon/aws/hooks/glue.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/hooks/glue.py Show resolved Hide resolved
Copy link
Contributor

@ferruzzi ferruzzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@potiuk potiuk merged commit 1f01749 into apache:main Apr 28, 2023
@IAL32
Copy link
Contributor

IAL32 commented May 23, 2023

Even if this is closed, I would like to make a note on Glue logging.
I realized these issues as well, so I created a helper method to create a logger specifically for Glue jobs, which works for both glueetl and pythonshell jobs:

DEFAULT_LOG_FORMAT = "%(levelname)s:%(name)s:%(module)s:%(message)s"
def get_logger(name: str = None, level: Any = logging.INFO, log_format: str = DEFAULT_LOG_FORMAT) -> logging.Logger:
    """Returns a logger configured for Glue jobs"""
    formatter = logging.Formatter(fmt=log_format)

    # glue sets its own handlers by default, but they suck.

    # this handler redirects INFO, WARNING and DEBUG to sys.stdout
    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.setLevel(logging.DEBUG)
    stdout_handler.addFilter(lambda record: record.levelno < logging.ERROR)
    stdout_handler.setFormatter(formatter)

    # this handler redirects ERROR to sys.stderr
    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setLevel(logging.ERROR)
    stderr_handler.setFormatter(formatter)

    logger = logging.getLogger(name=name)
    logger.handlers.clear()
    logger.setLevel(level)
    logger.addHandler(stdout_handler)
    logger.addHandler(stderr_handler)
    return logger

In effect, this will log all INFO, WARNING and DEBUG to /output and all ERROR to /error.
The solution proposed by this PR still helps if one does not want to have either boilerplate code in their jobs or install internal dependencies.

@vandonr-amz vandonr-amz deleted the vandonr/op branch May 23, 2023 16:26
@ferruzzi
Copy link
Contributor

ferruzzi commented May 23, 2023

@IAL32 Neat. Where does that helper live? Is it added to the script that the job executes?

@IAL32
Copy link
Contributor

IAL32 commented May 23, 2023

@IAL32 Is it get added to the script that the job executes?

Exactly. As far as I know, this is the only way to get logging to work on Glue Jobs.

Also note: when grabbing the root logger (name=None) this also has the effect that all other calls to log will in fact use the handlers defined in the helper method, which I think is neat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants