Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Deferrable implementation for RedshiftPauseClusterOperator to follow standard #30853

Merged

Conversation

syedahsn
Copy link
Contributor

@syedahsn syedahsn commented Apr 25, 2023

Now that #30032 is merged, we can update the deferrable implementation to use custom aiobotocore waiters , as well as async_conn to simplify deferrable implmentation, and reduce code duplication. The method used here is described in the README.md included in the initial PR.
@pankajastro @phanikumv


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

Add async custom waiter support in get_waiter, and base_waiter.py
Add Deferrable mode to RedshiftCreateClusterOperator
Add RedshiftCreateClusterTrigger and unit test
Add README.md for writing Triggers for AMPP
Fix tests to work with new code
@boring-cyborg boring-cyborg bot added area:providers provider:amazon-aws AWS/Amazon - related issues labels Apr 25, 2023
Copy link
Member

@pankajastro pankajastro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @syedahsn for the cleanup. Is there a way to log some messages when we are waiting using the waiter, otherwise sometimes I as a user feel not sure what going on by looking airflow task log

@syedahsn
Copy link
Contributor Author

Thanks for the review! Yes the botocore waiters don't log any messages as they are polling the AWS service. Because this is implemented in the botocore code, in order to do this, we'd have to make the changes there. This is something that we've considered, and may end up implementing - unless the botocore team has a reason for not doing this by default. I imagine others have raised a similar concern,

@phanikumv
Copy link
Contributor

Thanks for the review! Yes the botocore waiters don't log any messages as they are polling the AWS service. Because this is implemented in the botocore code, in order to do this, we'd have to make the changes there. This is something that we've considered, and may end up implementing - unless the botocore team has a reason for not doing this by default. I imagine others have raised a similar concern,

This will be a problem for our current users who are seeing log messages on the Triggerer vs no logs post this PR.

@pankajastro
Copy link
Member

Thanks for the review! Yes the botocore waiters don't log any messages as they are polling the AWS service. Because this is implemented in the botocore code, in order to do this, we'd have to make the changes there. This is something that we've considered, and may end up implementing - unless the botocore team has a reason for not doing this by default. I imagine others have raised a similar concern,

This will be a problem for our current users who are seeing log messages on the Triggerer vs no logs post this PR.

I agree with @phanikumv logs are important here.

@potiuk
Copy link
Member

potiuk commented Apr 27, 2023

FYI. Airflow 2.6 implements interleaving Tirgger and Task logs together. So for those users it will be no problem, they will se trigger logs coming.

@pankajastro
Copy link
Member

FYI. Airflow 2.6 implements interleaving Tirgger and Task logs together. So for those users it will be no problem, they will se trigger logs coming.

that's correct. But sometimes when the operator waits for a long time in that case if we do some periodic logging like current state sleeping time etc that helps the user especially when they have limited access and feel like things are ok but if this is missing (In this case since no log in trigger run method) they might not be sure what going on in the background. It would have been nice if we could have achieved the same along with the waiter.

@potiuk
Copy link
Member

potiuk commented Apr 28, 2023

FYI. Airflow 2.6 implements interleaving Tirgger and Task logs together. So for those users it will be no problem, they will se trigger logs coming.

that's correct. But sometimes when the operator waits for a long time in that case if we do some periodic logging like current state sleeping time etc that helps the user especially when they have limited access and feel like things are ok but if this is missing (In this case since no log in trigger run method) they might not be sure what going on in the background. It would have been nice if we could have achieved the same along with the waiter.

Quite agree. My point there was that even if there is no support for providing such waiter feedback/callback in the botocore, the trigger could simply slice the time and wait in (say 30 seconds) intervals, handle the timout every time it finishes and emit log entry and come back to waiting. With the new interleaving mechanism, that would end up with a log entry every 30 s.

That would not require changes in the botocore @syedahsn @pankajastro @phanikumv :) (as far as I understand where the discussion stalled).

Add unit tests for deferrable waiters
@syedahsn
Copy link
Contributor Author

syedahsn commented May 3, 2023

Thanks @potiuk for the idea about handling the error from the waiter to log a status message! I've been playing around that for a bit, and I have a solution that produces the following logs for this operator:

[2023-05-03, 22:53:07 UTC] {base.py:73} INFO - Using connection ID 'aws_default' for task execution.
[2023-05-03, 22:53:07 UTC] {credentials.py:617} INFO - Found credentials in shared credentials file: ~/.aws/credentials
[2023-05-03, 22:53:15 UTC] {redshift_cluster.py:200} INFO - Status of cluster is pausing
[2023-05-03, 22:53:53 UTC] {redshift_cluster.py:200} INFO - Status of cluster is pausing
[2023-05-03, 22:54:32 UTC] {redshift_cluster.py:200} INFO - Status of cluster is pausing
[2023-05-03, 22:55:10 UTC] {redshift_cluster.py:200} INFO - Status of cluster is pausing
[2023-05-03, 22:55:48 UTC] {triggerer_job_runner.py:615} INFO - Trigger example_redshift/manual__2023-05-03T22:44:17.762920+00:00/pause_cluster/-1/1 (ID 2) fired: TriggerEvent<{'status': 'success', 'message': 'Cluster paused'}>

This will need to be implemented on all boto waiters if we want them to log messages during the polling period, until boto waiters support logging natively.

I've opened an issue with botocore requesting a logging feature. We'll have to see if/when they choose to add it.

Copy link
Member

@pankajastro pankajastro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q: I have noticed that with waiter's performance of an operator has degraded, though I need to test it a bit more. @syedahsn / @pankajkoti since you have some PRs with the custom waiter did you notice something similar?

Add test for waiter failure
@syedahsn
Copy link
Contributor Author

syedahsn commented May 4, 2023

q: I have noticed that with waiter's performance of an operator has degraded, though I need to test it a bit more.

I haven't noticed any differences. If anything, previously, when I was testing without the custom waiters, I would notice that the Triggerer's async thread would get blocked multiple times. Using custom waiters, that happens a lot less now.

@syedahsn syedahsn requested a review from dstandish May 8, 2023 19:56
@syedahsn syedahsn requested a review from pankajastro May 9, 2023 18:38
syedahsn added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request May 10, 2023
syedahsn added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request May 10, 2023
Copy link
Member

@pankajastro pankajastro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks @syedahsn 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants