-
Notifications
You must be signed in to change notification settings - Fork 16.3k
EcsRunTaskOperator fails when no containers are provided in the response #51692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…and poll multiple times in case the task is not yet active
|
Reviewing the code. It would be good if you can provide an example of a DAG with the EcsRunTaskOperator so I can test it on the main branch, see it fail, and then try again on this branch. The Airflow Ecs docs may be out of date because when I follow the example the API fails with an Attribute error, so something about my call is invalid even though I'm using the default example. Seeing an example would help. |
I think it is tricky to create a sample dag in this case, as we depend on whether ECS returns a container name with the call of |
|
I see. Okay, trying to reproduce. The main thing I'm trying is to beef up the resources like cpu and memory for the task to take longer to start. Any other ideas of what else I can change to make the container be in Another question I have is: to what extent is your proposed change overlapping with the intended functionality of Wouldn't a good, already viable, fix in this situation be for the user to have |
|
The We could also think about making |
|
I see. So just to confirm: the core of the issue here is that, while |
The The current solution was to simply get the container name from the return of The issue is related to the change of the cloud watch log stream prefixes introduced a while ago. |
|
Ah okay. It's quite an edge case so I can't find a way to reproduce, I think your fix and tests look good. But I have one more thing I'd like to clarify: The original PR is concerned with situations where the API response looks like the following: where the response has an empty array in the 'containers' field. But in your test case, there is a populated containers field, it just doesn't have a name. Has the scope of the issue changed, or should you also be testing the case where the whole 'containers' field is empty? |
Good catch! Will adjust the test case, thanks |
1c8222b to
8467c32
Compare
seanghaeli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Difficult to reproduce in a DAG, but I'm happy with the proposed change and unit test coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I can't tell if this will fix thew issue or not, but it looks like he changes are scoped tightly enough behind the if blocks that it shouldn't break anything else.
|
thanks for having a look! I agree with both of you, it's tricky to test. That said, based on my experience, 40 seconds is usually sufficient to exit transient states in ECS, even when dealing with large images and similar overhead. But of course, it's not an exact science... |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
|
@ferruzzi I think we should merge this before it is deleted. |
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
…nse (apache#51692) * Added logic to safely access the container name only if its required and poll multiple times in case the task is not yet active * Adujusted the test case to better reflect actual ecs response
closes: #51429
The following adjustments make sure that the container name is only polled, if it is actually required (
awslogs_groupandawslogs_stream_prefixare set and we thus want to see the logs). Moreover I have used thedescribe_tasksmethod of boto3 to poll them after a 10s and 30s retry, to account for delays until the task has reachedRUNNING. Also the container_name is safely accessed to avoid throwing an error.In case the name is required and cannot be retrieved an info is logged.