Resolve naming conflict in case of multiple Airflow operators defined in single file #2399
Several recent PRs that refactored the component catalog and parser functionality introduced a bug: when multiple operators are present in the same file, a `Component` object is not persisted for each of them.

What changes were proposed in this pull request?
When parsing Airflow operators, a `Component` object is created for each operator defined in the file, and each is assigned a component id. This component id is passed through from the relevant component catalog connector (recall that this class (1) reads in the full Python file and (2) builds a unique id based on the catalog connector type and the relevant unique-ifying keys as defined by the connector). As a result, if multiple operators exist in a file, they are all assigned the same component id on init.

`Component` objects are added to the component cache keyed by their id, so the most recently parsed operator in a given file overwrites all operators previously parsed from that file.

This PR appends the operator class name to the id when multiple operators are defined in the same file. The id for a `Component` that is one of several in a single file would now have the format `<catalog-connector-type>:<hash-of-relevant-keys>:<class_name_of_operator>`. Note that this would not require migration, as all our existing supported example operators have a 1:1 class:file ratio.

A related change is also included that prevents helper classes from being parsed as operators. A class is only parsed if the name of the class from which it derives ends in `'Operator'`. See lines 96-102 in this PR.

How was this pull request tested?
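The overwrite and the fix described above can be illustrated with a small standalone sketch. It is not the code from this PR or its test suite: the cache is modeled as a plain dict, and the function names (`operator_class_names`, `base_id`, `component_ids`), the `url-catalog` connector type, the example URL, and the choice of a truncated SHA-256 hash are all assumptions made for illustration.

```python
import ast
import hashlib

# Hypothetical file contents: one helper class and two operators.
SOURCE = """
class Helper:
    pass

class FirstOperator(BaseOperator):
    pass

class SecondOperator(BaseOperator):
    pass
"""


def operator_class_names(source: str) -> list:
    """Names of top-level classes with a base whose name ends in 'Operator'."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            # ast.unparse (Python 3.9+) turns a base expression back into text,
            # e.g. "BaseOperator" or "airflow.models.BaseOperator".
            if any(ast.unparse(base).endswith("Operator") for base in node.bases):
                names.append(node.name)
    return names


def base_id(connector_type: str, keys: dict) -> str:
    """Per-file id: connector type plus a short hash of the identifying keys."""
    digest = hashlib.sha256(repr(sorted(keys.items())).encode()).hexdigest()[:12]
    return f"{connector_type}:{digest}"


def component_ids(connector_type: str, keys: dict, source: str) -> dict:
    """Map each operator class name to its component id.

    The class name is appended only when the file defines several operators,
    so ids for single-operator files are unchanged.
    """
    prefix = base_id(connector_type, keys)
    names = operator_class_names(source)
    if len(names) > 1:
        return {name: f"{prefix}:{name}" for name in names}
    return {name: prefix for name in names}


keys = {"url": "https://example.com/operators.py"}  # assumed example key
prefix = base_id("url-catalog", keys)

# Before: every operator in the file shares the per-file id,
# so each cache write replaces the previous one.
cache_before = {}
for name in operator_class_names(SOURCE):
    cache_before[prefix] = name

# After: ids include the class name, so both operators stay in the cache.
cache_after = {cid: name for name, cid in component_ids("url-catalog", keys, SOURCE).items()}
```

In this sketch `cache_before` ends up with a single entry while `cache_after` holds one entry per operator, and `Helper` never reaches either cache because it has no base class whose name ends in `Operator`.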
Developer's Certificate of Origin 1.1