Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 14040: Part 1 Remove get_by_name calls from topology #14098

Merged
merged 14 commits into from
Nov 27, 2023

Conversation

OnkarVO7
Copy link
Contributor

Describe your changes:

Remove get_by_name calls from topology

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@OnkarVO7 OnkarVO7 requested a review from a team as a code owner November 24, 2023 07:03
@OnkarVO7 OnkarVO7 marked this pull request as draft November 24, 2023 07:03
@github-actions github-actions bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Nov 24, 2023
Copy link
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

for dependency in stage.consumer or [] # root nodes do not have consumers
]
return fqn._build( # pylint: disable=protected-access
*context_names, entity_request.name.__root__
)

def update_context(self, stage: NodeStage, entity: Entity):
def update_context(self, stage: NodeStage, entity_name: str):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll store the entity_name in the topology context instead of the entity_fqn and build the fqn on the fly wherever required. This is mainly because we need the entity_name in few places like here and here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good! We can add a comment about this in the code itself

@OnkarVO7 OnkarVO7 marked this pull request as ready for review November 27, 2023 05:02

# We have ack the sink waiting for a response, but got nothing back
if stage.must_return and entity is None:
if stage.must_return and entity_name is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to have the must_return anymore? We are not running any GET so we might as well skip this validation I guess.

Another approach which I might like a bit better is to keep a must_return only for important pieces, such as the Services. Those we want to be able to GET all the time otherwise there's no point continuing the ingestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a get_by_name call only if the must_return attribute is kept true

@@ -283,7 +264,7 @@ def _(
)

# We want to keep the full payload in the context
self.update_context(stage=stage, entity=right)
self.update_context(stage=stage, entity_name=right)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this method we are missing removing the GET I believe

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and same as with the lineage, not sure why we are passing the full right to the entity_name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the call

@@ -245,7 +226,7 @@ def _(
lineage has been properly drawn. We'll skip the process for now.
"""
yield entity_request
self.update_context(stage=stage, entity=right)
self.update_context(stage=stage, entity_name=right)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can update this call. Not sure why we are passing the full right to the entity_name

service_name=self.context.dashboard_service,
data_model_name=datamodel,
)
datamodel_entity = self.metadata.get_by_name(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we running a GET here?

@OnkarVO7 OnkarVO7 requested a review from pmbrull November 27, 2023 10:32
Copy link
Collaborator

@pmbrull pmbrull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are still adding some GETs required for the lineage. We will address this in a followup PR caching OM's state

Copy link

[open-metadata-ingestion] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

42.3% 42.3% Coverage
6.0% 6.0% Duplication

@OnkarVO7 OnkarVO7 merged commit ebb2317 into main Nov 27, 2023
12 checks passed
@OnkarVO7 OnkarVO7 deleted the rmv_get_by_name_topology branch November 27, 2023 10:45
OnkarVO7 added a commit that referenced this pull request Dec 5, 2023
* Changed for database

* Added changes for dashboard_service

* Changed for messaging service

* Changed for mlmodel service

* Changed for pipeline service

* Changed for search service

* Changed for objectstore service

* fixed wrong import

* fixed lint

* fixes

* fixed pytests

* fixed domo db pytest

* Fixed review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ingestion safe to test Add this label to run secure Github workflows on PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants