Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Job parent hierarchy #1935

Merged
merged 6 commits into from
May 20, 2022
Merged

Job parent hierarchy #1935

merged 6 commits into from
May 20, 2022

Conversation

collado-mike
Copy link
Collaborator

@collado-mike collado-mike commented Apr 5, 2022

Problem

Database migration as described in #1928 . This implements the database migration necessary to create the new columns (job_uuid and parent_run_id in runs and parent_job_uuid in jobs.

Solution

This PR addresses only the migration. After merging this, a deployment will have the new columns and values will be present, but aren't being actively written by the code and aren't being read. The migration is intended to be fully backward compatible, so if a deployment needs to be rolled back, the original data/behavior is preserved.

I added a jobs_view that calculates the fully qualified name rather than storing the FQN directly. With this implementation, a parent job can be renamed and child jobs can automatically be renamed (we'll need a separate issue to automatically redirect, as that's not currently handled in this impl). We also use the simple name of the job to display in the UI (currently relying on parsing the dot characters in the name). If we store the FQN directly, we'll need to parse the FQN to derive the simple name for display. On the other hand, constructing the FQN after storing the simple name is easy and cheap. While the view adds some complexity, I think it gives us direct access to the information we want without forcing us to use heuristics to figure it out.

Note: All database schema changes require discussion. Please link the issue for context.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've updated the CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)

@collado-mike collado-mike requested a review from wslulciuc April 5, 2022 17:39
@wslulciuc wslulciuc added this to the 0.22.0 milestone Apr 5, 2022
Copy link
Member

@wslulciuc wslulciuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for entertaining my thoughts and feedback @collado-mike. Left some minor comment, and suggestion to rename a few columns, but otherwise great work! 💯 🥇

@collado-mike collado-mike modified the milestones: 0.22.0, 0.23.0 Apr 12, 2022
@collado-mike collado-mike force-pushed the job_parent_hierarchy branch 2 times, most recently from 382e3a5 to 620a18c Compare May 6, 2022 22:39
@codecov
Copy link

codecov bot commented May 6, 2022

Codecov Report

Merging #1935 (a8ccb62) into main (ac7a455) will decrease coverage by 0.03%.
The diff coverage is 78.94%.

@@             Coverage Diff              @@
##               main    #1935      +/-   ##
============================================
- Coverage     78.23%   78.20%   -0.04%     
  Complexity      955      955              
============================================
  Files           194      194              
  Lines          5293     5303      +10     
  Branches        420      420              
============================================
+ Hits           4141     4147       +6     
- Misses          706      713       +7     
+ Partials        446      443       -3     
Impacted Files Coverage Δ
api/src/main/java/marquez/db/JobDao.java 100.00% <ø> (ø)
api/src/main/java/marquez/db/JobVersionDao.java 89.39% <ø> (ø)
api/src/main/java/marquez/api/JobResource.java 92.30% <57.14%> (-5.52%) ⬇️
api/src/main/java/marquez/service/RunService.java 87.67% <66.66%> (ø)
api/src/main/java/marquez/MarquezContext.java 84.93% <100.00%> (ø)
api/src/main/java/marquez/db/OpenLineageDao.java 96.29% <100.00%> (+0.02%) ⬆️
api/src/main/java/marquez/db/RunDao.java 92.50% <100.00%> (-0.10%) ⬇️
.../src/main/java/marquez/service/LineageService.java 84.46% <100.00%> (+0.46%) ⬆️
...va/marquez/graphql/mapper/LineageResultMapper.java 63.63% <0.00%> (-3.04%) ⬇️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Signed-off-by: Michael Collado <collado.mike@gmail.com>
Signed-off-by: Michael Collado <collado.mike@gmail.com>
…simple name to fqn

added unit test to verify

Signed-off-by: Michael Collado <collado.mike@gmail.com>
@collado-mike collado-mike force-pushed the job_parent_hierarchy branch from bdfa4a5 to a45885b Compare May 6, 2022 23:53
@collado-mike collado-mike requested a review from wslulciuc May 6, 2022 23:56
Signed-off-by: Michael Collado <collado.mike@gmail.com>
Copy link
Member

@wslulciuc wslulciuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments! Left one minor suggestion on naming, otherwise 👍

@@ -0,0 +1,44 @@
ALTER TABLE jobs ADD COLUMN IF NOT EXISTS parent_job_uuid uuid CONSTRAINT jobs_parent_fk_jobs REFERENCES jobs (uuid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Minor: I would also update the migration script to reflect the name of the column added:

V43__alter_jobs_add_job_parent_uuid.sql

collado-mike and others added 2 commits May 20, 2022 14:28
Signed-off-by: Michael Collado <collado.mike@gmail.com>
@collado-mike collado-mike enabled auto-merge (squash) May 20, 2022 21:28
@collado-mike collado-mike merged commit 0021a2b into main May 20, 2022
@collado-mike collado-mike deleted the job_parent_hierarchy branch May 20, 2022 21:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants