-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
api: fill data in column lineage input nodes. #2742
api: fill data in column lineage input nodes. #2742
Conversation
✅ Deploy Preview for peppy-sprite-186812 canceled.
|
dfef1ae
to
dbf7793
Compare
Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
dbf7793
to
65c7c8d
Compare
this.transformationDescription = data.transformationDescription; | ||
this.transformationType = data.transformationType; | ||
this.inputFields = ImmutableList.of(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused by some of these fields like fieldType
and inputFields
. Would these ever be set here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The truth is that if I set InputFieldNodeData
to implement NodeData
interface jackson couldn't deduce what was type of NodeData
was in the output node (the one that previously had data
field null). For that reason I basically made a workaround to simply create ColumnLineageNodeData
from InputFieldNodeData
.
I'd need more guidance how to fix jackson deduction :|
@wslulciuc @phixMe - while talking to @JDarDagran, he mentioned that there were concerns this PR may not solve the original issue and that you'd be logging an issue with further details. Could you confirm if that is the case? |
@tatiana: yeah, although this is a "fix", I think the issue is more fundemtail to our colum lineage query. That is, @phix and I looked into this a few days ago and walked through every OL call (@JDarDagran provided us) and it was enlightening. I’d like to share what we found, just haven’t had a window of time to write up our findings but the tl;dr is: the output_dataset_field_uuid in our column lineage query doesn’t return values for the ColumnLineageNodes on the edges of the graph (ie. graph nodes with no input columns, see below). I’d like to further confirm this, but given that our seed data works just fine, there’s something we are not accounting for. |
Hi @wslulciuc, I spent some time investigating nulls and I think @JDarDagran's PR solves the problem we were able to reproduce. The PR modifies a test and then applies changes need to fix it. That's why I think it is worth merging from that perspective. What do you say? Once you provide more details on other issue, we can resume discussion while not blocking this. |
@pawel-big-lebowski yeah, I'll merge 😉. I'm fine with that as we don't want to further delay the merge of #2725. Mind opening an issue to dive deeper into the column-level lineage query and move our discussion there? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix, @JDarDagran! We'll want to open an issue to dive deeper into the column-level lineage query and move our discussion there. But, giving this a +1 to not block #2742
Problem
As stated in #2725 data is null for output nodes in column lineage endpoint.
Solution
Fill data field.
Checklist
CHANGELOG.md
(Depending on the change, this may not be necessary)..sql
database schema migration according to Flyway's naming convention (if relevant)