No Columns or ability to add field tags when using Job Event static lineage #2843

Open
davidsharp7 opened this issue Jun 27, 2024 · 0 comments

davidsharp7 commented Jun 27, 2024

Given the following static lineage POST request

curl -X POST http://localhost:8080/api/v1/lineage \
  -i -H 'Content-Type: application/json' \
  -d '{
        "eventTime": "2024-12-28T20:52:00.001+10:00",
        "job": {
          "namespace": "my-namespace",
          "name": "newtestfoobarmeeeepppppppppp"
        },
        "outputs": [{
          "namespace": "my-namespace",
          "name": "pppppspooky",
          "facets": {
            "schema": {
              "_producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client",
              "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/spec/OpenLineage.json#/definitions/SchemaDatasetFacet",
              "fields": [
                { "name": "a", "type": "VARCHAR"},
                { "name": "b", "type": "VARCHAR"}
              ]
            }
          }
        }],
        "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client",
        "schemaURL": "https://openlineage.io/spec/2-0-0/OpenLineage.json#/definitions/JobEvent"
      }'

it appears that the dataset's columns do not render in the UI and that field-level tags cannot be added.

Upon investigation, it looks like this is because the current dataset version is not being updated in the OpenLineageDao for the Job Event:

    if (event.getInputs() != null) {
      for (Dataset dataset : event.getInputs()) {
        DatasetRecord record = upsertLineageDataset(daos, dataset, now, null, true);
        datasetInputs.add(record);
        insertDatasetFacets(daos, dataset, record, null, null, now);
        insertInputDatasetFacets(daos, dataset, record, null, null, now);
      }
    }

By adding the following, the current version is updated in the datasets table

        daos.getDatasetDao()
            .updateVersion(
                record.getDatasetVersionRow().getDatasetUuid(),
                Instant.now(),
                record.getDatasetVersionRow().getUuid());

which resolves the columns not being displayed.
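
To illustrate why the missing update matters, here is a toy in-memory sketch. It assumes the read path resolves a dataset's fields through its current version pointer, which is my reading of the behaviour above and not something confirmed against the Marquez code. Every name in it (`CurrentVersionSketch`, `upsertVersion`, `visibleFields`) is hypothetical and not part of Marquez.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Toy model only: none of these names are Marquez classes or DAOs. */
class CurrentVersionSketch {
  // Stand-in for the datasets table: dataset UUID -> current version UUID.
  static final Map<UUID, UUID> currentVersionByDataset = new HashMap<>();
  // Stand-in for dataset version rows: version UUID -> field names.
  static final Map<UUID, List<String>> fieldsByVersion = new HashMap<>();

  /** Writes a version row; moves the current-version pointer only when asked. */
  static UUID upsertVersion(UUID datasetUuid, List<String> fields, boolean updateCurrent) {
    UUID versionUuid = UUID.randomUUID();
    fieldsByVersion.put(versionUuid, new ArrayList<>(fields));
    if (updateCurrent) {
      currentVersionByDataset.put(datasetUuid, versionUuid);
    }
    return versionUuid;
  }

  /** Fields a reader sees if it looks them up via the current version (assumption). */
  static List<String> visibleFields(UUID datasetUuid) {
    UUID current = currentVersionByDataset.get(datasetUuid);
    return current == null ? List.of() : fieldsByVersion.get(current);
  }

  public static void main(String[] args) {
    UUID dataset = UUID.randomUUID();

    // Version row written, pointer untouched: no fields are visible.
    upsertVersion(dataset, List.of("a", "b"), false);
    System.out.println(visibleFields(dataset)); // prints []

    // Version row written and pointer updated: fields become visible.
    upsertVersion(dataset, List.of("a", "b"), true);
    System.out.println(visibleFields(dataset)); // prints [a, b]
  }
}
```

The first call mirrors the Job Event path as it stands (version row written, pointer not moved); the second mirrors the path with the `updateVersion` call added.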

There is a subsequent step where we would need to propagate the tags, which are linked to the dataset version fields. It looks like we can use the DAO:

        List<Field> dsvTags =
            daos.getDatasetFieldDao().findByDatasetVersion(record.getDatasetVersionRow().getUuid());
        daos.getDatasetVersionDao()
            .updateFields(
                record.getDatasetVersionRow().getUuid(),
                daos.getDatasetVersionDao().toPgObjectFields(dsvTags));

@wslulciuc does that sound like a fair way of doing it?

wslulciuc added the bug label on Oct 23, 2024
wslulciuc added this to the 0.51.0 milestone on Oct 23, 2024