Description
Apache Airflow Provider(s)
openlineage
Versions of Apache Airflow Providers
apache-airflow-providers-openlineage 1.12.2
Apache Airflow version
2.9.3
Operating System
Linux
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
The AirflowRunFacet.json schema currently cannot be used to validate OpenLineage logs that contain the AirflowRunFacet:
https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/openlineage/facets/AirflowRunFacet.json#L175
This is due to the type of the tags element (on lines 175-177):
"tags": {
    "type": "string"
}
The logs produced by the openlineage provider contain tags as a list of strings. A DAG with a single tag still produces a list containing one string; a DAG with multiple tags produces a list containing all of them.
What you think should happen instead
The tags field in AirflowRunFacet.json should be changed as follows:
"tags": {
    "type": "array",
    "items": {
        "type": "string"
    }
}
This validates that tags is an array of strings.
How to reproduce
Validate any openlineage log that contains the Airflow run facet against the current AirflowRunFacet.json. The validation fails.
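The mismatch can be illustrated with a minimal, standard-library-only sketch. The `conforms` helper below is hypothetical and hand-rolled: it checks only the `type` and `items` keywords that appear in the two `tags` fragments, and is not a full JSON Schema validator. The tag values are made up for illustration. A real reproduction would validate a complete event against the whole AirflowRunFacet.json with a proper JSON Schema library.

```python
def conforms(value, schema):
    """Check a value against a tiny JSON Schema subset ("type" and "items" only)."""
    expected = schema.get("type")
    if expected == "string":
        return isinstance(value, str)
    if expected == "array":
        if not isinstance(value, list):
            return False
        # Every element must satisfy the "items" sub-schema (empty schema accepts anything).
        item_schema = schema.get("items", {})
        return all(conforms(item, item_schema) for item in value)
    # A schema without "type" accepts anything; other types are unsupported here and rejected.
    return expected is None


# The "tags" definition as it currently stands in AirflowRunFacet.json.
current_tags_schema = {"type": "string"}

# The "tags" definition proposed in this issue.
proposed_tags_schema = {"type": "array", "items": {"type": "string"}}

# Hypothetical tag values shaped like the provider output: always a list of strings.
single_tag = ["example"]
multiple_tags = ["example", "etl"]

for tags in (single_tag, multiple_tags):
    print(
        tags,
        "current schema:", conforms(tags, current_tags_schema),
        "proposed schema:", conforms(tags, proposed_tags_schema),
    )
```

Under these assumptions both tag lists are rejected by the current definition and accepted by the proposed one, which matches the failure described above.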
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct