Ingestion Pipeline to store run status #11857

Closed
2 of 3 tasks
pmbrull opened this issue Jun 2, 2023 · 2 comments · Fixed by #14841
Labels: backend, Ingestion, P0 (Highest priority), UI (UI specific issues)

Comments

Collaborator

pmbrull commented Jun 2, 2023

Currently, ingestion pipelines have a pipelineStatus field which stores general run timings and state.

On top of this, we should publish the assets processed at each workflow step (source, processor, sink, etc.).

  • Ingestion workflows have a centralized approach to storing Status
  • Backend stores the necessary status and sends it with the API
  • UI shows the status of past ingestions
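
One way to picture the centralized approach from the first task: every workflow step (source, processor, sink) writes into the same kind of status object, which is then serialized for the backend. The sketch below is only an illustration. `Failure`, `StepStatus`, `scanned`, and `failed` are hypothetical names, not the OpenMetadata classes, and the field names are borrowed from the response discussed later in this thread.

```python
import traceback
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Failure:
    """One asset that raised an error (hypothetical model)."""
    name: str
    error: str
    stackTrace: str


@dataclass
class StepStatus:
    """Counters kept by a single workflow step (hypothetical model)."""
    name: str
    records: int = 0
    warnings: int = 0
    errors: int = 0
    filtered: int = 0
    failures: List[Failure] = field(default_factory=list)

    def scanned(self, asset: str) -> None:
        # Count an asset that went through the step without raising.
        self.records += 1

    def failed(self, asset: str, exc: Exception) -> None:
        # Count the error and keep the message and stack trace for later display.
        self.errors += 1
        stack = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        self.failures.append(Failure(name=asset, error=str(exc), stackTrace=stack))


# Usage: a fake source step that fails on one of two made-up tables.
source = StepStatus(name="Source")
for table in ["table_a", "table_b"]:
    try:
        if table == "table_b":
            raise ValueError("OH NO!!")
        source.scanned(table)
    except ValueError as exc:
        source.failed(table, exc)

# Plain dicts and lists, ready to be sent as JSON.
payload = [asdict(source)]
```

Keeping one status type for all steps means the backend and the UI only need to understand a single shape, whichever step produced it.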
Collaborator Author

pmbrull commented Dec 12, 2023

We should also include a link to the logs.

pmbrull added the UI (UI specific issues) label Dec 19, 2023
Collaborator Author

pmbrull commented Dec 20, 2023

The http://localhost:8585/api/v1/services/ingestionPipelines/<ingestion>/pipelineStatus?startTs=t1&endTs=t2 call now returns the following response.

We have added the status field, which is a list of objects with:

  • name: name of the workflow step
  • records: number of assets processed
  • warnings: number of warnings
  • errors: number of assets that raised errors
  • filtered: number of assets filtered out
  • failures: a list of errors, each with:
    • name: the asset that raised the error
    • error: the message of the exception raised
    • stackTrace: the stack trace of the exception
{
    "data": [
        {
            "runId": "6a3c7bc4-be99-406e-a997-bb6240237dc2",
            "pipelineState": "failed",
            "startDate": 1703082125205,
            "timestamp": 1703082125205,
            "endDate": 1703082130278,
            "status": [
                {
                    "name": "Source",
                    "records": 8,
                    "warnings": 0,
                    "errors": 5,
                    "filtered": 0,
                    "failures": [
                        {
                            "name": "metadata_service_entity",
                            "error": "Unexpected exception to yield table [metadata_service_entity]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        },
                        {
                            "name": "openmetadata_settings",
                            "error": "Unexpected exception to yield table [openmetadata_settings]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        },
                        {
                            "name": "pipeline_service_entity",
                            "error": "Unexpected exception to yield table [pipeline_service_entity]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        },
                        {
                            "name": "query_entity",
                            "error": "Unexpected exception to yield table [query_entity]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        },
                        {
                            "name": "type_entity",
                            "error": "Unexpected exception to yield table [type_entity]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        }
                    ]
                },
                {
                    "name": "Sink",
                    "records": 8,
                    "warnings": 0,
                    "errors": 0,
                    "filtered": 0,
                    "failures": []
                }
            ]
        },
        {
            "runId": "464c9cd9-697d-4da2-b504-5a19eafa22eb",
            "pipelineState": "partialSuccess",
            "startDate": 1703082003156,
            "timestamp": 1703082003156,
            "endDate": 1703082011823,
            "status": [
                {
                    "name": "Source",
                    "records": 69,
                    "warnings": 0,
                    "errors": 2,
                    "filtered": 0,
                    "failures": [
                        {
                            "name": "data_insight_chart",
                            "error": "Unexpected exception to yield table [data_insight_chart]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        },
                        {
                            "name": "report_data_time_series",
                            "error": "Unexpected exception to yield table [report_data_time_series]: OH NO!!",
                            "stackTrace": "Traceback (most recent call last):\n  File \"/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py\", line 402, in yield_table\n    raise ValueError(\"OH NO!!\")\nValueError: OH NO!!\n"
                        }
                    ]
                },
                {
                    "name": "Sink",
                    "records": 69,
                    "warnings": 0,
                    "errors": 0,
                    "filtered": 0,
                    "failures": []
                }
            ]
        }
    ],
    "paging": {
        "before": "MTcwMjk5NTc5MTg1NQ==",
        "after": "MTcwMzA4MjE5MTg1NQ==",
        "total": 2
    }
}
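
To make a response like the one above easier to scan, here is a minimal Python sketch that builds the request URL and condenses a payload of this shape into a few lines per run. `pipeline_status_url` and `summarize_runs` are hypothetical helper names, the sketch relies only on the fields visible in the sample response, and it leaves out the HTTP call and any authentication.

```python
from typing import Any, Dict, List
from urllib.parse import quote, urlencode


def pipeline_status_url(base: str, ingestion: str, start_ts: int, end_ts: int) -> str:
    """Build the pipelineStatus URL for a time window given in epoch milliseconds."""
    query = urlencode({"startTs": start_ts, "endTs": end_ts})
    name = quote(ingestion, safe="")
    return f"{base}/api/v1/services/ingestionPipelines/{name}/pipelineStatus?{query}"


def summarize_runs(payload: Dict[str, Any]) -> List[str]:
    """Return one line per run, one per step, and one per failed asset."""
    lines: List[str] = []
    for run in payload.get("data", []):
        header = f"{run['runId']} {run['pipelineState']}"
        # endDate is read defensively in case a run has not finished yet (assumption).
        if run.get("endDate") is not None:
            header += f" ({run['endDate'] - run['startDate']} ms)"
        lines.append(header)
        for step in run.get("status", []):
            lines.append(
                f"  {step['name']}: records={step['records']} "
                f"warnings={step['warnings']} errors={step['errors']} "
                f"filtered={step['filtered']}"
            )
            for failure in step.get("failures", []):
                lines.append(f"    - {failure['name']}: {failure['error']}")
    return lines


# Usage with a hand-trimmed copy of the first run (only one failure kept).
sample = {
    "data": [
        {
            "runId": "6a3c7bc4-be99-406e-a997-bb6240237dc2",
            "pipelineState": "failed",
            "startDate": 1703082125205,
            "endDate": 1703082130278,
            "status": [
                {
                    "name": "Source",
                    "records": 8,
                    "warnings": 0,
                    "errors": 5,
                    "filtered": 0,
                    "failures": [
                        {
                            "name": "query_entity",
                            "error": "Unexpected exception to yield table [query_entity]: OH NO!!",
                            "stackTrace": "(trimmed)",
                        }
                    ],
                }
            ],
        }
    ]
}

url = pipeline_status_url("http://localhost:8585", "my_ingestion", 1703082000000, 1703082200000)
summary = summarize_runs(sample)
print(url)
print("\n".join(summary))
```

The stack trace is left out of the summary on purpose; a UI could show it only when the user expands a single failure.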

A very rough way to showcase this info could be:
[image]

Maybe the modal should also have arrows to navigate to the next/previous status in the list, in case users want to explore further.

pmbrull added a commit that referenced this issue Dec 22, 2023
* Register StackTraceError in spec
* Add todos
* Update status
* docs
* format
* Fix tests
* Ignore generated
* Fix tests
* Tests
* Try constants
* Print
* order
* Fix service name
* fix ui error

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Sachin-chaurasiya added a commit that referenced this issue Jan 25, 2024
#14841)

* feat(ui): supported: #11857 show detailed run status for ingestion pipelines

* fix user page issue and unit test for ingestion run status

* Refactor IngestionRecentRun.test.tsx to use spread operator for executionRuns

* fix unit test

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Sachin-chaurasiya moved this from Ingestion - Bugs & Minor Features to Done in Release 1.3.0 Jan 25, 2024