Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ui): Add support for structured reporting of warnings and failures in the UI ingestion flow (ingest uplift 2/2) #10790

Conversation

jjoyce0510
Copy link
Collaborator

@jjoyce0510 jjoyce0510 commented Jun 27, 2024

Summary

This PR is the first step in structured error reporting visible to the admin via the UI. In this PR, we add a way for visualizing failure and warnings from the ingestion run directly inside the UI, on the ingestion results modal.

We add base support for some error types. We still need to do the proper mapping, so for now most warnings are handled in the default manner shown below, with the following details:

  • title: "an unknown issue occurred"
  • message: "whatever the error type reported by ingestion source is"

Ideally, over time we'll migrate ingestion to start reporting better formed errors. I've also added support pre-emptively for a new format for failure and warning reporting:

structured_logs: [ { level: "WARN", type: "some-warning-type", message: "A human readable warning message", context: ["additional useful context"]

We also add a new "status" type called "Succeeded with Warnings" which is automatically displayed when there are more than 1 warnings reported by the ingestion source during the run!

Once this is merged, we will be incentivized to improve error reporting in ingestion to give the user actionable reports.

Status

Ready for review

Screenshots

Screenshot 2024-06-26 at 6 43 18 PM Screenshot 2024-06-26 at 6 52 16 PM Screenshot 2024-06-26 at 6 09 26 PM

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Summary by CodeRabbit

  • New Features

    • Introduced structured reports for ingestion processes, providing detailed insight with error, warning, and info levels through a new StructuredReport component.
    • Enhanced ingestion source status display with improved status detection and representation.
  • Improvements

    • Enhanced execution request details modal to display structured reports.
    • Added a "Show More" section to dynamically display additional items based on user interaction.
  • GraphQL

    • Updated listIngestionSources and getIngestionSource queries to include structured report information.

@github-actions github-actions bot added the product PR or Issue related to the DataHub UI/UX label Jun 27, 2024
John Joyce added 3 commits June 27, 2024 10:43
Copy link
Contributor

coderabbitai bot commented Jun 27, 2024

Warning

Rate limit exceeded

@jjoyce0510 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 28 minutes and 4 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 9a6f2a7 and 224c630.

Walkthrough

This update enhances how ingestion status is determined and displayed within DataHub by introducing structured reports for better analysis of execution requests. Key changes include importing a new getIngestionSourceStatus function, integrating new structured report components, and updating GraphQL queries to include structured report data.

Changes

Files/Paths Summary of Changes
IngestionSourceTable.tsx Integrates getIngestionSourceStatus function to determine status.
ExecutionRequestDetailsModal.tsx Updates status handling and includes a structured report component.
IngestionExecutionTable.tsx Uses getIngestionSourceStatus to determine execution statuses.
StructuredReport.tsx, StructuredReportItem.tsx, StructuredReportItemContext.tsx, StructuredReportItemList.tsx Introduces components for displaying structured ingestion reports and items.
types.ts Adds enums, interfaces, and constants for structured report items and levels.
utils.ts Adds utility functions for handling and displaying structured reports.
ShowMoreSection.tsx New component to handle displaying of additional items with a "show more" feature.
ingestion.graphql Extends listIngestionSources and getIngestionSource queries to include structuredReport data.

Poem

In the realm of DataHub, where codes do play,
Structured reports now guide the way. 🐇
Execution's tale, in detail unfurls,
With icons, warnings, as data twirls.
Show More buttons, reveal with grace,
In the world of data, we find our place.
Coding dreams, a rabbit's chase! 🌟


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 45a8cc9 and 9a6f2a7.

Files selected for processing (11)
  • datahub-web-react/src/app/ingest/source/IngestionSourceTable.tsx (2 hunks)
  • datahub-web-react/src/app/ingest/source/executions/ExecutionRequestDetailsModal.tsx (5 hunks)
  • datahub-web-react/src/app/ingest/source/executions/IngestionExecutionTable.tsx (2 hunks)
  • datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReport.tsx (1 hunks)
  • datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItem.tsx (1 hunks)
  • datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItemContext.tsx (1 hunks)
  • datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItemList.tsx (1 hunks)
  • datahub-web-react/src/app/ingest/source/types.ts (1 hunks)
  • datahub-web-react/src/app/ingest/source/utils.ts (7 hunks)
  • datahub-web-react/src/app/shared/ShowMoreSection.tsx (1 hunks)
  • datahub-web-react/src/graphql/ingestion.graphql (2 hunks)
Additional context used
Biome
datahub-web-react/src/app/shared/ShowMoreSection.tsx

[error] 28-28: Avoid using unnecessary Fragment.

A fragment is redundant if it contains only one child, or if it is the child of a html element, and is not a keyed fragment.
Unsafe fix: Remove the Fragment

(lint/complexity/noUselessFragments)

datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItemContext.tsx

[error] 39-39: Missing key property for this element in iterable.

The order of the items may change, and having a key can help React identify which item was moved.
Check the React documentation.

(lint/correctness/useJsxKeyInIterable)

Additional comments not posted (14)
datahub-web-react/src/app/shared/ShowMoreSection.tsx (1)

5-15: Styling for ShowMoreButton looks good.

The styling is consistent and uses theme constants, which is good for maintainability.

datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItemList.tsx (1)

20-41: Good use of React state and conditional rendering.

The implementation of visible items and the conditional rendering of the 'Show More' section is well-handled. The use of keys in the map function is correctly implemented to ensure efficient updates.

datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReport.tsx (1)

23-43: Efficient filtering and conditional rendering of report items.

The component efficiently filters and conditionally renders report items based on their level. The use of separate components for different levels enhances modularity.

datahub-web-react/src/app/ingest/source/executions/reporting/StructuredReportItem.tsx (1)

59-81: Well-structured component with good use of external context.

The component is well-structured and makes good use of the StructuredReportItemContext for displaying additional details. The use of a dynamic icon and styled components adds to the flexibility and aesthetic of the component.

datahub-web-react/src/graphql/ingestion.graphql (2)

42-45: Addition of structuredReport field in listIngestionSources query.

The addition of the structuredReport field allows fetching structured error and warning information related to ingestion sources. Ensure that the backend appropriately supports this field and that it is properly documented.

Verification successful

Backend support for structuredReport field verified.

The structuredReport field is indeed supported in the backend, as evidenced by its implementation and usage in multiple TypeScript files.

  • datahub-web-react/src/app/ingest/source/utils.ts
  • datahub-web-react/src/app/ingest/source/executions/ExecutionRequestDetailsModal.tsx
  • datahub-web-react/src/app/ingest/source/builder/RecipeForm/TestConnection/TestConnectionButton.tsx
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field.

# Test: Search for the `structuredReport` resolver implementation.
rg --type graphql "structuredReport"

Length of output: 68



Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field.

# Test: Search for the `structuredReport` resolver implementation in TypeScript files.
rg "structuredReport" --type ts

Length of output: 2250


93-96: Addition of structuredReport field in getIngestionSource query.

Similar to the previous query, this addition enhances the detail level of ingestion source reporting. This should be checked for backend support and integration with the front-end components that utilize this data.

Verification successful

Backend support for structuredReport field in getIngestionSource query is confirmed.

The structuredReport field is handled in the backend, specifically in the IngestionResolverUtils.java file, which confirms that this addition is supported.

  • datahub-graphql-core/src/main/java/com/linkedin/datahub/graphql/resolvers/ingest/IngestionResolverUtils.java
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field in the `getIngestionSource` query.

# Test: Search for the `structuredReport` resolver implementation.
rg --type graphql "structuredReport"

Length of output: 68



Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field in the `getIngestionSource` query.

# Find all .graphql files and search for the term 'structuredReport' within them.
fd --extension graphql --exec rg "structuredReport" {}

Length of output: 206



Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field in the `getIngestionSource` query.

# Find all .graphql files and search for the term 'structuredReport' within them, including file names and line numbers for context.
fd --extension graphql --exec rg -n -C 5 "structuredReport" {}

Length of output: 1265



Script:

#!/bin/bash
# Description: Verify the backend support for `structuredReport` field in the `getIngestionSource` query.

# Search for the implementation of the `structuredReport` resolver in the backend code.
rg "structuredReport" --type java
rg "structuredReport" --type typescript

Length of output: 1153

datahub-web-react/src/app/ingest/source/IngestionSourceTable.tsx (1)

6-6: Refactor to use getIngestionSourceStatus for status determination.

This change is consistent with the previous file, aiming to centralize status determination logic. Verify that getIngestionSourceStatus handles all edge cases and integrates well with the UI components that display these statuses.

Also applies to: 126-126

datahub-web-react/src/app/ingest/source/types.ts (1)

1-195: Comprehensive addition of types and enums for structured reporting.

The new enums and interfaces are well-structured and cater to the needs of structured reporting within the application. Ensure that these types are used consistently across the application and that any potential naming conflicts with existing types are resolved.

datahub-web-react/src/app/ingest/source/executions/ExecutionRequestDetailsModal.tsx (1)

16-17: Enhancements to include structured reporting in execution details.

The integration of structured reporting via the StructuredReport component and related utility functions enhances the detail level of execution request insights. Verify that these components and functions are well-integrated and perform as expected in various scenarios.

Also applies to: 21-22, 132-132, 143-148, 152-152, 191-191

Verification successful

Enhancements to include structured reporting in execution details are verified.

The StructuredReport component and related utility functions (getIngestionSourceStatus, getStructuredReport) are well-integrated within the codebase. They appear in multiple relevant files, including ExecutionRequestDetailsModal.tsx, and are used consistently.

  • StructuredReport is imported and used in ExecutionRequestDetailsModal.tsx and other files under the reporting directory.
  • getIngestionSourceStatus and getStructuredReport are utilized in ExecutionRequestDetailsModal.tsx and other related files, ensuring their functionality is integrated into the execution details.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify integration of structured reporting components.

# Test: Search for integration tests involving `StructuredReport` and related utility functions.
rg --type javascript "StructuredReport"

Length of output: 74



Script:

#!/bin/bash
# Description: Verify integration of structured reporting components and utility functions.

# Test: Search for occurrences of `StructuredReport`, `getIngestionSourceStatus`, and `getStructuredReport` in TypeScript files.
rg --type typescript "StructuredReport"
rg --type typescript "getIngestionSourceStatus"
rg --type typescript "getStructuredReport"

Length of output: 235



Script:

#!/bin/bash
# Description: Verify integration of structured reporting components and utility functions.

# Test: Search for occurrences of `StructuredReport`, `getIngestionSourceStatus`, and `getStructuredReport` in .tsx and .ts files.
rg --glob '*.tsx' "StructuredReport"
rg --glob '*.ts' "StructuredReport"
rg --glob '*.tsx' "getIngestionSourceStatus"
rg --glob '*.ts' "getIngestionSourceStatus"
rg --glob '*.tsx' "getStructuredReport"
rg --glob '*.ts' "getStructuredReport"

Length of output: 13853

datahub-web-react/src/app/ingest/source/utils.ts (5)

11-11: Approved: Import and constant additions are essential for new features.

The imports and constants added are crucial for supporting the new structured error reporting and the 'Succeeded with Warnings' status. These changes align well with the PR objectives.

Also applies to: 52-52, 71-71, 86-86, 103-104, 124-124


148-187: Approved: New utility functions for structured reporting are well-implemented.

The functions for mapping and creating structured reports are crucial for the new error reporting feature. They are implemented efficiently and handle various scenarios, including legacy data formats.

Suggestion for Improvement: Graceful handling of unknown types.
[REFACTOR_SUGGESTion]
Consider providing a default message or logging a warning when encountering unknown types, which could aid in debugging and maintenance.

- return StructuredReportItemType.UNKNOWN ? rawType : STRUCTURED_REPORT_ITEM_TYPE_TO_DETAILS.get(stdType)?.message;
+ return StructuredReportItemType.UNKNOWN ? `Unknown type: ${rawType}` : STRUCTURED_REPORT_ITEM_TYPE_TO_DETAILS.get(stdType)?.message;

Also applies to: 189-249


269-287: Approved: Enhanced status handling in getIngestionSourceStatus.

The modifications to getIngestionSourceStatus are crucial for supporting the new 'Succeeded with Warnings' status. This change allows the UI to more accurately reflect the state of an ingestion process.

Suggestion for Improvement: Enhance documentation for clarity.

Adding more detailed comments explaining why the status might be overridden could help future maintainers understand the decision logic more clearly.

+    // Check if the ingestion completed successfully but with warnings, and adjust the status accordingly.
     if (status === SUCCESS && (structuredReport?.warnCount || 0) > 0) {
         return SUCCEEDED_WITH_WARNINGS;
     }

189-249: Approved: Effective handling of structured and legacy report formats in transformToStructuredReport.

This function is well-designed to accommodate both new structured report data and legacy formats, ensuring that the application can handle data from various versions seamlessly. This is crucial for maintaining functionality across different deployment states.


Line range hint 23-50: Review of remaining elements: No action needed.

The remaining parts of the file, including some utility functions and constants, are unchanged in this PR and continue to serve their existing purposes without modifications.

Also applies to: 53-70, 87-102, 125-147, 252-268

datahub-web-react/src/app/shared/ShowMoreSection.tsx Outdated Show resolved Hide resolved
@@ -4,7 +4,7 @@ import styled from 'styled-components';
import { StyledTable } from '../../../entity/shared/components/styled/StyledTable';
import { ExecutionRequest } from '../../../../types.generated';
import { ButtonsColumn, SourceColumn, StatusColumn, TimeColumn } from './IngestionExecutionTableColumns';
import { SUCCESS } from '../utils';
import { SUCCESS, getIngestionSourceStatus } from '../utils';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

Add test cases for getIngestionSourceStatus.

The function getIngestionSourceStatus is used in multiple places in the codebase, but there are no test cases found for it. To ensure reliability and maintainability, please add comprehensive test cases for this function.

  • Files using getIngestionSourceStatus:
    • datahub-web-react/src/app/ingest/source/IngestionSourceTable.tsx
    • datahub-web-react/src/app/ingest/source/executions/IngestionExecutionTable.tsx
    • datahub-web-react/src/app/ingest/source/executions/ExecutionRequestDetailsModal.tsx
    • datahub-web-react/src/app/ingest/source/utils.ts
Analysis chain

Refactor to use getIngestionSourceStatus for status determination.

The change centralizes the logic for determining the status of ingestion sources, which is a good practice for maintainability. Ensure that the getIngestionSourceStatus function is thoroughly tested to handle all possible statuses correctly.

Also applies to: 115-115

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the correct usage and test coverage of `getIngestionSourceStatus`.

# Test: Search for the test cases for `getIngestionSourceStatus`.
rg --type javascript "getIngestionSourceStatus"

Length of output: 82



Script:

#!/bin/bash
# Description: Verify the correct usage and test coverage of `getIngestionSourceStatus`.

# Search for the usage of `getIngestionSourceStatus` in TypeScript and TSX files.
rg --type-add 'tsx:*.{ts,tsx}' "getIngestionSourceStatus"

# Search for test cases for `getIngestionSourceStatus` in TypeScript and TSX test files.
rg --type-add 'tsx:*.{ts,tsx}' "getIngestionSourceStatus" --glob '*.{test,spec}.{ts,tsx}'

Length of output: 1107

jjoyce0510 and others added 2 commits June 27, 2024 13:41
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@jjoyce0510 jjoyce0510 changed the title feat(ui): Add support for structured reporting of warnings and failures in the UI ingestion flow feat(ui): Add support for structured reporting of warnings and failures in the UI ingestion flow (ingest uplift 2/2) Jun 27, 2024
@jjoyce0510 jjoyce0510 merged commit c7f83a3 into datahub-project:master Jul 1, 2024
35 of 36 checks passed
yoonhyejin pushed a commit that referenced this pull request Jul 16, 2024
…es in the UI ingestion flow (ingest uplift 2/2) (#10790)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
aviv-julienjehannet pushed a commit to aviv-julienjehannet/datahub that referenced this pull request Jul 17, 2024
…es in the UI ingestion flow (ingest uplift 2/2) (datahub-project#10790)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant