feat(mcl-processor): Update mcl processor hooks #11134

david-leifker · 2024-08-09T15:02:09Z

Allow MCL Hooks to consume messages independently with different consumer groups. By default no change is made to the MCL consumer group names and all hooks share the same consumer group.

Reasons for separate consumer groups include parallel hook execution and prioritization.

Checklist

The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
Links to related issues (if applicable)
Tests for the changes have been added/updated (if applicable)
Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Summary by CodeRabbit

New Features
- Introduced new classes for processing metadata change logs, enhancing event hook functionality.
- Added consumerGroupSuffix configuration for improved consumer group management in various hooks.
- Enhanced documentation regarding configuration and functionality of Kafka consumer hooks.
Bug Fixes
- Enhanced error handling and logging capabilities in metadata processing.
Documentation
- Updated documentation to include new environment variables and configuration options for consumer groups.
Refactor
- Streamlined import statements and constructors for clarity and maintainability across various classes.

coderabbitai · 2024-08-09T15:02:29Z

Walkthrough

The recent changes enhance the Kafka consumer functionalities within the metadata jobs, emphasizing metadata change logs and notifications. New classes and methods improve event handling and listener registration, alongside increased configurability through the addition of consumer group suffixes. This refactoring aims to streamline interactions and clarify the codebase.

Changes

File(s)	Change Summary
`metadata-jobs/.../MaeConsumerApplication.java`	Removed Kafka factory imports; added imports for change notifications and events.
`metadata-jobs/.../MCLKafkaListener.java`	Introduced `MCLKafkaListener` class for processing metadata change logs with a new `consume` method.
`metadata-jobs/.../MCLKafkaListenerRegistrar.java`	Added `MCLKafkaListenerRegistrar` class for listener registration and management.
`metadata-jobs/.../hook/MetadataChangeLogHook.java`	Added `getConsumerGroupSuffix()` method to the interface.
`metadata-jobs/.../hook/UpdateIndicesHook.java`	Updated constructor to include `consumerGroupSuffix` and added a new field.
`metadata-jobs/.../hook/event/EntityChangeEventGeneratorHook.java`	Added `consumerGroupSuffix` parameter to the constructor for enhanced usability.
`metadata-jobs/.../hook/form/FormAssignmentHook.java`	Modified constructor to include `consumerGroupSuffix`; renamed private fields for clarity.
`metadata-jobs/.../hook/incident/IncidentsSummaryHook.java`	Refactored constructor to include `consumerGroupSuffix`.
`metadata-jobs/.../hook/ingestion/IngestionSchedulerHook.java`	Updated constructor to accept `consumerGroupSuffix`.
`metadata-jobs/.../hook/siblings/SiblingAssociationHook.java`	Enhanced constructor with `consumerGroupSuffix` parameter.
`metadata-jobs/.../test/spring/MCLGMSSpringTest.java`	Changed hook retrieval from `MetadataChangeLogProcessor` to `MCLKafkaListenerRegistrar`.
`metadata-jobs/.../test/spring/MCLMAESpringTest.java`	Modified hook management to utilize `MCLKafkaListenerRegistrar`.
`metadata-jobs/.../test/spring/MCLSpringCommonTestConfiguration.java`	Updated `@ComponentScan` to reflect new package structure.
`metadata-service/configuration/application.yaml`	Introduced `consumerGroupSuffix` properties for various hooks.
`metadata-service/factories/.../KafkaEventConsumerFactory.java`	Renamed `createInstance` method to `kafkaEventConsumer` for clarity.
`metadata-service/war/.../CommonApplicationConfig.java`	Expanded `@ComponentScan` to include new packages for change management and event handling.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MCLKafkaListenerRegistrar
    participant MCLKafkaListener
    participant MetadataChangeLogHook

    Client->>MCLKafkaListenerRegistrar: Register Kafka Listener
    MCLKafkaListenerRegistrar->>MCLKafkaListener: Create Listener
    MCLKafkaListener->>MetadataChangeLogHook: Invoke Hooks on Data Change
    MetadataChangeLogHook-->>MCLKafkaListener: Processed Data
    MCLKafkaListener-->>Client: Acknowledge Processing

🐇 In the meadow, hops a rabbit so spry,
With changes afoot, he leaps up high!
Metadata flows like a stream, oh so bright,
Kafka listeners dance in the soft moonlight.
New hooks and suffixes, a joyful spree,
In code we trust, as happy as can be!
🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

…ifferent consumer groups

coderabbitai

Actionable comments posted: 5

Outside diff range, codebase verification and nitpick comments (9)

metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/MetadataChangeLogHook.java (1)
21-27: Add Javadoc for getConsumerGroupSuffix().

The new method getConsumerGroupSuffix() lacks a detailed Javadoc description. Consider adding more context about the purpose and usage of the consumer group suffix.
  /**
   * Provides a suffix for the consumer group.
   *
+  * This suffix can be used to differentiate consumer groups for parallel processing.
   *
   * @return suffix
   */
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java (1)
113-119: Optimize Consumer Group Name Construction.

The buildConsumerGroupName method constructs consumer group names. Consider using String.format for better readability when constructing strings with multiple components.
return suffix.isEmpty() ? consumerGroupBase : String.format("%s-%s", consumerGroupBase, suffix);
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/form/FormAssignmentHook.java (1)
Line range hint 92-109: Consider Adding Logging for Hook Invocation.

In the invoke method, consider adding logging to track when hooks are invoked and with what events. This can aid in debugging and monitoring the hook's behavior.
log.info("Invoking FormAssignmentHook for event: {}", event);
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/ingestion/IngestionSchedulerHook.java (1)
Line range hint 62-83: Consider Adding Logging for Scheduler Invocation.

In the invoke method, consider adding logging to track when the scheduler is invoked and with what events. This can aid in debugging and monitoring the scheduler's behavior.
log.info("Invoking IngestionSchedulerHook for event: {}", event);
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/event/EntityChangeEventGeneratorHook.java (2)
87-87: Add documentation for consumerGroupSuffix.

Consider adding a brief comment explaining the purpose of the consumerGroupSuffix field to improve code readability and maintainability.
@Getter private final String consumerGroupSuffix; // Suffix for consumer group configuration
95-96: Configuration properties are not defined.

The configuration properties entityChangeEvents.enabled and entityChangeEvents.consumerGroupSuffix are not defined in any configuration files within the codebase. This could lead to runtime issues if these properties are expected to be set externally. Please ensure they are correctly defined in the appropriate configuration files.

Analysis chain

Validate configuration properties.

Ensure that the configuration properties entityChangeEvents.enabled and entityChangeEvents.consumerGroupSuffix are correctly defined in the configuration files.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the configuration properties are defined.

# Test: Search for the properties in configuration files.
# Expect: Both properties should be present.
rg 'entityChangeEvents\.enabled|entityChangeEvents\.consumerGroupSuffix' --glob "**/resources/**"
Length of output: 99

Script:
#!/bin/bash
# Description: Verify that the configuration properties are defined anywhere in the codebase.

# Test: Search for the properties in all files.
# Expect: Both properties should be present somewhere in the codebase.
rg 'entityChangeEvents\.enabled|entityChangeEvents\.consumerGroupSuffix'
Length of output: 482
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/incident/IncidentsSummaryHook.java (2)
62-67: Add comments for renamed fields.

Adding comments for the renamed fields incidentService, isEnabled, and maxIncidentHistory can improve code readability and help future maintainers understand their purpose.
private final IncidentService incidentService; // Service for handling incidents
private final boolean isEnabled; // Flag to enable or disable the hook
private final int maxIncidentHistory; // Maximum number of incidents to keep in history
74-74: Configuration property incidents.hook.consumerGroupSuffix is not defined.

The configuration property incidents.hook.consumerGroupSuffix is used in IncidentsSummaryHook.java but is not defined in any configuration files within the repository. Please ensure it is added to the appropriate configuration file to avoid potential runtime issues.

Analysis chain

Ensure configuration property is defined.

Verify that the configuration property incidents.hook.consumerGroupSuffix is correctly defined in the configuration files.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the configuration property is defined.

# Test: Search for the property in configuration files.
# Expect: Property should be present.
rg 'incidents\.hook\.consumerGroupSuffix' --glob "**/resources/**"
Length of output: 68

Script:
#!/bin/bash
# Description: Search for the configuration property across the entire repository.

# Test: Search for the property in all files.
# Expect: Property should be present somewhere in the repository.
rg 'incidents\.hook\.consumerGroupSuffix'
Length of output: 245
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/siblings/SiblingAssociationHook.java (1)
74-76: Add comments for renamed fields.

Adding comments for the renamed field isEnabled and the new field consumerGroupSuffix can improve code readability and help future maintainers understand their purpose.
private final boolean isEnabled; // Flag to enable or disable the hook
@Getter private final String consumerGroupSuffix; // Suffix for consumer group configuration

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 78336c9 and f1187c0.

Files selected for processing (17)

metadata-jobs/mae-consumer-job/src/main/java/com/linkedin/metadata/kafka/MaeConsumerApplication.java (2 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListener.java (1 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java (1 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/MetadataChangeLogHook.java (1 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/UpdateIndicesHook.java (3 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/event/EntityChangeEventGeneratorHook.java (5 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/form/FormAssignmentHook.java (5 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/incident/IncidentsSummaryHook.java (9 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/ingestion/IngestionSchedulerHook.java (4 hunks)
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/siblings/SiblingAssociationHook.java (3 hunks)
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLGMSSpringTest.java (2 hunks)
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLMAESpringTest.java (2 hunks)
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLSpringCommonTestConfiguration.java (1 hunks)
metadata-jobs/pe-consumer/src/main/java/com/datahub/event/PlatformEventProcessor.java (3 hunks)
metadata-service/configuration/src/main/resources/application.yaml (3 hunks)
metadata-service/factories/src/main/java/com/linkedin/gms/factory/kafka/KafkaEventConsumerFactory.java (1 hunks)
metadata-service/war/src/main/java/com/linkedin/gms/CommonApplicationConfig.java (1 hunks)

Files skipped from review due to trivial changes (1)

metadata-service/factories/src/main/java/com/linkedin/gms/factory/kafka/KafkaEventConsumerFactory.java

Additional comments not posted (36)

metadata-service/war/src/main/java/com/linkedin/gms/CommonApplicationConfig.java (1)
41-43: Verify the impact of new packages in @ComponentScan.

The addition of "com.linkedin.gms.factory.change", "com.datahub.event.hook", and "com.linkedin.gms.factory.notifications" to the @ComponentScan annotation expands the component scanning range. Ensure that these packages do not introduce any unintended beans or configurations that could affect the application context.

Verification successful

Impact of New Packages in @ComponentScan Verified

The BusinessAttributeUpdateHook class in the com.datahub.event.hook package is annotated with @Component, indicating it is a Spring bean. This confirms that the addition of this package to @ComponentScan introduces new beans into the application context. Ensure that the behavior of this component is as expected and does not introduce any unintended side effects.

Class Identified: BusinessAttributeUpdateHook in com.datahub.event.hook

Annotation: @Component
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the presence and impact of new packages in the component scan.

# Test: List all classes in the newly added packages to check for unintended beans.
fd --type f --extension java --exec rg --files-with-matches 'class ' {} \;
Length of output: 212798

Script:
#!/bin/bash
# Search for class definitions in the newly added packages to check for beans or components.
# Packages to check: com.linkedin.gms.factory.change, com.datahub.event.hook, com.linkedin.gms.factory.notifications

# List all Java files in the specified packages
fd --type f --extension java | grep -E 'com/linkedin/gms/factory/change|com/datahub/event/hook|com/linkedin/gms/factory/notifications' | xargs -I {} rg 'class ' {}
Length of output: 326

Script:
#!/bin/bash
# Search for class definitions and Spring annotations in the newly added packages to check for beans or components.
# Packages to check: com.linkedin.gms.factory.change, com.datahub.event.hook, com.linkedin.gms.factory.notifications

# Find Java files in the specified packages and search for class definitions with Spring annotations
fd --type f --extension java | grep -E 'com/linkedin/gms/factory/change|com/datahub/event/hook|com/linkedin/gms/factory/notifications' | xargs -I {} rg -A 5 '@Component|@Service|@Configuration' {}
Length of output: 445
metadata-jobs/mae-consumer-job/src/main/java/com/linkedin/metadata/kafka/MaeConsumerApplication.java (1)

36-38: Assess the effect of new packages in @ComponentScan.

The inclusion of "com.linkedin.gms.factory.change", "com.datahub.event.hook", and "com.linkedin.gms.factory.notifications" in the @ComponentScan annotation suggests a shift towards enhanced event handling. Verify that these changes align with the application's objectives and do not introduce conflicts or redundant beans.
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLGMSSpringTest.java (2)
38-39: Ensure correct bean retrieval for MCLKafkaListenerRegistrar.

The retrieval of the MCLKafkaListenerRegistrar bean from the application context is crucial for the test. Verify that the application context is correctly configured to provide this bean.

Verification successful

Bean Configuration Likely Correct via Auto-Configuration

The use of @EnableAutoConfiguration in the test files suggests that Spring Boot's auto-configuration is likely handling the MCLKafkaListenerRegistrar bean setup. This mechanism typically configures beans based on the classpath and other settings, indicating that the application context should be correctly providing this bean.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the configuration of `MCLKafkaListenerRegistrar` in the application context.

# Test: Search for the bean definition in the codebase. Expect: Correct configuration of the registrar bean.
rg --type java 'MCLKafkaListenerRegistrar'
Length of output: 1175

Script:
#!/bin/bash
# Description: Search for Spring configuration classes or XML files that define the MCLKafkaListenerRegistrar bean.

# Test: Look for @Bean annotation or XML bean definitions for MCLKafkaListenerRegistrar.
rg '@Bean.*MCLKafkaListenerRegistrar|<bean.*MCLKafkaListenerRegistrar' --type java --type xml
Length of output: 95

Script:
#!/bin/bash
# Description: Search for component scanning configurations that might include MCLKafkaListenerRegistrar as a bean.

# Test: Look for @ComponentScan or similar annotations that might automatically configure MCLKafkaListenerRegistrar.
rg '@ComponentScan|@SpringBootApplication|@EnableAutoConfiguration' --type java
Length of output: 3894
41-54: Verify hook assertions for MCLKafkaListenerRegistrar.

The assertions have been updated to reflect the new source of hooks. Ensure that these assertions are accurate and align with the expected behavior of the MCLKafkaListenerRegistrar.

Verification successful

Hook Assertions Verified for MCLKafkaListenerRegistrar.

The getMetadataChangeLogHooks method in MCLKafkaListenerRegistrar correctly filters and sorts hooks, aligning with the assertions in the test file. The assertions are valid and reflect the expected behavior of the registrar.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the `MCLKafkaListenerRegistrar` provides the expected hooks.

# Test: Search for the implementation details of `getMetadataChangeLogHooks`. Expect: Correct hook instances.
ast-grep --lang java --pattern $'class MCLKafkaListenerRegistrar {
  $$$
  List<MetadataChangeLogHook> getMetadataChangeLogHooks() {
    $$$
  }
  $$$  
}'
Length of output: 156

Script:
#!/bin/bash
# Check if the MCLKafkaListenerRegistrar class exists in the codebase
ast-grep --lang java --pattern 'class MCLKafkaListenerRegistrar { $$$ }'

# If the class exists, check for the getMetadataChangeLogHooks method
ast-grep --lang java --pattern 'List<MetadataChangeLogHook> getMetadataChangeLogHooks() { $$$ }'
Length of output: 173

Script:
#!/bin/bash
# Search for the MCLKafkaListenerRegistrar class in the entire codebase
rg 'class MCLKafkaListenerRegistrar' -A 10 --glob '*.java'

# If the class is found, list all methods within it
ast-grep --lang java --pattern 'class MCLKafkaListenerRegistrar { $$$ }' --json | jq '.captures[] | .text' | xargs -I {} ast-grep --lang java --pattern '$_() { $$$ }' {}
Length of output: 1783

Script:
#!/bin/bash
# Search for all method definitions in the MCLKafkaListenerRegistrar class file
rg '^[ \t]*public .* MCLKafkaListenerRegistrar' -A 50 --glob 'metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java'
Length of output: 7190
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLMAESpringTest.java (2)

36-37: Ensure correct bean retrieval for MCLKafkaListenerRegistrar.

The retrieval of the MCLKafkaListenerRegistrar bean from the application context is crucial for the test. Verify that the application context is correctly configured to provide this bean.

39-52: Verify hook assertions for MCLKafkaListenerRegistrar.

The assertions have been updated to reflect the new source of hooks. Ensure that these assertions are accurate and align with the expected behavior of the MCLKafkaListenerRegistrar.
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/UpdateIndicesHook.java (3)
55-60: Ensure testability with @VisibleForTesting constructor.

The constructor annotated with @VisibleForTesting facilitates testing by omitting the consumerGroupSuffix. Ensure that this constructor is used appropriately in test cases.

Verification successful

Testability confirmed for @VisibleForTesting constructor.

The @VisibleForTesting constructor of UpdateIndicesHook is appropriately used in the UpdateIndicesHookTest.java file, ensuring its intended purpose for testing. No further action is needed.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of the `@VisibleForTesting` constructor in test cases.

# Test: Search for the usage of the `@VisibleForTesting` constructor. Expect: Correct usage in test files.
rg --type java 'new UpdateIndicesHook\('
Length of output: 588
40-40: Ensure proper usage of @Getter for consumerGroupSuffix.

The @Getter annotation from Lombok is used to generate a getter for consumerGroupSuffix. Ensure that this field is accessed correctly throughout the codebase.

Verification successful

Proper usage of @Getter for consumerGroupSuffix verified.

The @Getter annotation for consumerGroupSuffix in UpdateIndicesHook is correctly used. This field is accessed via the getConsumerGroupSuffix method, as required by the MetadataChangeLogHook interface, and is used appropriately in the codebase.

UpdateIndicesHook implements MetadataChangeLogHook, ensuring the method is correctly defined and used.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of `consumerGroupSuffix` getter in the codebase.

# Test: Search for the usage of `getConsumerGroupSuffix`. Expect: Correct usage of the getter method.
rg --type java 'getConsumerGroupSuffix'
Length of output: 367

Script:
#!/bin/bash
# Check the context of `getConsumerGroupSuffix` usage in MCLKafkaListenerRegistrar.java
rg --type java -A 5 -B 5 'getConsumerGroupSuffix' metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java

# Check the context of `getConsumerGroupSuffix` usage in MetadataChangeLogHook.java
rg --type java -A 5 -B 5 'getConsumerGroupSuffix' metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/MetadataChangeLogHook.java
Length of output: 955

Script:
#!/bin/bash
# Check if UpdateIndicesHook implements or extends MetadataChangeLogHook or any related interface/abstract class
rg --type java 'class UpdateIndicesHook' -A 5 metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/UpdateIndicesHook.java
Length of output: 401
42-53: Verify constructor parameter integration for consumerGroupSuffix.

The constructor now includes a consumerGroupSuffix parameter. Ensure that this parameter is correctly passed and utilized within the class.
metadata-jobs/mae-consumer/src/test/java/com/linkedin/metadata/kafka/hook/spring/MCLSpringCommonTestConfiguration.java (1)
37-43: Review the impact of modified component scanning.

The changes to the @ComponentScan annotation have expanded the scope of component scanning to include additional packages. Ensure that the newly included packages contain the necessary Spring-managed beans and that there are no conflicts or unintended dependencies introduced.

Verification successful

No impact from modified component scanning due to absence of files.

The specified directories for the expanded component scanning do not contain any Java files. Therefore, the modification to the @ComponentScan annotation does not introduce any new Spring-managed beans or dependencies.

Directories checked:

metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/dao/producer

metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/change

metadata-jobs/mae-consumer/src/main/java/com/datahub/event/hook

metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/notifications
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the presence of Spring-managed beans in the newly included packages.

# Test: Search for @Component, @Service, @Repository, and @Configuration annotations in the new packages.
# Expect: Relevant Spring-managed beans should be present.
rg --type java '@Component|@Service|@Repository|@Configuration' -g 'metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/dao/producer/**' -g 'metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/change/**' -g 'metadata-jobs/mae-consumer/src/main/java/com/datahub/event/hook/**' -g 'metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/notifications/**'
Length of output: 539

Script:
#!/bin/bash
# List all Java files in the specified directories to verify their existence.
fd --type f --extension java metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/dao/producer
fd --type f --extension java metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/change
fd --type f --extension java metadata-jobs/mae-consumer/src/main/java/com/datahub/event/hook
fd --type f --extension java metadata-jobs/mae-consumer/src/main/java/com/linkedin/gms/factory/notifications
Length of output: 416
metadata-jobs/pe-consumer/src/main/java/com/datahub/event/PlatformEventProcessor.java (1)

48-52: Logging enhancement for platform hooks initialization.

The added logging statement provides improved visibility into which platform hooks are enabled during the initialization of PlatformEventProcessor. This can aid in debugging and operational monitoring.
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListener.java (4)
36-39: Logging initialization of MCL hooks.

The logging statement in the constructor provides insight into the enabled MCL hooks for a given consumer group, which is useful for debugging and operational transparency.

42-54: Comprehensive logging for message consumption.

The detailed logging in the consume method provides valuable information about the Kafka message being processed, including consumer group, key, topic, partition, offset, and timestamp. This is beneficial for debugging and monitoring message flow.

57-67: Error handling during message deserialization.

The error handling mechanism for message deserialization is appropriate, as it logs the error and increments a failure counter. However, consider whether additional actions, such as alerting or retry mechanisms, are needed based on the criticality of the messages.

77-94: Error handling for hook invocation.

The error handling for hook invocation is designed to skip failed hooks and continue processing. This approach ensures that a single failure does not block the processing of other hooks, but it may lead to data inconsistency if not handled properly. Consider whether additional logging or alerting is required for failed hooks.

Verification successful

Consider Additional Alerting for Hook Invocation Failures

The current implementation logs errors for failed hook invocations using log.error. However, if these failures are critical, you might want to implement additional alerting mechanisms to ensure timely awareness and response to such issues.

Location to Review: metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListener.java (Lines 77-94)
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify if there are additional logging or alerting mechanisms for hook invocation failures.

# Test: Search for logging or alerting logic related to hook invocation failures.
# Expect: Mechanisms should be in place if critical.
rg --type java 'log.error|alert' -A 5
Length of output: 396810
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java (2)
1-32: Ensure Proper Configuration and Initialization.

The class MCLKafkaListenerRegistrar is responsible for setting up Kafka listeners for Metadata Change Log hooks. Ensure that all configurations, such as consumerGroupBase, mclVersionedTopicName, and mclTimeseriesTopicName, are correctly set in the application properties to avoid runtime issues.

92-111: Validate Kafka Listener Endpoint Configuration.

The createListenerEndpoint method sets up the Kafka listener endpoint. Ensure that the method reference MCLKafkaListener.class.getMethod("consume", ConsumerRecord.class) is correct and that the MCLKafkaListener class implements the consume method with the expected signature.

Verification successful

Kafka Listener Endpoint Configuration Verified

The consume method in the MCLKafkaListener class exists with the correct signature. The method reference in createListenerEndpoint is valid and correctly configured.

Method: void consume(ConsumerRecord<String, GenericRecord> consumerRecord)
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the `consume` method exists in the `MCLKafkaListener` class with the correct signature.

# Search for the method definition in the MCLKafkaListener class.
ast-grep --lang java --pattern $'class MCLKafkaListener {
  $$$
  void consume(ConsumerRecord<String, GenericRecord> record) {
    $$$
  }
  $$$
}'
Length of output: 148

Script:
#!/bin/bash
# Find the file containing the MCLKafkaListener class and list its methods to verify the consume method.

# Locate the file containing the MCLKafkaListener class.
fd 'MCLKafkaListener.java' --exec cat {} +

# List all methods in the MCLKafkaListener class to verify the consume method.
ast-grep --lang java --pattern $'class MCLKafkaListener {
  $$$
  $_() {
    $$$
  }
  $$$
}'
Length of output: 4334
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/form/FormAssignmentHook.java (4)

58-62: Field Renaming for Consistency.

The fields formService and isEnabled have been renamed for consistency with Java naming conventions. This improves code readability and maintainability.

74-76: Testing Constructor Provides Flexibility.

The addition of a testing constructor that defaults consumerGroupSuffix to an empty string is a good practice. It facilitates easier unit testing by reducing the dependencies required for instantiation.

87-87: Simplified isEnabled Method.

The isEnabled method now directly returns the isEnabled field, which simplifies the logic and improves readability.

67-72: Verify Configuration for Consumer Group Suffix.

The constructor now requires a consumerGroupSuffix parameter. Ensure that the configuration property ${forms.hook.consumerGroupSuffix} is correctly set in the application properties to avoid runtime issues.

metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/ingestion/IngestionSchedulerHook.java (4)

33-36: Field Renaming for Consistency.

The fields scheduler and isEnabled have been renamed for consistency with Java naming conventions. This improves code readability and maintainability.

48-51: Testing Constructor Provides Flexibility.

The addition of a testing constructor that defaults consumerGroupSuffix to an empty string is a good practice. It facilitates easier unit testing by reducing the dependencies required for instantiation.

56-56: Simplified isEnabled Method.

The isEnabled method now directly returns the isEnabled field, which simplifies the logic and improves readability.

41-46: Verify Configuration for Consumer Group Suffix.

The constructor now requires a consumerGroupSuffix parameter. Ensure that the configuration property ${ingestionScheduler.consumerGroupSuffix} is correctly set in the application properties to avoid runtime issues.
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/event/EntityChangeEventGeneratorHook.java (2)
116-116: Confirm logical consistency of isEnabled().

The isEnabled() method now returns the isEnabled variable. Ensure that this change maintains the intended logic throughout the codebase.

Verification successful

Logical Consistency Verified for isEnabled() Method

The isEnabled() method in EntityChangeEventGeneratorHook.java correctly returns the isEnabled variable, which is initialized based on an external configuration with a default value of true. This ensures logical consistency with its intended purpose. No further issues were found.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify logical consistency of `isEnabled()` usage.

# Test: Search for all usages of `isEnabled()` method.
# Expect: Consistent logic with previous implementation.
rg --type java 'isEnabled'
Length of output: 5537

Script:
#!/bin/bash
# Find the definition and usage of the `isEnabled` variable within the EntityChangeEventGeneratorHook.java file.
fd 'EntityChangeEventGeneratorHook.java' -x rg 'isEnabled' {}
Length of output: 395
3-3: Ensure proper usage of @VisibleForTesting.

The @VisibleForTesting annotation is used to indicate that the constructor is primarily for testing purposes. Ensure that this constructor is only used in test scenarios to maintain code clarity and prevent misuse.
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/incident/IncidentsSummaryHook.java (1)
97-97: Confirm logical consistency of isEnabled().

The isEnabled() method now returns the isEnabled variable. Ensure that this change maintains the intended logic throughout the codebase.

Verification successful

Logical consistency of isEnabled() is confirmed. The isEnabled() method in IncidentsSummaryHook correctly returns the isEnabled field, which is initialized via configuration and used to control the execution of the invoke method. This maintains the intended logic.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify logical consistency of `isEnabled()` usage.

# Test: Search for all usages of `isEnabled()` method.
# Expect: Consistent logic with previous implementation.
rg --type java 'isEnabled'
Length of output: 5537

Script:
#!/bin/bash
# Search for the definition and usage of `isEnabled` within the `IncidentsSummaryHook` class.
ast-grep --lang java --pattern 'class IncidentsSummaryHook {
  $$$
  boolean isEnabled;
  $$$
}' 

# Search for the `isEnabled()` method implementation in `IncidentsSummaryHook`.
ast-grep --lang java --pattern 'class IncidentsSummaryHook {
  $$$
  public boolean isEnabled() {
    return $_;
  }
  $$$
}'
Length of output: 226

Script:
#!/bin/bash
# Search for the class definition of `IncidentsSummaryHook` to understand its structure.
rg --type java 'class IncidentsSummaryHook' -A 20

# Search for the `isEnabled` field and method within `IncidentsSummaryHook` more broadly.
rg --type java 'isEnabled' metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/incident/IncidentsSummaryHook.java -A 5 -B 5
Length of output: 8808
metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/siblings/SiblingAssociationHook.java (2)
82-83: Ensure configuration property is defined.

Verify that the configuration property siblings.consumerGroupSuffix is correctly defined in the configuration files.

114-114: Confirm logical consistency of isEnabled().

The isEnabled() method now returns the isEnabled variable. Ensure that this change maintains the intended logic throughout the codebase.

Verification successful

Logical consistency of isEnabled() is maintained in SiblingAssociationHook. The isEnabled() method returns the isEnabled variable, and its usage in the invoke method is consistent with its intended purpose of conditionally executing logic based on the enabled state. No issues found.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify logical consistency of `isEnabled()` usage.

# Test: Search for all usages of `isEnabled()` method.
# Expect: Consistent logic with previous implementation.
rg --type java 'isEnabled'
Length of output: 5537

Script:
#!/bin/bash
# Extract the implementation of the `isEnabled()` method in SiblingAssociationHook
ast-grep --lang java --pattern $'class SiblingAssociationHook {
  $$$
  public boolean isEnabled() {
    $$$
  }
  $$$
}' 
Length of output: 122

Script:
#!/bin/bash
# Locate and extract the `isEnabled()` method implementation in SiblingAssociationHook
rg -A 10 'public boolean isEnabled' metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/hook/siblings/SiblingAssociationHook.java
Length of output: 442
metadata-service/configuration/src/main/resources/application.yaml (6)

299-299: LGTM!

The addition of consumerGroupSuffix to the siblings section enhances configurability for Kafka consumer groups.

302-302: LGTM!

The addition of consumerGroupSuffix to the updateIndices section enhances configurability for Kafka consumer groups.

305-305: LGTM!

The addition of consumerGroupSuffix to the ingestionScheduler section enhances configurability for Kafka consumer groups.

306-310: LGTM!

The new properties in the incidents.hook section provide enhanced configurability for incident management and Kafka consumer groups.

387-387: LGTM!

The addition of consumerGroupSuffix to the entityChangeEvents section enhances configurability for Kafka consumer groups.

472-472: LGTM!

The addition of consumerGroupSuffix to the forms.hook section enhances configurability for Kafka consumer groups.

coderabbitai · 2024-08-09T15:16:40Z

...a-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListenerRegistrar.java

+  public void afterPropertiesSet() {
+    Map<String, List<MetadataChangeLogHook>> hookGroups =
+        getMetadataChangeLogHooks().stream()
+            .collect(Collectors.groupingBy(MetadataChangeLogHook::getConsumerGroupSuffix));
+
+    log.info(
+        "MetadataChangeLogProcessor Consumer Groups: {}",
+        hookGroups.keySet().stream().map(this::buildConsumerGroupName).collect(Collectors.toSet()));
+
+    hookGroups.forEach(
+        (key, hooks) -> {
+          KafkaListenerEndpoint kafkaListenerEndpoint =
+              createListenerEndpoint(
+                  buildConsumerGroupName(key),
+                  List.of(mclVersionedTopicName, mclTimeseriesTopicName),
+                  hooks);
+          registerMCLKafkaListener(kafkaListenerEndpoint, true);
+        });
+  }


Consider Error Handling for Hook Registration.

The afterPropertiesSet method registers Kafka listeners for each consumer group. Consider adding error handling to manage potential issues during listener registration, such as invalid configurations or connectivity problems.

try { registerMCLKafkaListener(kafkaListenerEndpoint, true); } catch (Exception e) { log.error("Failed to register Kafka listener for consumer group: {}", key, e); }

coderabbitai · 2024-08-09T15:16:40Z

metadata-jobs/mae-consumer/src/main/java/com/linkedin/metadata/kafka/MCLKafkaListener.java

+              "Failed to execute MCL hook with name {}", hook.getClass().getCanonicalName(), e);
+        }
+      }
+      // TODO: Manually commit kafka offsets after full processing.


Consider implementing manual offset commits.

The TODO comment suggests implementing manual offset commits after full processing. This can ensure that messages are only marked as processed after successful execution of all hooks, reducing the risk of message loss.

Would you like assistance in implementing manual offset commits?

coderabbitai · 2024-08-09T15:16:40Z

...mer/src/main/java/com/linkedin/metadata/kafka/hook/event/EntityChangeEventGeneratorHook.java

+  @VisibleForTesting
+  public EntityChangeEventGeneratorHook(
+      @Nonnull OperationContext systemOperationContext,
+      @Nonnull final EntityChangeEventGeneratorRegistry entityChangeEventGeneratorRegistry,
+      @Nonnull final SystemEntityClient entityClient,
+      @Nonnull Boolean isEnabled) {
+    this(systemOperationContext, entityChangeEventGeneratorRegistry, entityClient, isEnabled, "");


Consider using builder pattern for constructor.

With multiple parameters, especially with optional ones like consumerGroupSuffix, consider using a builder pattern to enhance readability and flexibility.

public static class Builder { private OperationContext systemOperationContext; private EntityChangeEventGeneratorRegistry entityChangeEventGeneratorRegistry; private SystemEntityClient entityClient; private Boolean isEnabled; private String consumerGroupSuffix = ""; public Builder systemOperationContext(OperationContext systemOperationContext) { this.systemOperationContext = systemOperationContext; return this; } public Builder entityChangeEventGeneratorRegistry(EntityChangeEventGeneratorRegistry registry) { this.entityChangeEventGeneratorRegistry = registry; return this; } public Builder entityClient(SystemEntityClient client) { this.entityClient = client; return this; } public Builder isEnabled(Boolean isEnabled) { this.isEnabled = isEnabled; return this; } public Builder consumerGroupSuffix(String suffix) { this.consumerGroupSuffix = suffix; return this; } public EntityChangeEventGeneratorHook build() { return new EntityChangeEventGeneratorHook( systemOperationContext, entityChangeEventGeneratorRegistry, entityClient, isEnabled, consumerGroupSuffix); } }

coderabbitai · 2024-08-09T15:16:40Z

...e-consumer/src/main/java/com/linkedin/metadata/kafka/hook/incident/IncidentsSummaryHook.java

+  @VisibleForTesting
+  public IncidentsSummaryHook(
+      @Nonnull final IncidentService incidentService,
+      @Nonnull Boolean isEnabled,
+      @Nonnull Integer maxIncidentHistory) {
+    this(incidentService, isEnabled, maxIncidentHistory, "");


Consider using builder pattern for constructor.

With multiple parameters, especially with optional ones like consumerGroupSuffix, consider using a builder pattern to enhance readability and flexibility.

public static class Builder { private IncidentService incidentService; private Boolean isEnabled; private Integer maxIncidentHistory; private String consumerGroupSuffix = ""; public Builder incidentService(IncidentService incidentService) { this.incidentService = incidentService; return this; } public Builder isEnabled(Boolean isEnabled) { this.isEnabled = isEnabled; return this; } public Builder maxIncidentHistory(Integer maxIncidentHistory) { this.maxIncidentHistory = maxIncidentHistory; return this; } public Builder consumerGroupSuffix(String suffix) { this.consumerGroupSuffix = suffix; return this; } public IncidentsSummaryHook build() { return new IncidentsSummaryHook(incidentService, isEnabled, maxIncidentHistory, consumerGroupSuffix); } }

coderabbitai · 2024-08-09T15:16:40Z

...consumer/src/main/java/com/linkedin/metadata/kafka/hook/siblings/SiblingAssociationHook.java

+  @VisibleForTesting
+  public SiblingAssociationHook(
+      @Nonnull final SystemEntityClient systemEntityClient,
+      @Nonnull final EntitySearchService searchService,
+      @Nonnull Boolean isEnabled) {
+    this(systemEntityClient, searchService, isEnabled, "");


Consider using builder pattern for constructor.

With multiple parameters, especially with optional ones like consumerGroupSuffix, consider using a builder pattern to enhance readability and flexibility.

public static class Builder { private SystemEntityClient systemEntityClient; private EntitySearchService searchService; private Boolean isEnabled; private String consumerGroupSuffix = ""; public Builder systemEntityClient(SystemEntityClient systemEntityClient) { this.systemEntityClient = systemEntityClient; return this; } public Builder searchService(EntitySearchService searchService) { this.searchService = searchService; return this; } public Builder isEnabled(Boolean isEnabled) { this.isEnabled = isEnabled; return this; } public Builder consumerGroupSuffix(String suffix) { this.consumerGroupSuffix = suffix; return this; } public SiblingAssociationHook build() { return new SiblingAssociationHook(systemEntityClient, searchService, isEnabled, consumerGroupSuffix); } }

RyanHolstien · 2024-08-09T15:39:57Z

I do think a docs update for this would be helpful, but lgtm 😄

coderabbitai

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (3)

docs/how/kafka-config.md (3)

119-122: Clarify the default behavior of MCL hooks.

The section explains the default behavior but could benefit from a clearer introduction to the new feature of separating hooks into different consumer groups.

Consider rephrasing to emphasize the new capability right at the beginning.

131-138: Ensure consistency in descriptions.

The descriptions in the table vary in detail. Ensure each entry provides sufficient context for users to understand its purpose.

Consider adding more context to some descriptions, such as explaining what "primary processing hook" means in this context.

Line range hint 140-151: Clarify Docker and Helm configuration instructions.

The instructions for applying configurations using Docker and Helm are clear but could benefit from additional examples or links to further documentation.

Consider adding links to detailed guides on setting environment variables in Docker and Helm.

Tools

LanguageTool

[uncategorized] ~117-~117: Loose punctuation mark.
Context: ...ae-consumer - KAFKA_CONSUMER_GROUP_ID: The name of the kafka consumer's group ...

(UNLIKELY_OPENING_PUNCTUATION)

[uncategorized] ~124-~124: Possible missing comma found.
Context: ...s could alsp be separated into separate groups which allows for controlling paralleliz...

(AI_HYDRA_LEO_MISSING_COMMA)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f1187c0 and 7122214.

Files selected for processing (1)

docs/how/kafka-config.md (1 hunks)

Additional context used

LanguageTool

docs/how/kafka-config.md

[uncategorized] ~124-~124: Possible missing comma found.
Context: ...s could alsp be separated into separate groups which allows for controlling paralleliz...

(AI_HYDRA_LEO_MISSING_COMMA)

coderabbitai · 2024-08-09T16:02:12Z

docs/how/kafka-config.md

+The various MCL Hooks could alsp be separated into separate groups which allows for controlling parallelization and 
+prioritization of the hooks.


Correct the typo and improve clarity.

There's a typo in "alsp" which should be "also." Additionally, a comma is needed for clarity.

- The various MCL Hooks could alsp be separated into separate groups which allows for controlling parallelization and + The various MCL Hooks could also be separated into separate groups, which allows for controlling parallelization and

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

The various MCL Hooks could alsp be separated into separate groups which allows for controlling parallelization and

prioritization of the hooks.

The various MCL Hooks could also be separated into separate groups, which allows for controlling parallelization and

prioritization of the hooks.

Tools

LanguageTool

[uncategorized] ~124-~124: Possible missing comma found.
Context: ...s could alsp be separated into separate groups which allows for controlling paralleliz...

(AI_HYDRA_LEO_MISSING_COMMA)

* feat(forms) Handle deleting forms references when hard deleting forms (datahub-project#10820) * refactor(ui): Misc improvements to the setup ingestion flow (ingest uplift 1/2) (datahub-project#10764) Co-authored-by: John Joyce <john@Johns-MBP.lan> Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> * fix(ingestion/airflow-plugin): pipeline tasks discoverable in search (datahub-project#10819) * feat(ingest/transformer): tags to terms transformer (datahub-project#10758) Co-authored-by: Aseem Bansal <asmbansal2@gmail.com> * fix(ingestion/unity-catalog): fixed issue with profiling with GE turned on (datahub-project#10752) Co-authored-by: Aseem Bansal <asmbansal2@gmail.com> * feat(forms) Add java SDK for form entity PATCH + CRUD examples (datahub-project#10822) * feat(SDK) Add java SDK for structuredProperty entity PATCH + CRUD examples (datahub-project#10823) * feat(SDK) Add StructuredPropertyPatchBuilder in python sdk and provide sample CRUD files (datahub-project#10824) * feat(forms) Add CRUD endpoints to GraphQL for Form entities (datahub-project#10825) * add flag for includeSoftDeleted in scroll entities API (datahub-project#10831) * feat(deprecation) Return actor entity with deprecation aspect (datahub-project#10832) * feat(structuredProperties) Add CRUD graphql APIs for structured property entities (datahub-project#10826) * add scroll parameters to openapi v3 spec (datahub-project#10833) * fix(ingest): correct profile_day_of_week implementation (datahub-project#10818) * feat(ingest/glue): allow ingestion of empty databases from Glue (datahub-project#10666) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * feat(cli): add more details to get cli (datahub-project#10815) * fix(ingestion/glue): ensure date formatting works on all platforms for aws glue (datahub-project#10836) * fix(ingestion): fix datajob patcher (datahub-project#10827) * fix(smoke-test): add suffix in temp file creation (datahub-project#10841) * feat(ingest/glue): add helper method to permit user or group ownership (datahub-project#10784) * feat(): Show data platform instances in policy modal if they are set on the policy (datahub-project#10645) Co-authored-by: Hendrik Richert <hendrik.richert@swisscom.com> * docs(patch): add patch documentation for how implementation works (datahub-project#10010) Co-authored-by: John Joyce <john@acryl.io> * fix(jar): add missing custom-plugin-jar task (datahub-project#10847) * fix(): also check exceptions/stack trace when filtering log messages (datahub-project#10391) Co-authored-by: John Joyce <john@acryl.io> * docs(): Update posts.md (datahub-project#9893) Co-authored-by: Hyejin Yoon <0327jane@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * chore(ingest): update acryl-datahub-classify version (datahub-project#10844) * refactor(ingest): Refactor structured logging to support infos, warnings, and failures structured reporting to UI (datahub-project#10828) Co-authored-by: John Joyce <john@Johns-MBP.lan> Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(restli): log aspect-not-found as a warning rather than as an error (datahub-project#10834) * fix(ingest/nifi): remove duplicate upstream jobs (datahub-project#10849) * fix(smoke-test): test access to create/revoke personal access tokens (datahub-project#10848) * fix(smoke-test): missing test for move domain (datahub-project#10837) * ci: update usernames to not considered for community (datahub-project#10851) * env: change defaults for data contract visibility (datahub-project#10854) * fix(ingest/tableau): quote special characters in external URL (datahub-project#10842) * fix(smoke-test): fix flakiness of auto complete test * ci(ingest): pin dask dependency for feast (datahub-project#10865) * fix(ingestion/lookml): liquid template resolution and view-to-view cll (datahub-project#10542) * feat(ingest/audit): add client id and version in system metadata props (datahub-project#10829) * chore(ingest): Mypy 1.10.1 pin (datahub-project#10867) * docs: use acryl-datahub-actions as expected python package to install (datahub-project#10852) * docs: add new js snippet (datahub-project#10846) * refactor(ingestion): remove company domain for security reason (datahub-project#10839) * fix(ingestion/spark): Platform instance and column level lineage fix (datahub-project#10843) Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * feat(ingestion/tableau): optionally ingest multiple sites and create site containers (datahub-project#10498) Co-authored-by: Yanik Häni <Yanik.Haeni1@swisscom.com> * fix(ingestion/looker): Add sqlglot dependency and remove unused sqlparser (datahub-project#10874) * fix(manage-tokens): fix manage access token policy (datahub-project#10853) * Batch get entity endpoints (datahub-project#10880) * feat(system): support conditional write semantics (datahub-project#10868) * fix(build): upgrade vercel builds to Node 20.x (datahub-project#10890) * feat(ingest/lookml): shallow clone repos (datahub-project#10888) * fix(ingest/looker): add missing dependency (datahub-project#10876) * fix(ingest): only populate audit stamps where accurate (datahub-project#10604) * fix(ingest/dbt): always encode tag urns (datahub-project#10799) * fix(ingest/redshift): handle multiline alter table commands (datahub-project#10727) * fix(ingestion/looker): column name missing in explore (datahub-project#10892) * fix(lineage) Fix lineage source/dest filtering with explored per hop limit (datahub-project#10879) * feat(conditional-writes): misc updates and fixes (datahub-project#10901) * feat(ci): update outdated action (datahub-project#10899) * feat(rest-emitter): adding async flag to rest emitter (datahub-project#10902) Co-authored-by: Gabe Lyons <gabe.lyons@acryl.io> * feat(ingest): add snowflake-queries source (datahub-project#10835) * fix(ingest): improve `auto_materialize_referenced_tags_terms` error handling (datahub-project#10906) * docs: add new company to adoption list (datahub-project#10909) * refactor(redshift): Improve redshift error handling with new structured reporting system (datahub-project#10870) Co-authored-by: John Joyce <john@Johns-MBP.lan> Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * feat(ui) Finalize support for all entity types on forms (datahub-project#10915) * Index ExecutionRequestResults status field (datahub-project#10811) * feat(ingest): grafana connector (datahub-project#10891) Co-authored-by: Shirshanka Das <shirshanka@apache.org> Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(gms) Add Form entity type to EntityTypeMapper (datahub-project#10916) * feat(dataset): add support for external url in Dataset (datahub-project#10877) * docs(saas-overview) added missing features to observe section (datahub-project#10913) Co-authored-by: John Joyce <john@acryl.io> * fix(ingest/spark): Fixing Micrometer warning (datahub-project#10882) * fix(structured properties): allow application of structured properties without schema file (datahub-project#10918) * fix(data-contracts-web) handle other schedule types (datahub-project#10919) * fix(ingestion/tableau): human-readable message for PERMISSIONS_MODE_SWITCHED error (datahub-project#10866) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * Add feature flag for view defintions (datahub-project#10914) Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io> * feat(ingest/BigQuery): refactor+parallelize dataset metadata extraction (datahub-project#10884) * fix(airflow): add error handling around render_template() (datahub-project#10907) * feat(ingestion/sqlglot): add optional `default_dialect` parameter to sqlglot lineage (datahub-project#10830) * feat(mcp-mutator): new mcp mutator plugin (datahub-project#10904) * fix(ingest/bigquery): changes helper function to decode unicode scape sequences (datahub-project#10845) * feat(ingest/postgres): fetch table sizes for profile (datahub-project#10864) * feat(ingest/abs): Adding azure blob storage ingestion source (datahub-project#10813) * fix(ingest/redshift): reduce severity of SQL parsing issues (datahub-project#10924) * fix(build): fix lint fix web react (datahub-project#10896) * fix(ingest/bigquery): handle quota exceeded for project.list requests (datahub-project#10912) * feat(ingest): report extractor failures more loudly (datahub-project#10908) * feat(ingest/snowflake): integrate snowflake-queries into main source (datahub-project#10905) * fix(ingest): fix docs build (datahub-project#10926) * fix(ingest/snowflake): fix test connection (datahub-project#10927) * fix(ingest/lookml): add view load failures to cache (datahub-project#10923) * docs(slack) overhauled setup instructions and screenshots (datahub-project#10922) Co-authored-by: John Joyce <john@acryl.io> * fix(airflow): Add comma parsing of owners to DataJobs (datahub-project#10903) * fix(entityservice): fix merging sideeffects (datahub-project#10937) * feat(ingest): Support System Ingestion Sources, Show and hide system ingestion sources with Command-S (datahub-project#10938) Co-authored-by: John Joyce <john@Johns-MBP.lan> * chore() Set a default lineage filtering end time on backend when a start time is present (datahub-project#10925) Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> Co-authored-by: John Joyce <john@Johns-MBP.lan> * Added relationships APIs to V3. Added these generic APIs to V3 swagger doc. (datahub-project#10939) * docs: add learning center to docs (datahub-project#10921) * doc: Update hubspot form id (datahub-project#10943) * chore(airflow): add python 3.11 w/ Airflow 2.9 to CI (datahub-project#10941) * fix(ingest/Glue): column upstream lineage between S3 and Glue (datahub-project#10895) * fix(ingest/abs): split abs utils into multiple files (datahub-project#10945) * doc(ingest/looker): fix doc for sql parsing documentation (datahub-project#10883) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(ingest/bigquery): Adding missing BigQuery types (datahub-project#10950) * fix(ingest/setup): feast and abs source setup (datahub-project#10951) * fix(connections) Harden adding /gms to connections in backend (datahub-project#10942) * feat(siblings) Add flag to prevent combining siblings in the UI (datahub-project#10952) * fix(docs): make graphql doc gen more automated (datahub-project#10953) * feat(ingest/athena): Add option for Athena partitioned profiling (datahub-project#10723) * fix(spark-lineage): default timeout for future responses (datahub-project#10947) * feat(datajob/flow): add environment filter using info aspects (datahub-project#10814) * fix(ui/ingest): correct privilege used to show tab (datahub-project#10483) Co-authored-by: Kunal-kankriya <127090035+Kunal-kankriya@users.noreply.github.com> * feat(ingest/looker): include dashboard urns in browse v2 (datahub-project#10955) * add a structured type to batchGet in OpenAPI V3 spec (datahub-project#10956) * fix(ui): scroll on the domain sidebar to show all domains (datahub-project#10966) * fix(ingest/sagemaker): resolve incorrect variable assignment for SageMaker API call (datahub-project#10965) * fix(airflow/build): Pinning mypy (datahub-project#10972) * Fixed a bug where the OpenAPI V3 spec was incorrect. The bug was introduced in datahub-project#10939. (datahub-project#10974) * fix(ingest/test): Fix for mssql integration tests (datahub-project#10978) * fix(entity-service) exist check correctly extracts status (datahub-project#10973) * fix(structuredProps) casing bug in StructuredPropertiesValidator (datahub-project#10982) * bugfix: use anyOf instead of allOf when creating references in openapi v3 spec (datahub-project#10986) * fix(ui): Remove ant less imports (datahub-project#10988) * feat(ingest/graph): Add get_results_by_filter to DataHubGraph (datahub-project#10987) * feat(ingest/cli): init does not actually support environment variables (datahub-project#10989) * fix(ingest/graph): Update get_results_by_filter graphql query (datahub-project#10991) * feat(ingest/spark): Promote beta plugin (datahub-project#10881) Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * feat(ingest): support domains in meta -> "datahub" section (datahub-project#10967) * feat(ingest): add `check server-config` command (datahub-project#10990) * feat(cli): Make consistent use of DataHubGraphClientConfig (datahub-project#10466) Deprecates get_url_and_token() in favor of a more complete option: load_graph_config() that returns a full DatahubClientConfig. This change was then propagated across previous usages of get_url_and_token so that connections to DataHub server from the client respect the full breadth of configuration specified by DatahubClientConfig. I.e: You can now specify disable_ssl_verification: true in your ~/.datahubenv file so that all cli functions to the server work when ssl certification is disabled. Fixes datahub-project#9705 * fix(ingest/s3): Fixing container creation when there is no folder in path (datahub-project#10993) * fix(ingest/looker): support platform instance for dashboards & charts (datahub-project#10771) * feat(ingest/bigquery): improve handling of information schema in sql parser (datahub-project#10985) * feat(ingest): improve `ingest deploy` command (datahub-project#10944) * fix(backend): allow excluding soft-deleted entities in relationship-queries; exclude soft-deleted members of groups (datahub-project#10920) - allow excluding soft-deleted entities in relationship-queries - exclude soft-deleted members of groups * fix(ingest/looker): downgrade missing chart type log level (datahub-project#10996) * doc(acryl-cloud): release docs for 0.3.4.x (datahub-project#10984) Co-authored-by: John Joyce <john@acryl.io> Co-authored-by: RyanHolstien <RyanHolstien@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Pedro Silva <pedro@acryl.io> * fix(protobuf/build): Fix protobuf check jar script (datahub-project#11006) * fix(ui/ingest): Support invalid cron jobs (datahub-project#10998) * fix(ingest): fix graph config loading (datahub-project#11002) Co-authored-by: Pedro Silva <pedro@acryl.io> * feat(docs): Document __DATAHUB_TO_FILE_ directive (datahub-project#10968) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(graphql/upsertIngestionSource): Validate cron schedule; parse error in CLI (datahub-project#11011) * feat(ece): support custom ownership type urns in ECE generation (datahub-project#10999) * feat(assertion-v2): changed Validation tab to Quality and created new Governance tab (datahub-project#10935) * fix(ingestion/glue): Add support for missing config options for profiling in Glue (datahub-project#10858) * feat(propagation): Add models for schema field docs, tags, terms (datahub-project#2959) (datahub-project#11016) Co-authored-by: Chris Collins <chriscollins3456@gmail.com> * docs: standardize terminology to DataHub Cloud (datahub-project#11003) * fix(ingestion/transformer): replace the externalUrl container (datahub-project#11013) * docs(slack) troubleshoot docs (datahub-project#11014) * feat(propagation): Add graphql API (datahub-project#11030) Co-authored-by: Chris Collins <chriscollins3456@gmail.com> * feat(propagation): Add models for Action feature settings (datahub-project#11029) * docs(custom properties): Remove duplicate from sidebar (datahub-project#11033) * feat(models): Introducing Dataset Partitions Aspect (datahub-project#10997) Co-authored-by: John Joyce <john@Johns-MBP.lan> Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> * feat(propagation): Add Documentation Propagation Settings (datahub-project#11038) * fix(models): chart schema fields mapping, add dataHubAction entity, t… (datahub-project#11040) * fix(ci): smoke test lint failures (datahub-project#11044) * docs: fix learning center color scheme & typo (datahub-project#11043) * feat: add cloud main page (datahub-project#11017) Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com> * feat(restore-indices): add additional step to also clear system metadata service (datahub-project#10662) Co-authored-by: John Joyce <john@acryl.io> * docs: fix typo (datahub-project#11046) * fix(lint): apply spotless (datahub-project#11050) * docs(airflow): example query to get datajobs for a dataflow (datahub-project#11034) * feat(cli): Add run-id option to put sub-command (datahub-project#11023) Adds an option to assign run-id to a given put command execution. This is useful when transformers do not exist for a given ingestion payload, we can follow up with custom metadata and assign it to an ingestion pipeline. * fix(ingest): improve sql error reporting calls (datahub-project#11025) * fix(airflow): fix CI setup (datahub-project#11031) * feat(ingest/dbt): add experimental `prefer_sql_parser_lineage` flag (datahub-project#11039) * fix(ingestion/lookml): enable stack-trace in lookml logs (datahub-project#10971) * (chore): Linting fix (datahub-project#11015) * chore(ci): update deprecated github actions (datahub-project#10977) * Fix ALB configuration example (datahub-project#10981) * chore(ingestion-base): bump base image packages (datahub-project#11053) * feat(cli): Trim report of dataHubExecutionRequestResult to max GMS size (datahub-project#11051) * fix(ingestion/lookml): emit dummy sql condition for lookml custom condition tag (datahub-project#11008) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(ingestion/powerbi): fix issue with broken report lineage (datahub-project#10910) * feat(ingest/tableau): add retry on timeout (datahub-project#10995) * change generate kafka connect properties from env (datahub-project#10545) Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com> * fix(ingest): fix oracle cronjob ingestion (datahub-project#11001) Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com> * chore(ci): revert update deprecated github actions (datahub-project#10977) (datahub-project#11062) * feat(ingest/dbt-cloud): update metadata_endpoint inference (datahub-project#11041) * build: Reduce size of datahub-frontend-react image by 50-ish% (datahub-project#10878) Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com> * fix(ci): Fix lint issue in datahub_ingestion_run_summary_provider.py (datahub-project#11063) * docs(ingest): update developing-a-transformer.md (datahub-project#11019) * feat(search-test): update search tests from datahub-project#10408 (datahub-project#11056) * feat(cli): add aspects parameter to DataHubGraph.get_entity_semityped (datahub-project#11009) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * docs(airflow): update min version for plugin v2 (datahub-project#11065) * doc(ingestion/tableau): doc update for derived permission (datahub-project#11054) Co-authored-by: Pedro Silva <pedro.cls93@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix(py): remove dep on types-pkg_resources (datahub-project#11076) * feat(ingest/mode): add option to exclude restricted (datahub-project#11081) * fix(ingest): set lastObserved in sdk when unset (datahub-project#11071) * doc(ingest): Update capabilities (datahub-project#11072) * chore(vulnerability): Log Injection (datahub-project#11090) * chore(vulnerability): Information exposure through a stack trace (datahub-project#11091) * chore(vulnerability): Comparison of narrow type with wide type in loop condition (datahub-project#11089) * chore(vulnerability): Insertion of sensitive information into log files (datahub-project#11088) * chore(vulnerability): Risky Cryptographic Algorithm (datahub-project#11059) * chore(vulnerability): Overly permissive regex range (datahub-project#11061) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * fix: update customer data (datahub-project#11075) * fix(models): fixing the datasetPartition models (datahub-project#11085) Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> * fix(ui): Adding view, forms GraphQL query, remove showing a fallback error message on unhandled GraphQL error (datahub-project#11084) Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> * feat(docs-site): hiding learn more from cloud page (datahub-project#11097) * fix(docs): Add correct usage of orFilters in search API docs (datahub-project#11082) Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com> * fix(ingest/mode): Regexp in mode name matcher didn't allow underscore (datahub-project#11098) * docs: Refactor customer stories section (datahub-project#10869) Co-authored-by: Jeff Merrick <jeff@wireform.io> * fix(release): fix full/slim suffix on tag (datahub-project#11087) * feat(config): support alternate hashing algorithm for doc id (datahub-project#10423) Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com> Co-authored-by: John Joyce <john@acryl.io> * fix(emitter): fix typo in get method of java kafka emitter (datahub-project#11007) * fix(ingest): use correct native data type in all SQLAlchemy sources by compiling data type using dialect (datahub-project#10898) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * chore: Update contributors list in PR labeler (datahub-project#11105) * feat(ingest): tweak stale entity removal messaging (datahub-project#11064) * fix(ingestion): enforce lastObserved timestamps in SystemMetadata (datahub-project#11104) * fix(ingest/powerbi): fix broken lineage between chart and dataset (datahub-project#11080) * feat(ingest/lookml): CLL support for sql set in sql_table_name attribute of lookml view (datahub-project#11069) * docs: update graphql docs on forms & structured properties (datahub-project#11100) * test(search): search openAPI v3 test (datahub-project#11049) * fix(ingest/tableau): prevent empty site content urls (datahub-project#11057) Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * feat(entity-client): implement client batch interface (datahub-project#11106) * fix(snowflake): avoid reporting warnings/info for sys tables (datahub-project#11114) * fix(ingest): downgrade column type mapping warning to info (datahub-project#11115) * feat(api): add AuditStamp to the V3 API entity/aspect response (datahub-project#11118) * fix(ingest/redshift): replace r'\n' with '\n' to avoid token error redshift serverless… (datahub-project#11111) * fix(entiy-client): handle null entityUrn case for restli (datahub-project#11122) * fix(sql-parser): prevent bad urns from alter table lineage (datahub-project#11092) * fix(ingest/bigquery): use small batch size if use_tables_list_query_v2 is set (datahub-project#11121) * fix(graphql): add missing entities to EntityTypeMapper and EntityTypeUrnMapper (datahub-project#10366) * feat(ui): Changes to allow editable dataset name (datahub-project#10608) Co-authored-by: Jay Kadambi <jayasimhan_venkatadri@optum.com> * fix: remove saxo (datahub-project#11127) * feat(mcl-processor): Update mcl processor hooks (datahub-project#11134) * fix(openapi): fix openapi v2 endpoints & v3 documentation update * Revert "fix(openapi): fix openapi v2 endpoints & v3 documentation update" This reverts commit 573c1cb. * docs(policies): updates to policies documentation (datahub-project#11073) * fix(openapi): fix openapi v2 and v3 docs update (datahub-project#11139) * feat(auth): grant type and acr values custom oidc parameters support (datahub-project#11116) * fix(mutator): mutator hook fixes (datahub-project#11140) * feat(search): support sorting on multiple fields (datahub-project#10775) * feat(ingest): various logging improvements (datahub-project#11126) * fix(ingestion/lookml): fix for sql parsing error (datahub-project#11079) Co-authored-by: Harshal Sheth <hsheth2@gmail.com> * feat(docs-site) cloud page spacing and content polishes (datahub-project#11141) * feat(ui) Enable editing structured props on fields (datahub-project#11042) * feat(tests): add md5 and last computed to testResult model (datahub-project#11117) * test(openapi): openapi regression smoke tests (datahub-project#11143) * fix(airflow): fix tox tests + update docs (datahub-project#11125) * docs: add chime to adoption stories (datahub-project#11142) * fix(ingest/databricks): Updating code to work with Databricks sdk 0.30 (datahub-project#11158) * fix(kafka-setup): add missing script to image (datahub-project#11190) * fix(config): fix hash algo config (datahub-project#11191) * test(smoke-test): updates to smoke-tests (datahub-project#11152) * fix(elasticsearch): refactor idHashAlgo setting (datahub-project#11193) * chore(kafka): kafka version bump (datahub-project#11211) * readd UsageStatsWorkUnit * fix merge problems * change logo --------- Co-authored-by: Chris Collins <chriscollins3456@gmail.com> Co-authored-by: John Joyce <john@acryl.io> Co-authored-by: John Joyce <john@Johns-MBP.lan> Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal> Co-authored-by: dushayntAW <158567391+dushayntAW@users.noreply.github.com> Co-authored-by: sagar-salvi-apptware <159135491+sagar-salvi-apptware@users.noreply.github.com> Co-authored-by: Aseem Bansal <asmbansal2@gmail.com> Co-authored-by: Kevin Chun <kevin1chun@gmail.com> Co-authored-by: jordanjeremy <72943478+jordanjeremy@users.noreply.github.com> Co-authored-by: skrydal <piotr.skrydalewicz@gmail.com> Co-authored-by: Harshal Sheth <hsheth2@gmail.com> Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com> Co-authored-by: sid-acryl <155424659+sid-acryl@users.noreply.github.com> Co-authored-by: Julien Jehannet <80408664+aviv-julienjehannet@users.noreply.github.com> Co-authored-by: Hendrik Richert <github@richert.li> Co-authored-by: Hendrik Richert <hendrik.richert@swisscom.com> Co-authored-by: RyanHolstien <RyanHolstien@users.noreply.github.com> Co-authored-by: Felix Lüdin <13187726+Masterchen09@users.noreply.github.com> Co-authored-by: Pirry <158024088+chardaway@users.noreply.github.com> Co-authored-by: Hyejin Yoon <0327jane@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: cburroughs <chris.burroughs@gmail.com> Co-authored-by: ksrinath <ksrinath@users.noreply.github.com> Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com> Co-authored-by: Kunal-kankriya <127090035+Kunal-kankriya@users.noreply.github.com> Co-authored-by: Shirshanka Das <shirshanka@apache.org> Co-authored-by: ipolding-cais <155455744+ipolding-cais@users.noreply.github.com> Co-authored-by: Tamas Nemeth <treff7es@gmail.com> Co-authored-by: Shubham Jagtap <132359390+shubhamjagtap639@users.noreply.github.com> Co-authored-by: haeniya <yanik.haeni@gmail.com> Co-authored-by: Yanik Häni <Yanik.Haeni1@swisscom.com> Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com> Co-authored-by: Gabe Lyons <gabe.lyons@acryl.io> Co-authored-by: 808OVADOZE <52988741+shtephlee@users.noreply.github.com> Co-authored-by: noggi <anton.kuraev@acryl.io> Co-authored-by: Nicholas Pena <npena@foursquare.com> Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com> Co-authored-by: ethan-cartwright <ethan.cartwright.m@gmail.com> Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io> Co-authored-by: Nadav Gross <33874964+nadavgross@users.noreply.github.com> Co-authored-by: Patrick Franco Braz <patrickfbraz@poli.ufrj.br> Co-authored-by: pie1nthesky <39328908+pie1nthesky@users.noreply.github.com> Co-authored-by: Joel Pinto Mata (KPN-DSH-DEX team) <130968841+joelmataKPN@users.noreply.github.com> Co-authored-by: Ellie O'Neil <110510035+eboneil@users.noreply.github.com> Co-authored-by: Ajoy Majumdar <ajoymajumdar@hotmail.com> Co-authored-by: deepgarg-visa <149145061+deepgarg-visa@users.noreply.github.com> Co-authored-by: Tristan Heisler <tristankheisler@gmail.com> Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io> Co-authored-by: Davi Arnaut <davi.arnaut@acryl.io> Co-authored-by: Pedro Silva <pedro@acryl.io> Co-authored-by: amit-apptware <132869468+amit-apptware@users.noreply.github.com> Co-authored-by: Sam Black <sam.black@acryl.io> Co-authored-by: Raj Tekal <varadaraj_tekal@optum.com> Co-authored-by: Steffen Grohsschmiedt <gitbhub@steffeng.eu> Co-authored-by: jaegwon.seo <162448493+wornjs@users.noreply.github.com> Co-authored-by: Renan F. Lima <51028757+lima-renan@users.noreply.github.com> Co-authored-by: Matt Exchange <xkollar@users.noreply.github.com> Co-authored-by: Jonny Dixon <45681293+acrylJonny@users.noreply.github.com> Co-authored-by: Pedro Silva <pedro.cls93@gmail.com> Co-authored-by: Pinaki Bhattacharjee <pinakipb2@gmail.com> Co-authored-by: Jeff Merrick <jeff@wireform.io> Co-authored-by: skrydal <piotr.skrydalewicz@acryl.io> Co-authored-by: AndreasHegerNuritas <163423418+AndreasHegerNuritas@users.noreply.github.com> Co-authored-by: jayasimhankv <145704974+jayasimhankv@users.noreply.github.com> Co-authored-by: Jay Kadambi <jayasimhan_venkatadri@optum.com> Co-authored-by: David Leifker <david.leifker@acryl.io>

github-actions bot added the devops PR or Issue related to DataHub backend & deployment label Aug 9, 2024

feat(mcl-processor): Update mcl processor hooks optionally run with d…

f1187c0

…ifferent consumer groups

david-leifker force-pushed the mcl-hook-consumer-groups branch from 60e05a3 to f1187c0 Compare August 9, 2024 15:03

coderabbitai bot reviewed Aug 9, 2024

View reviewed changes

vercel bot deployed to Preview August 9, 2024 15:30 View deployment

RyanHolstien approved these changes Aug 9, 2024

View reviewed changes

add docs

7122214

coderabbitai bot reviewed Aug 9, 2024

View reviewed changes

vercel bot deployed to Preview August 9, 2024 16:10 View deployment

david-leifker merged commit 080f2a2 into datahub-project:master Aug 9, 2024
37 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mcl-processor): Update mcl processor hooks #11134

feat(mcl-processor): Update mcl processor hooks #11134

david-leifker commented Aug 9, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 9, 2024 •

edited

Loading

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Aug 9, 2024

coderabbitai bot Aug 9, 2024

coderabbitai bot Aug 9, 2024

coderabbitai bot Aug 9, 2024

coderabbitai bot Aug 9, 2024

RyanHolstien commented Aug 9, 2024

coderabbitai bot left a comment

coderabbitai bot Aug 9, 2024

		The various MCL Hooks could alsp be separated into separate groups which allows for controlling parallelization and
		prioritization of the hooks.

feat(mcl-processor): Update mcl processor hooks #11134

feat(mcl-processor): Update mcl processor hooks #11134

Conversation

david-leifker commented Aug 9, 2024 • edited by coderabbitai bot Loading

Checklist

Summary by CodeRabbit

coderabbitai bot commented Aug 9, 2024 • edited Loading

Walkthrough

Changes

Sequence Diagram(s)

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

RyanHolstien commented Aug 9, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Aug 9, 2024

Choose a reason for hiding this comment

david-leifker commented Aug 9, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 9, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)