
Add multi stream ingestion support #13790

Merged — 12 commits merged into apache:master on Dec 19, 2024

Conversation

@lnbest0707-uber (Contributor) commented Aug 9, 2024

feature
Reference: #13780 Design Doc

Please refer to the design doc for details. TL;DR:

  • Add support to ingest from multiple sources into a single table
  • Use the existing interface (TableConfig) to define multiple streams
  • Separate the partition id definition between the stream and the Pinot segment (see the sketch below)
  • Compatible with the existing stream partition auto-expansion logic

The feature was tested on multiple Kafka topics with different decoder formats. Due to resource limitations, we were not able to test other upstream sources end-to-end.
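To make the partition-id separation above concrete, here is a minimal sketch of how a stream-local partition id could map into a table-level Pinot partition id so that each stream owns a disjoint id range. The class name, method names, and the 10,000 padding offset are illustrative assumptions, not the exact implementation in this PR:

// Hypothetical sketch of the stream-to-Pinot partition id separation.
// Assumes no single stream exposes more than PARTITION_PADDING_OFFSET partitions.
public final class PartitionIdMapper {
  private static final int PARTITION_PADDING_OFFSET = 10_000;

  private PartitionIdMapper() {
  }

  // Table-level (Pinot) partition id: each stream gets its own disjoint range.
  public static int toPinotPartitionId(int streamIndex, int streamPartitionId) {
    return streamIndex * PARTITION_PADDING_OFFSET + streamPartitionId;
  }

  // Recover the index of the stream inside streamConfigMaps.
  public static int toStreamIndex(int pinotPartitionId) {
    return pinotPartitionId / PARTITION_PADDING_OFFSET;
  }

  // Recover the stream-local partition id (e.g. the Kafka partition number).
  public static int toStreamPartitionId(int pinotPartitionId) {
    return pinotPartitionId % PARTITION_PADDING_OFFSET;
  }
}

Under this scheme, partition auto-expansion within a single stream keeps working unchanged, because new stream partitions land in that stream's reserved id range.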

Some TODOs:

  • Validation and limitations on multiple stream configs.
  • Standardize the usage of the StreamConfig object; e.g. some instances are only used to fetch non-topic-related static metadata and should use a different interface.
  • Support for adding/removing streams, plus the corresponding sanity checks.

From the user's point of view, the feature does not change any existing interfaces. Users can define the table config in the same way and combine it with any other transform functions or instance assignment strategies. A sample table ingestion config looks like:

"ingestionConfig": {
      "streamIngestionConfig": {
        "streamConfigMaps": [
          {
            "realtime.segment.flush.threshold.rows": "0",
            "stream.kafka.decoder.class.name": "xxxxDecoder",
            "streamType": "kafka",
            "stream.kafka.consumer.type": "lowlevel",
            "realtime.segment.flush.threshold.segment.size": "200MB",
            "stream.kafka.broker.list": "<host>:<port>",
            "realtime.segment.flush.threshold.time": "7200000",
            "stream.kafka.consumer.prop.auto.offset.reset": "largest",
            "stream.kafka.topic.name": "topicName1"
          },
          {
            "realtime.segment.flush.threshold.rows": "0",
            "stream.kafka.decoder.class.name": "xxxxDecoder",
            "streamType": "kafka",
            "stream.kafka.consumer.type": "lowlevel",
            "realtime.segment.flush.threshold.segment.size": "200MB",
            "stream.kafka.broker.list": "<host>:<port>"",
            "realtime.segment.flush.threshold.time": "7200000",
            "stream.kafka.consumer.prop.auto.offset.reset": "largest",
            "stream.kafka.topic.name": "topicName2"
          }
        ],
        "columnMajorSegmentBuilderEnabled": true
      },
      "transformConfigs": [
        {
          "columnName": "_ingestionEpochMs",
          "transformFunction": "__metadata$recordTimestamp"
        }
      ],
      "schemaConformingTransformerV2Config": {
        "enableIndexableExtras": true,
        "indexableExtrasField": "json_data",
        "enableUnindexableExtras": true,
        "unindexableExtrasField": "json_data_no_idx",
        "unindexableFieldSuffix": "_noindex",
        "fieldPathsToDrop": [],
        "fieldPathsToSkipStorage": [],
        "columnNameToJsonKeyPathMap": {},
        "mergedTextIndexField": "__mergedTextIndex",
        "useAnonymousDotInFieldNames": true,
        "optimizeCaseInsensitiveSearch": false,
        "reverseTextIndexKeyValueOrder": true,
        "mergedTextIndexDocumentMaxLength": 32766,
        "mergedTextIndexBinaryDocumentDetectionMinLength": 512,
        "fieldsToDoubleIngest": [],
        "jsonKeyValueSeparator": "\u001e",
        "mergedTextIndexBeginOfDocAnchor": "\u0002",
        "mergedTextIndexEndOfDocAnchor": "\u0003",
        "fieldPathsToPreserveInput": [],
        "fieldPathsToPreserveInputWithIndex": []
      },
      "continueOnError": false,
      "rowTimeValueCheck": false,
      "segmentTimeValueCheck": true
    }

@codecov-commenter commented Aug 10, 2024

Codecov Report

Attention: Patch coverage is 63.21839% with 64 lines in your changes missing coverage. Please review.

Project coverage is 64.03%. Comparing base (59551e4) to head (d8f46da).
Report is 1483 commits behind head on master.

Files with missing lines Patch % Lines
...g/apache/pinot/spi/utils/IngestionConfigUtils.java 53.33% 11 Missing and 3 partials ⚠️
...inot/spi/stream/PartitionGroupMetadataFetcher.java 65.71% 11 Missing and 1 partial ⚠️
.../core/realtime/PinotLLCRealtimeSegmentManager.java 84.21% 9 Missing ⚠️
...apache/pinot/controller/BaseControllerStarter.java 0.00% 5 Missing ⚠️
...x/core/realtime/MissingConsumingSegmentFinder.java 44.44% 5 Missing ⚠️
...r/validation/RealtimeSegmentValidationManager.java 0.00% 5 Missing ⚠️
...he/pinot/segment/local/utils/TableConfigUtils.java 50.00% 3 Missing and 2 partials ⚠️
...roller/helix/core/PinotTableIdealStateBuilder.java 0.00% 3 Missing ⚠️
.../helix/core/realtime/SegmentCompletionManager.java 0.00% 2 Missing ⚠️
...a/manager/realtime/RealtimeSegmentDataManager.java 87.50% 1 Missing ⚠️
... and 3 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13790      +/-   ##
============================================
+ Coverage     61.75%   64.03%   +2.27%     
- Complexity      207     1605    +1398     
============================================
  Files          2436     2703     +267     
  Lines        133233   149053   +15820     
  Branches      20636    22849    +2213     
============================================
+ Hits          82274    95439   +13165     
- Misses        44911    46620    +1709     
- Partials       6048     6994     +946     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.95% <63.21%> (+2.24%) ⬆️
java-21 63.92% <63.21%> (+2.30%) ⬆️
skip-bytebuffers-false 63.97% <63.21%> (+2.22%) ⬆️
skip-bytebuffers-true 63.90% <63.21%> (+36.17%) ⬆️
temurin 64.03% <63.21%> (+2.27%) ⬆️
unittests 64.02% <63.21%> (+2.27%) ⬆️
unittests1 56.29% <33.70%> (+9.40%) ⬆️
unittests2 34.47% <52.29%> (+6.74%) ⬆️

Flags with carried forward coverage won't be shown.


@Jackie-Jiang added the feature, release-notes, ingestion, and real-time labels on Aug 12, 2024
@@ -686,9 +686,8 @@ public void ingestionStreamConfigsTest() {
// only 1 stream config allowed
Contributor:

nit, update comment

Contributor Author:

done

@@ -173,15 +173,18 @@ public static void validate(TableConfig tableConfig, @Nullable Schema schema, @N

// Only allow realtime tables with non-null stream.type and LLC consumer.type
if (tableConfig.getTableType() == TableType.REALTIME) {
Map<String, String> streamConfigMap = IngestionConfigUtils.getStreamConfigMap(tableConfig);
List<Map<String, String>> streamConfigMaps = IngestionConfigUtils.getStreamConfigMaps(tableConfig);
Contributor:

Let's add a validation to prevent creating an upsert table with multiple topics for now. One of the most important reasons is that an upsert table requires the same primary keys to be distributed to the same host, and it would be complex to validate whether all source topics are partitioned identically (partition key, partition count, partitioning algorithm).
There are other potential concerns as well, including race conditions in consumption during reload, rebalance, pause ingestion, etc.

Contributor Author:

This is a good idea, updated.
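A minimal sketch of the kind of guard agreed on here, assuming Guava's Preconditions and Pinot's UpsertConfig; the class name, method name, and message are illustrative, not the merged code:

import java.util.List;
import java.util.Map;

import com.google.common.base.Preconditions;
import org.apache.pinot.spi.config.table.TableConfig;
import org.apache.pinot.spi.config.table.UpsertConfig;

final class MultiStreamValidations {
  private MultiStreamValidations() {
  }

  // Illustrative: reject upsert tables with more than one stream, since upsert
  // requires identical primary-key partitioning across every source topic.
  static void validateUpsertSingleStream(TableConfig tableConfig,
      List<Map<String, String>> streamConfigMaps) {
    UpsertConfig upsertConfig = tableConfig.getUpsertConfig();
    boolean isUpsert = upsertConfig != null && upsertConfig.getMode() != UpsertConfig.Mode.NONE;
    Preconditions.checkState(!isUpsert || streamConfigMaps.size() == 1,
        "Multiple stream configs are not supported for upsert tables");
  }
}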

for (int i = 0; i < streamConfigs.size(); i++) {
final int index = i;
try {
partitionIds.addAll(getPartitionIds(streamConfigs.get(index)).stream().map(
Contributor (@deemoliu, Sep 18, 2024):

Can you please elaborate on why we don't need to maintain order for partitionIds? We use a list for streamConfigs but an unordered set to store partition ids.

Contributor Author:

This function overloads an existing function with the same name, which also returns a Set<>. The output is only used to check whether a partitionId exists, not its order.
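For readers following along, a sketch of the overload shape under discussion — a Set is returned because callers only test membership. The mapping helper is the hypothetical PartitionIdMapper sketched in the PR description, not part of the patch:

// Sketch: union of partition ids across all streams, translated into the
// table-level id space. Order is irrelevant; callers only check existence.
public Set<Integer> getPartitionIds(List<StreamConfig> streamConfigs)
    throws Exception {
  Set<Integer> partitionIds = new HashSet<>();
  for (int i = 0; i < streamConfigs.size(); i++) {
    final int index = i;
    partitionIds.addAll(getPartitionIds(streamConfigs.get(index)).stream()
        .map(streamPartitionId -> PartitionIdMapper.toPinotPartitionId(index, streamPartitionId))
        .collect(Collectors.toSet()));
  }
  return partitionIds;
}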

@lnbest0707-uber force-pushed the upstream-fork/multi_topics branch from 0bfb91f to 1af31b2 on October 17, 2024
@lnbest0707-uber force-pushed the upstream-fork/multi_topics branch from 1af31b2 to 87dacbd on November 8, 2024
@lnbest0707-uber (Contributor Author) commented Nov 14, 2024

Adding the production running report:
The feature has been running in Uber's production environment with PBs of data for months. Hundreds of Pinot tables have been created, and a single table can ingest from 20-30+ topics with no issues. Overall ingestion and query performance is also competitive with common single-topic ingestion.

Note: the feature is running with Kafka multi-topic ingestion. We do not have the resources to run it with other stream types or with multiple types of streams.

Comment on lines 97 to 75
_streamPartitionMsgOffsetFactory =
StreamConsumerFactoryProvider.create(streamConfigs.get(0)).createStreamMsgOffsetFactory();
Collaborator:

I think this breaks when mixing streams that do not use the same offset factory type, e.g. Kinesis and Kafka (there's a lot of this specific pattern for the offset factory; I won't mark every instance).

We could add unit tests, or shall we add a TODO for them since we can't easily test this e2e internally?

Contributor Author:

I will add an enforcement check when we fetch the streamConfigs to require them to be the same for now.
In the long term, we need to redefine the structure of StreamConfig for this usage.
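A sketch of the enforcement check the author describes, assuming stream configs are plain maps keyed by "streamType"; the method name and message are illustrative:

// Illustrative: require every stream config to declare the same streamType so a
// single offset factory / consumer factory implementation serves all partitions.
static void validateSameStreamType(List<Map<String, String>> streamConfigMaps) {
  String streamType = streamConfigMaps.get(0).get("streamType");
  for (Map<String, String> streamConfigMap : streamConfigMaps) {
    Preconditions.checkState(streamType.equals(streamConfigMap.get("streamType")),
        "All stream configs must share the same streamType, got: %s vs %s",
        streamType, streamConfigMap.get("streamType"));
  }
}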

@@ -1294,7 +1315,7 @@ IdealState ensureAllPartitionsConsuming(TableConfig tableConfig, StreamConfig st
selectStartOffset(offsetCriteria, partitionId, partitionIdToStartOffset,
partitionIdToSmallestOffset, tableConfig.getTableName(), offsetFactory,
latestSegmentZKMetadata.getEndOffset());
createNewConsumingSegment(tableConfig, streamConfig, latestSegmentZKMetadata, currentTimeMs,
createNewConsumingSegment(tableConfig, streamConfigs.get(0), latestSegmentZKMetadata, currentTimeMs,
Collaborator:

Can we use the partitionId to choose the correct streamConfig?

Otherwise we'd need to document that segment flush settings are only taken from the first streamConfig in the table config's list (though I suspect different flush settings per stream will eventually become a requirement).
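A sketch of the reviewer's suggestion, reusing the hypothetical partition-id padding from the PR description to resolve the owning stream config instead of hard-coding index 0:

// Illustrative: resolve the stream config that owns a table-level partition id,
// so per-stream flush settings come from the right entry in the list.
static StreamConfig streamConfigForPartition(List<StreamConfig> streamConfigs,
    int pinotPartitionId) {
  return streamConfigs.get(PartitionIdMapper.toStreamIndex(pinotPartitionId));
}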

@@ -87,6 +89,33 @@ public MissingConsumingSegmentFinder(String realtimeTableName, ZkHelixPropertySt
}
}

public MissingConsumingSegmentFinder(String realtimeTableName, ZkHelixPropertyStore<ZNRecord> propertyStore,
Collaborator:

The old constructor is no longer used; can we remove it and update the tests?

Contributor Author:

Removed

_streamConfigs = streamConfigs;
_partitionGroupConsumptionStatusList = partitionGroupConsumptionStatusList;
_newPartitionGroupMetadataList = new ArrayList<>();
}

public PartitionGroupMetadataFetcher(StreamConfig streamConfig,
Collaborator:

Similar here, let's remove the unused constructor.

Contributor Author:

Removed

* @param tableConfig realtime table config
* @return streamConfigs List of maps
*/
public static List<Map<String, String>> getStreamConfigMaps(TableConfig tableConfig) {
Collaborator:

Can we remove the old method if it is no longer used?

Contributor Author:

Removed

for (Map.Entry<Integer, LLCSegmentName> entry : partitionGroupIdToLatestSegment.entrySet()) {
int partitionGroupId = entry.getKey();
LLCSegmentName llcSegmentName = entry.getValue();
SegmentZKMetadata segmentZKMetadata =
getSegmentZKMetadata(streamConfig.getTableNameWithType(), llcSegmentName.getSegmentName());
getSegmentZKMetadata(streamConfigs.get(0).getTableNameWithType(), llcSegmentName.getSegmentName());
Collaborator:

nit: idealState.getId() instead of .get(0)?

: getPartitionGroupConsumptionStatusList(idealState, streamConfig);
OffsetCriteria originalOffsetCriteria = streamConfig.getOffsetCriteria();
: getPartitionGroupConsumptionStatusList(idealState, streamConfigs);
// FIXME: Right now, we assume topics are sharing same offset criteria
Collaborator:

Does it make sense to add a precondition to check this?
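One possible precondition for the FIXME above, assuming OffsetCriteria implements value equality — a sketch, not the merged code:

// Illustrative: until per-stream criteria are supported, require one shared
// OffsetCriteria across every stream config.
OffsetCriteria offsetCriteria = streamConfigs.get(0).getOffsetCriteria();
for (StreamConfig streamConfig : streamConfigs) {
  Preconditions.checkState(offsetCriteria.equals(streamConfig.getOffsetCriteria()),
      "All streams must share the same offset criteria");
}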

* @param tableConfig realtime table config
* @return streamConfigs List of maps
*/
public static List<Map<String, String>> getStreamConfigMaps(TableConfig tableConfig) {
Contributor:

Add a @Deprecated annotation to the old getStreamConfigMap() if we are not removing it.

Contributor Author:

Removed the old one.

&& tableConfig.getIngestionConfig().getStreamIngestionConfig() != null) {
List<Map<String, String>> streamConfigMaps =
tableConfig.getIngestionConfig().getStreamIngestionConfig().getStreamConfigMaps();
Preconditions.checkState(streamConfigMaps.size() > 0, "Table must have at least 1 stream");
Contributor:

(nit)

Suggested change:
- Preconditions.checkState(streamConfigMaps.size() > 0, "Table must have at least 1 stream");
+ Preconditions.checkState(!streamConfigMaps.isEmpty(), "Table must have at least 1 stream");

Contributor Author:

Fixed

List<Map<String, String>> streamConfigMaps =
tableConfig.getIngestionConfig().getStreamIngestionConfig().getStreamConfigMaps();
Preconditions.checkState(streamConfigMaps.size() > 0, "Table must have at least 1 stream");
// For now, with multiple topics, we only support same type of stream (e.g. Kafka)
Contributor:

What is the reason for this limitation? Some comments explaining it would be good.
Also, only apply this check when there are multiple streams, to match the current behavior.

Contributor Author:

Added detailed explanations. Basically, we don't have the resources to cover testing of other stream types.

streamConfigMap);
for (Map<String, String> streamConfigMap : streamConfigMaps) {
StreamConfig.validateConsumerType(
streamConfigMap.getOrDefault(StreamConfigProperties.STREAM_TYPE, "kafka"),
Contributor:

(format) This does not comply with Pinot style.

Contributor Author:

Not sure why, but the mvn checkstyle check passes.

* @param partitionGroupConsumptionStatusList
* @return
*/
public static List<PartitionGroupMetadata> getPartitionGroupMetadataList(List<StreamConfig> streamConfigs,
Contributor:

Deprecate the old method or remove it. Please also clean up all usages of the old one.

Contributor Author:

Removed

@lnbest0707-uber force-pushed the upstream-fork/multi_topics branch from 87dacbd to cae4dc5 on November 22, 2024
@lnbest0707-uber (Contributor Author):

Thanks @itschrispeck for identifying an edge-case issue and proposing the fix in commit 6bd1307, which addresses missing segments when partition metadata cannot be fetched from the stream.

@itschrispeck (Collaborator) left a comment:

Overall LGTM; I suggest we thoroughly document the current limitations (e.g. no mixed stream types).

lnbest0707-uber and others added 9 commits December 17, 2024 14:02
…ments

Summary:
Ensure transient exceptions do not prevent creating new consuming segments. If an exception is hit, attempt to reconcile any successful fetches with the partition group metadata.

This ensures consuming partitions are not dropped, and attempts to add any new partitions that were discovered successfully.

Test Plan:
After deployment, despite some remaining `TransientConsumerException` occurrences, no new missing consuming segments appeared.

Reviewers: gaoxin, tingchen

Reviewed By: gaoxin

JIRA Issues: EVA-8951

Differential Revision: https://code.uberinternal.com/D15748639
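A rough sketch of the reconciliation behavior this commit message describes; fetchPartitionGroupMetadata and lastKnownMetadataForStream are hypothetical helper names, not the actual methods in the patch:

// Illustrative: tolerate transient per-stream failures while fetching partition
// group metadata. Successful fetches are kept; for a stream that failed, fall
// back to its last known partition groups so its consuming segments are not
// dropped, while still adding partitions discovered on the healthy streams.
List<PartitionGroupMetadata> newPartitionGroupMetadataList = new ArrayList<>();
for (int i = 0; i < streamConfigs.size(); i++) {
  try {
    newPartitionGroupMetadataList.addAll(fetchPartitionGroupMetadata(streamConfigs.get(i)));
  } catch (TransientConsumerException e) {
    // Reconcile: reuse the previously known metadata for this stream instead of
    // treating all of its partitions as gone.
    newPartitionGroupMetadataList.addAll(lastKnownMetadataForStream(i));
  }
}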
@lnbest0707-uber force-pushed the upstream-fork/multi_topics branch from ba53fa3 to ad34f17 on December 17, 2024
@lnbest0707-uber (Contributor Author):

@Jackie-Jiang could you please review again and see whether any blockers remain before merging? Thanks

@kishoreg (Member):

Thanks again for contributing this feature. Is there a user doc associated with it?

@lnbest0707-uber (Contributor Author):

> Thanks again for contributing this feature. Is there a user doc associated with it?

@kishoreg thanks for bringing this up. After the PR is merged, I will update the Pinot docs with the feature and its usage. In general, users can define the table config in exactly the same way using the existing interfaces. I will provide an example in the PR description.

@chenboat merged commit 73abb21 into apache:master on Dec 19, 2024 — 21 checks passed
Labels: feature, ingestion, real-time, release-notes
7 participants