[server][controller] Add MaterializedViewWriter and support view writers in L/F #1296

Open
wants to merge 5 commits into
base: main

xunyin8
Contributor

@xunyin8 xunyin8 commented Nov 12, 2024

  1. View writers will now be invoked in the L/F SIT as well, instead of only in the A/A SIT. We rely on view config validation to ensure that views which require A/A are only added to stores with A/A enabled.

  2. This PR only includes the creation of materialized view topics and the writing of data records and control messages to those topics in the server and controller.

  • Materialized view topics are created at version creation time, along with other view topics.
  • SOP is sent at view topic creation time with the same chunking and compression configs as the store version.
  • EOP is sent once servers have reported EOP in every partition.
  • Incremental push control messages (SOIP and EOIP) are not propagated to the view topic for now, because the end-to-end incremental push tracking story for view topics is not clear yet. Store owners will likely just disable the requirement to wait for view consumers to fully ingest the incremental push.
  • Ingestion heartbeats will be propagated in a broadcast manner. See the implementation for details.
  • Version swap for CDC users will be implemented in a separate PR to keep this PR reasonably short for review.
  3. One issue still to be resolved: during processing of batch records in the native replication source fabric, where we consume the local VT, a leader transfer could result in missing records in the materialized view topic. This is because we don't do any global checkpointing across leader and followers when consuming the local VT. To solve this, we will produce to the view topic from the VPJ itself.
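
The EOP rule described above (EOP goes to the view topic only after every partition has reported EOP) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `ViewEopTracker`, `reportEop`, and the `Runnable` callback are hypothetical names, and the real controller logic may track push status quite differently.

```java
import java.util.BitSet;

/** Hypothetical tracker: fires a callback exactly once, after every partition has reported EOP. */
final class ViewEopTracker {
  private final int partitionCount;
  private final BitSet reported;
  private final Runnable sendEopToViewTopic; // stand-in for the real EOP send, which is not shown here
  private boolean eopSent = false;

  ViewEopTracker(int partitionCount, Runnable sendEopToViewTopic) {
    if (partitionCount <= 0) {
      throw new IllegalArgumentException("partitionCount must be positive");
    }
    this.partitionCount = partitionCount;
    this.reported = new BitSet(partitionCount);
    this.sendEopToViewTopic = sendEopToViewTopic;
  }

  /** Records EOP for one partition; returns true only if this call triggered the view-topic EOP. */
  synchronized boolean reportEop(int partition) {
    if (partition < 0 || partition >= partitionCount) {
      throw new IllegalArgumentException("Unknown partition: " + partition);
    }
    reported.set(partition); // duplicate reports for the same partition are harmless
    if (!eopSent && reported.cardinality() == partitionCount) {
      eopSent = true;
      sendEopToViewTopic.run();
      return true;
    }
    return false;
  }
}
```

Duplicate reports for a partition do not count twice, and the callback cannot fire a second time once it has run.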

How was this PR tested?

Unit and integration tests

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@xunyin8 xunyin8 force-pushed the RePartitionViewWriter branch 2 times, most recently from e43850a to a711263 Compare November 13, 2024 02:32
@xunyin8 xunyin8 force-pushed the RePartitionViewWriter branch 6 times, most recently from 791712b to bcee9bc Compare November 25, 2024 17:47
@@ -329,6 +331,8 @@ public LeaderFollowerStoreIngestionTask(
for (Map.Entry<String, VeniceViewWriter> viewWriter: viewWriters.entrySet()) {
if (viewWriter.getValue() instanceof ChangeCaptureViewWriter) {
tmpValueForHasChangeCaptureViewWriter = true;
}

Why move the "break" out of the first if statement? The original if statement with the break inside seems neater. Maybe there is something I missed.

return setProducerOptimizations(internalView.getWriterOptionsBuilder(materializedViewTopicName, version)).build();
}

synchronized private void initializeVeniceWriter() {

Both ChangeCaptureViewWriter and this writer have the same initWriter method. Can we pull it up into their common parent class?

Also, this may be asking too much: instead of the heavyweight synchronized, I wonder whether you could use an AtomicReferenceFieldUpdater, which would not require changing the type of viewWriter, or an AtomicReference, which would require changing viewWriter into a reference to the ViewWriter.

If the init method is not invoked frequently, such a refactor may not be worthwhile.
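
For reference, the AtomicReference variant suggested above could look roughly like this. It is a sketch under assumptions, not a drop-in change for this PR: `LazyWriterHolder`, its `factory`, and its `discard` hook are hypothetical names, and `W` stands in for whatever writer type the view writer holds.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical lock-free lazy holder: publishes at most one instance via compare-and-set. */
final class LazyWriterHolder<W> {
  private final AtomicReference<W> ref = new AtomicReference<>();
  private final Supplier<W> factory; // assumed to create a new writer on each call
  private final Consumer<W> discard; // assumed to close or release an unused writer

  LazyWriterHolder(Supplier<W> factory, Consumer<W> discard) {
    this.factory = factory;
    this.discard = discard;
  }

  W get() {
    W current = ref.get();
    if (current != null) {
      return current; // fast path: already initialized
    }
    W created = factory.get();
    if (ref.compareAndSet(null, created)) {
      return created; // this thread won the race and published its instance
    }
    discard.accept(created); // another thread published first; release the extra instance
    return ref.get();
  }
}
```

The trade-off is that under contention the factory may run more than once, with the extra instance discarded. If creating a writer is expensive or has side effects, the synchronized version, which creates exactly one instance, may still be the simpler choice.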

Comment on lines +1046 to +1052
viewConfigMap = viewConfigMap.entrySet()
.stream()
.filter(vc -> Objects.equals(vc.getValue().getViewClassName(), MaterializedView.class.getCanonicalName()))
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
if (!viewConfigMap.isEmpty()) {
pushJobSetting.materializedViewConfigFlatMap = ViewUtils.flatViewConfigMapString(viewConfigMap);
}
@lusong64 lusong64 Dec 3, 2024

Nice streaming. Maybe you could also fold the !isEmpty check into the chain:

pushJobSetting.materializedViewConfigFlatMap = Optional.of(viewConfigMap.entrySet()
          .stream()
          .filter(vc -> Objects.equals(vc.getValue().getViewClassName(), MaterializedView.class.getCanonicalName()))
          .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)))
          // Skip the flattening when no materialized views are configured.
          .filter(filtered -> !filtered.isEmpty())
          .map(ViewUtils::flatViewConfigMapString)
          // Assumes the field is otherwise left null when there is nothing to flatten.
          .orElse(null);

this.viewName = Objects.requireNonNull(viewName, "View name cannot be null for ViewParameters");
}

public Builder(String viewName, Map<String, String> viewParams) {

Just a nitpick: an EnumMap can provide better performance than a Map<String, String>, given that all you store are enums. From the EnumMap Javadoc:
"A specialized Map implementation for use with enum type keys. All of the keys in an enum map must come from a single enum type that is specified, explicitly or implicitly, when the map is created. Enum maps are represented internally as arrays. This representation is extremely compact and efficient."
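
A small sketch of what that suggestion could look like, assuming the parameter keys all come from a single enum. `ViewParameterKey`, its constants, and `ViewParamsSketch` are hypothetical names for illustration only; the actual key type used by the Builder is not shown in this excerpt.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical enum of view parameter keys, for illustration only. */
enum ViewParameterKey {
  PARTITION_COUNT, PARTITIONER_CLASS
}

final class ViewParamsSketch {
  /** Converts string-keyed params into an EnumMap; unknown keys throw IllegalArgumentException. */
  static Map<ViewParameterKey, String> fromStrings(Map<String, String> raw) {
    EnumMap<ViewParameterKey, String> typed = new EnumMap<>(ViewParameterKey.class);
    for (Map.Entry<String, String> entry : raw.entrySet()) {
      // Enum.valueOf rejects names that are not declared constants.
      typed.put(ViewParameterKey.valueOf(entry.getKey()), entry.getValue());
    }
    return typed;
  }
}
```

Besides the compact array-backed storage, the conversion also validates keys up front, since a misspelled parameter name fails at `valueOf` rather than being silently stored.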

LOGGER.info("Successfully {} for offlinePushStatus: {}", newStatusDetails.toString(), offlinePushStatus);
}
} catch (Exception e) {
String newStatusDetails = "Failed to start EOP procedures";

Nitpick: this string is constant. Maybe promote it to a private static final field?
