PIP-136: Sync Pulsar policies across multiple clouds #16424

Open
rdhabalia opened this issue Jul 7, 2022 · 2 comments
rdhabalia commented Jul 7, 2022

Sync Pulsar policies across multiple clouds

Implementation PR: #16425

Motivation

Apache Pulsar is a cloud-native, distributed messaging framework that natively provides geo-replication. Many organizations deploy Pulsar instances on-prem and on multiple cloud providers, and want to enable replication between clusters deployed on different cloud providers. Pulsar already provides various proxy options (Pulsar proxy, or enterprise proxy solutions based on SNI) to fulfill security requirements when brokers are deployed in different security zones that are connected with each other.

[Figure: global metadata store]

However, it is sometimes not possible to share a metadata store (global ZooKeeper) between Pulsar clusters deployed on separate cloud provider platforms. In that case, synchronizing configuration metadata (policies) becomes critical for sharing tenant/namespace/topic policies between clusters and administering Pulsar policies uniformly across all clusters. Therefore, we need a mechanism to sync configuration metadata between clusters deployed on different cloud platforms.

Goal

Replicated metadata event topic

All regions in a cluster that share the same metadata store (e.g. a global ZooKeeper that persists policies) are already in sync, but they are not in sync with regions that belong to different clusters and do not share that metadata store. We want to sync clusters that do not share the same configuration metadata store. To do so, we can pick one region from each cluster and set up a replicated topic across those regions, over which they exchange metadata change events and apply the changes that occurred at other clusters.

This PIP introduces a metadata event topic that is replicated between isolated clusters that are not in sync and don't share the same metadata store. To provide a replication guarantee, each broker that receives a metadata update first publishes a metadata-change event to this replicated topic and then asynchronously applies the update to the metadata store by consuming the metadata event.

The data structure below shows the payload of the change events published to the event topic. Each event contains the metadata value along with the metadata path, the source cluster name, and the last-updated time of the event. The source cluster name and last-updated time help destination clusters handle stale or duplicate events.

public class MetadataEvent {
    private String path;
    private byte[] value;
    private Set<CreateOption> options;
    private Long expectedVersion;
    private long lastUpdatedTimestamp;
    private String sourceCluster;
    private NotificationType type;
}

Handling race condition

Users can update the same policy with different values concurrently in different regions. Every region eventually receives the updates from the remote regions where the policy was modified, and Pulsar has to handle this scenario by merging the concurrent updates (or selecting one distinct value) consistently across all regions. Therefore, each update carries its modified time and the name of the source region that updated the value. Each Pulsar region compares the local and remote updates, first by latest modified timestamp and then by lexicographic ordering of the source-region name, and so picks the final value deterministically. Eventually all regions hold one consistent value for the concurrently modified policy in the metadata store.

For example, in the diagram below, Region-A and Region-B each receive an update for policy P1 at the same time T1. The regions exchange their local events with each other, and both have to pick the same single event so that they end up with the same update in the metadata store. First, each region compares the events by their updated timestamp, and then by the lexicographic ordering of the source-region name. In this example, the modified timestamp T1 is the same for both events, so Pulsar next selects the event with source-region name A over source-region name B, based on lexicographic sorting of the source-region name. Therefore, both regions eventually update the metadata with the event that occurred at Region-A.

[Figure: race condition]
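
The selection rule described above can be sketched as follows. This is a minimal illustration and not the code from the implementation PR: `PolicyEvent`, `ConflictResolver`, and `pick` are hypothetical names, and the event is trimmed to the fields the rule needs. It assumes that the later timestamp wins and that, on a tie, the lexicographically smaller source-region name wins, as in the Region-A / Region-B example.

```java
// Hypothetical, trimmed-down stand-in for MetadataEvent (not the PIP's class).
final class PolicyEvent {
    final String path;
    final String value;
    final long lastUpdatedTimestamp;
    final String sourceCluster;

    PolicyEvent(String path, String value, long lastUpdatedTimestamp, String sourceCluster) {
        this.path = path;
        this.value = value;
        this.lastUpdatedTimestamp = lastUpdatedTimestamp;
        this.sourceCluster = sourceCluster;
    }
}

final class ConflictResolver {
    // Returns a positive number if 'a' takes precedence over 'b', a negative
    // number if 'b' takes precedence, and 0 if neither does.
    static int precedence(PolicyEvent a, PolicyEvent b) {
        int byTime = Long.compare(a.lastUpdatedTimestamp, b.lastUpdatedTimestamp);
        if (byTime != 0) {
            return byTime; // the later update wins
        }
        // Same timestamp: the lexicographically smaller source name wins,
        // so the comparison of names is reversed.
        return b.sourceCluster.compareTo(a.sourceCluster);
    }

    // Every region calls this with the same pair of events (in either order)
    // and gets the same winner.
    static PolicyEvent pick(PolicyEvent first, PolicyEvent second) {
        return precedence(first, second) >= 0 ? first : second;
    }

    public static void main(String[] args) {
        PolicyEvent fromA = new PolicyEvent("/policies/p1", "value-a", 1000L, "region-A");
        PolicyEvent fromB = new PolicyEvent("/policies/p1", "value-b", 1000L, "region-B");
        System.out.println(pick(fromA, fromB).sourceCluster);
        System.out.println(pick(fromB, fromA).sourceCluster);
    }
}
```

Because the rule depends only on the two events and not on arrival order, Region-A and Region-B reach the same decision independently.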

Implementation

Event publisher and handler

Publisher

Every broker that receives a metadata update publishes an event message to the persistent topic (metadata-event-topic), which replicates to other clusters. The metadata store publishes a change event when it receives a create/update/delete operation for metadata. After publishing the message, the metadata store immediately tries to perform the create/put/delete operation asynchronously, so the existing put/delete API, which returns the metadata-store update Stat after making an update, does not have to change. With this approach, the local region tries to apply each metadata-change event twice:

  • as an async update when the broker receives the metadata-change request from the user;
  • when the metadata sync-topic listener receives the event and applies the change to the metadata store.

However, the broker's race-condition handling takes care of duplicate updates for the same event.
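
The double-apply behaviour described above can be illustrated with a small in-memory sketch. Everything here is hypothetical (`SyncEvent`, `SyncingStore`, the list standing in for the replicated topic), and the calls are synchronous for brevity, whereas the proposal applies updates asynchronously. The point is only that the second application of the same event becomes a no-op once the store compares incoming events with what it already holds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for MetadataEvent with only the fields used here.
final class SyncEvent {
    final String path;
    final String value;
    final long lastUpdatedTimestamp;
    final String sourceCluster;

    SyncEvent(String path, String value, long lastUpdatedTimestamp, String sourceCluster) {
        this.path = path;
        this.value = value;
        this.lastUpdatedTimestamp = lastUpdatedTimestamp;
        this.sourceCluster = sourceCluster;
    }
}

final class SyncingStore {
    // Last event applied per metadata path.
    private final Map<String, SyncEvent> applied = new HashMap<>();
    // Stand-in for the replicated metadata-event-topic.
    final List<SyncEvent> eventTopic = new ArrayList<>();

    // Broker path: publish the change event first, then apply it locally.
    boolean update(SyncEvent event) {
        eventTopic.add(event);
        return apply(event);
    }

    // Listener path: called for every event consumed from the topic.
    // Returns false when the event is stale or a duplicate and was skipped.
    boolean apply(SyncEvent event) {
        SyncEvent current = applied.get(event.path);
        if (current != null && !wins(event, current)) {
            return false;
        }
        applied.put(event.path, event);
        return true;
    }

    String get(String path) {
        SyncEvent current = applied.get(path);
        return current == null ? null : current.value;
    }

    // An incoming event replaces the current one only if it is strictly newer,
    // or has the same timestamp and a lexicographically smaller source name.
    private static boolean wins(SyncEvent incoming, SyncEvent current) {
        if (incoming.lastUpdatedTimestamp != current.lastUpdatedTimestamp) {
            return incoming.lastUpdatedTimestamp > current.lastUpdatedTimestamp;
        }
        return incoming.sourceCluster.compareTo(current.sourceCluster) < 0;
    }
}
```

In this sketch, a broker calling `update` applies the event once directly, and the later replay of the same event through `apply` returns `false` and leaves the stored value unchanged.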

Consumer Handler

Every region consumes events from this topic and applies the changes to its metadata store accordingly. This PIP introduces MetadataEventSynchronizer, which publishes metadata events, consumes events from the topic, and handles the resulting updates in the metadata store.
MetadataEventSynchronizer creates a failover consumer on the metadata-event-topic so that only one broker's synchronizer at a time consumes and handles the event updates.

MetadataEventSynchronizer.java

/**
 * Metadata synchronizer to notify and synchronize metadata change events.
 */
public interface MetadataEventSynchronizer {

    /**
     * Notify metadata change event.
     * @param event
     *            metadata change event.
     * @return
     */
    CompletableFuture<Void> notify(MetadataEvent event);

    /**
     * Register notification listener to sync metadata event in local cluster.
     * @param event
     */
    void registerSyncListener(Function<MetadataEvent, CompletableFuture<Void>> event);

    /**
     * Name of current cluster served by the Synchronizer.
     * @return clusterName
     */
    String getClusterName();

    /**
     * close synchronizer resources.
     */
    void close();
}

Broker changes

Configuration

// topic name to share metadata changes from local metadata store
private String metadataSyncEventTopic;

// topic name to share metadata changes from configuration metadata store
private String configurationMetadataSyncEventTopic;

MetadataEventSynchronizer implementation

PulsarMetadataEventSynchronizer implements MetadataEventSynchronizer and handles the notification and processing of metadata events.

Event topic consumer and publisher

Users can enable this feature by configuring metadataSyncEventTopic / configurationMetadataSyncEventTopic in the broker. The broker then initializes the MetadataEventSynchronizer component, which creates a failover consumer to listen for and handle metadata change events. It also enables the broker to publish metadata changes to the event topic.
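
As an illustration, a broker configuration enabling both settings might look like the fragment below. The two keys are the ones listed under Configuration above; the topic names are hypothetical placeholders, and the proposal does not prescribe any particular tenant, namespace, or topic name.

```properties
# Hypothetical topic names, shown only to illustrate the shape of the setting.
# Topic used to share changes from the local metadata store
metadataSyncEventTopic=persistent://my-tenant/my-namespace/metadata-sync
# Topic used to share changes from the configuration metadata store
configurationMetadataSyncEventTopic=persistent://my-tenant/my-namespace/config-metadata-sync
```

The chosen topics would presumably need to live in a namespace that is replicated between the clusters being synchronized, since the events have to reach the remote clusters.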

PIP:
#15223

Rejected alternative

  1. PIP-136: Sync Pulsar policies across multiple clouds #13728

  2. Use System-topic:
    Use a system topic to synchronize metadata across clusters. A system topic is probably not the right choice for transporting metadata-store content, because a system topic helps the broker persist topic policies in the local cluster, whereas the metadata-event synchronizer helps the broker copy the metadata store across two independent clusters that don't share a metadata-store/global-zookeeper. Users will also not be able to use the system topic for metadata sync for the following reasons:

    • Storage and reliability: Not every user prefers or uses the system topic for metadata storage, for reasons such as legacy systems or the higher reliability of the metadata store compared to a system topic stored in bookies.

    • Schema compatibility: The system topic currently supports only topic-level policies with a specific schema, whereas the metadata change event requires a different schema for the metadata-store update.

    • Merging and handling capabilities: The metadata change event not only requires a different schema but also requires special handling for create/update and merging capabilities. Supporting merging would require unnecessary enhancements to the system topic.

    • Compaction requirement: The system topic also requires compaction, which not all systems enable, because compaction comes with an extra server-side cost that is very expensive for large-scale, multi-tenant systems.

However, the system topic can work together with the metadata synchronizer. The system topic persists topic policies; the broker reads this compacted system topic to retrieve topic policies and applies them to the loaded topic. Using a metadata synchronizer, a broker can replicate metadata-store data to a destination broker that is part of a separate cluster, and the destination broker can later persist the policies in its local cluster by publishing them to the system topic.

hpvd commented Jun 6, 2024

@rdhabalia
since this PIP is mentioned in release notes of 2.11.0
see https://pulsar.apache.org/release-notes/versioned/pulsar-2.11.0/
-> can this issue be closed as fully implemented?
