Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Automatically create topic for remote cluster in geo #21679

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

nodece
Copy link
Member

@nodece nodece commented Dec 6, 2023

Fixes #21153

Motivation

Enable the geo-replication only on the topic level to replicate the partitioned topic, if the topic was not created in the remote cluster, it becomes the non-partition topic by the allowAutoTopicCreation=true and allowAutoTopicCreationType=non-partitioned configurations, which is not in line with the expected behavior. Therefore, I suggest that geo-replication automatically creates the topic for the remote cluster.

How to reproduce this issue:

  1. Create two clusters(cluster-a and cluster-b) and use the different configuration stores.
  2. Add cluster data to each cluster
  3. Create a tenant in each cluster:
bin/pulsar-admin tenants create my-tenant --allowed-clusters cluster-a,cluster-b
  1. Create a namespace in the cluster-a:
bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace --clusters cluster-a
  1. Create a namespace in the cluster-b
bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace --clusters cluster-b
  1. Create a topic and enable the geo-replication in the cluster-a:
bin/pulsar-admin topics create-partitioned-topic -p 5 my-tenant/my-namespace/geo-p-5
bin/pulsar-admin topics set-replication-clusters --clusters cluster-a,cluster-b my-tenant/my-namespace/geo-p-5
  1. List the partitioned topics in the cluster-b:
bin/pulsar-admin topics list-partitioned-topics my-tenant/my-namespace
empty

Modifications

  • For NonPersistentTopic and PersistentTopic, the geo automatically creates the topic for the remote cluster.
  • Use the different configuration stores in the geo-replication test

Verifying this change

  • Add a GeoReplicationOnTopicLevelTest to cover these changes.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 6, 2023
@liudezhi2098
Copy link
Contributor

Test cases can be added.

Signed-off-by: Zixuan Liu <nodeces@gmail.com>
@nodece nodece force-pushed the auto-create-partitioned-topic-for-remote-cluster branch from 5948f04 to 02de6e7 Compare December 7, 2023 02:30
@nodece nodece changed the title [fix][broker] Automatically create partitioned topic for remote cluster in geo [fix][broker] Automatically create topic for remote cluster in geo Dec 7, 2023
@nodece
Copy link
Member Author

nodece commented Dec 7, 2023

/pulsarbot rerun-failure-checks

@nodece
Copy link
Member Author

nodece commented Dec 7, 2023

Thank @liudezhi2098 for your suggest, the GeoReplicationTest can cover these changes, including persistent and non-persistent topics, but it cannot verify your issue because there is one configuration store.

I need to add a new test case with multiple configuration stores to verify #21153.

Signed-off-by: Zixuan Liu <nodeces@gmail.com>
Signed-off-by: Zixuan Liu <nodeces@gmail.com>
@nodece nodece force-pushed the auto-create-partitioned-topic-for-remote-cluster branch from ec4a1d4 to d8d689d Compare December 7, 2023 18:41
Throwable throwable = FutureUtil.unwrapCompletionException(ex);
if (throwable instanceof ConflictException) {
int code = ((ConflictException) throwable).getStatusCode();
if (code == Status.CONFLICT.getStatusCode()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe all the partitioned-topic tries to create topic for the remote cluster, so there will throw Conflict exception, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also existed in the remote cluster?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe all the partitioned-topic tries to create topic for the remote cluster, so there will throw Conflict exception, right?

Rright. If Conflict is thrown, it indicates that the topic already exists in the remote cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since load topic could try to create the topic in the remote cluster, the create operation and the error log may confuse the user. So it's better to query before create.
Another case is: if the user updates the partition in the local cluster(like updating partition from 3~5), the remote can't update.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since load topic could try to create the topic in the remote cluster, the create operation and the error log may confuse the user. So it's better to query before create.

The current implementation prints many logs if the topic has existed in the remote cluster. If we use the query before create, there are twice HTTP requests.

Another case is: if the user updates the partition in the local cluster(like updating partition from 3~5), the remote can't update.

If the user enables the geo-replication, we also need to update this topic for the remote cluster when updating the partition in the local cluster.

@nodece
Copy link
Member Author

nodece commented Dec 17, 2023

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

Codecov Report

Merging #21679 (d8d689d) into master (1919a0e) will decrease coverage by 36.68%.
Report is 34 commits behind head on master.
The diff coverage is 79.16%.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #21679       +/-   ##
=============================================
- Coverage     73.29%   36.62%   -36.68%     
+ Complexity    32769    12145    -20624     
=============================================
  Files          1893     1716      -177     
  Lines        140745   131172     -9573     
  Branches      15503    14331     -1172     
=============================================
- Hits         103166    48041    -55125     
- Misses        29498    76726    +47228     
+ Partials       8081     6405     -1676     
Flag Coverage Δ
inttests 24.09% <70.83%> (?)
systests 24.71% <0.00%> (+0.08%) ⬆️
unittests 31.59% <33.33%> (-41.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...oker/service/nonpersistent/NonPersistentTopic.java 42.01% <100.00%> (-27.57%) ⬇️
...sar/broker/service/persistent/PersistentTopic.java 51.78% <100.00%> (-27.00%) ⬇️
...rg/apache/pulsar/broker/service/AbstractTopic.java 59.45% <75.00%> (-27.17%) ⬇️

... and 1463 files with indirect coverage changes

@Technoboy- Technoboy- added this to the 3.3.0 milestone Dec 22, 2023
@codelipenghui
Copy link
Contributor

@nodece Is the issue happened on a shared configuration store or multiple configuration stores?

@nodece
Copy link
Member Author

nodece commented Jan 2, 2024

@nodece Is the issue happened on a shared configuration store or multiple configuration stores?

@codelipenghui It happened on multiple configuration stores.

@codelipenghui
Copy link
Contributor

@nodece Ok, I see. It think it could be a great topic to discuss under the mailing list. It should not be a fix here. It should a topic of how to handle topic creation for geo-replicated clusters with multiple configuration stores. Now, I think it's not clear enough in Pulsar. And for clusters with multiple configuration stores, users need to create the partitioned topic to multiple clusters manually.

For creating the partitioned topics to the remote cluster. We might also need to take the security issues into consideration, the credential to access the remote cluster should not always be the super user . And different clusters might have different limitations, e.g. max topics.

And there is another solution to sync the configuration store events across different clusters. I think it should be a related topic?

@nodece
Copy link
Member Author

nodece commented Jan 4, 2024

And for clusters with multiple configuration stores, users need to create the partitioned topic to multiple clusters manually.

@codelipenghui Sure, I agree with you. Usually, users need to create the topic on the remote clusters manually.

For creating the partitioned topics to the remote cluster. We might also need to take the security issues into consideration, the credential to access the remote cluster should not always be the super user . And different clusters might have different limitations, e.g. max topics.

This kind of behavior has existed for a long time, perhaps we did not consider max topics.

When the GEO is enabled on the namespace level, you create the topic on the local cluster, the broker also creates the topic on the remote cluster, please see:

if (!replicatedClusters.isEmpty()) {
replicatedClusters.forEach(cluster -> {
pulsar().getPulsarResources().getClusterResources().getClusterAsync(cluster)
.thenAccept(clusterDataOp -> {
((TopicsImpl) pulsar().getBrokerService()
.getClusterPulsarAdmin(cluster, clusterDataOp).topics())
.createPartitionedTopicAsync(
topicName.getPartitionedTopicName(), numPartitions, true, null);
})
.exceptionally(throwable -> {
log.error("Failed to create partition topic in cluster {}.", cluster, throwable);
return null;
});
});
}

And there is another #16424 to sync the configuration store events across different clusters. I think it should be a related topic?

#16424 is used to sync all configuration store to another cluster. For my case, I only sync some topic to another cluster by the GEO.

@codelipenghui
Copy link
Contributor

@nodece, I got your point.

Is it better to return an exception to users if the remote cluster doesn't have the partitioned topic? Then, users can create the partitioned topic for the remote cluster and enable geo-replication on the topic again.

The reason that I want to propose this solution is creating partitioned topics with set-replication-clusters is essentially a permission step over. A tenant admin can create partitioned topics on cluster-a doesn't mean can create topics on cluster-b. But now you can create a topic in another cluster under the guise of the permissions(usually it is superuser) of the client inside the broker.

Since it actually changed the behavior of the API. Could you please start with a PIP?

@liangyepianzhou
Copy link
Contributor

The reason that I want to propose this solution is creating partitioned topics with set-replication-clusters is essentially a permission step over. A tenant admin can create partitioned topics on cluster-a doesn't mean can create topics on cluster-b. But now you can create a topic in another cluster under the guise of the permissions(usually it is superuser) of the client inside the broker.

@codelipenghui In the current implementation If the geo-replication is enabled at the namespace level, the partition topic will be auto-created at the remote clusters when creating a topic at the local cluster.

@nodece nodece marked this pull request as draft March 22, 2024 08:40
@coderzc coderzc modified the milestones: 3.3.0, 3.4.0 May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs ready-to-test release/4.0.4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Topic level geo configure, the partitioned topics can not auto created on remote cluster.
8 participants