Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pulsar-broker] Cleanup already deleted namespace topics #7473

Merged
merged 6 commits into from
Jul 14, 2020

Conversation

rdhabalia
Copy link
Contributor

Motivation

We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error

17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]

Modification

Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.

@rdhabalia rdhabalia added this to the 2.7.0 milestone Jul 7, 2020
@rdhabalia rdhabalia self-assigned this Jul 7, 2020
@rdhabalia
Copy link
Contributor Author

/pulsarbot run-failure-checks

@codelipenghui
Copy link
Contributor

/pulsarbot run-failure-checks

}).exceptionally(ex -> {
Throwable t = (ex instanceof CompletionException ? ex.getCause() : ex);
if (t instanceof TopicBusyException == false) {
if (t instanceof AlreadyClosedException) {
cleanAfterDisconnect(future);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're treating it as a "success" case here, another option would be to have the close request to return "ok" when closing a non-existing producer. what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@merlimat Yes, I don't see any issue by returning success on non-existing producer or consumer. I made that change. Can you please review it again.

@rdhabalia
Copy link
Contributor Author

/pulsarbot run-failure-checks

@merlimat merlimat merged commit 38155f6 into apache:master Jul 14, 2020
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
* [pulsar-broker] Cleanup already deleted namespace topics

* add error-codes

* Revert "add error-codes"

This reverts commit 972ae3e.

* Revert "[pulsar-broker] Cleanup already deleted namespace topics"

This reverts commit 570132d.

* send success if producer is already closed

* send success if consumer is already closed
codelipenghui pushed a commit that referenced this pull request Nov 3, 2021
Cherry pick from #7473.

#7473 has fix the `Cleanup already deleted namespace topics` issue, but with #8129 involved, changes have been 
changed back.

### Motivation
We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error
```
17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
```

### Modification
Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.
hangc0276 pushed a commit that referenced this pull request Nov 4, 2021
Cherry pick from #7473.

#7473 has fix the `Cleanup already deleted namespace topics` issue, but with #8129 involved, changes have been
changed back.

### Motivation
We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error
```
17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
```

### Modification
Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.

(cherry picked from commit 6b3fb41)
eolivelli pushed a commit to eolivelli/pulsar that referenced this pull request Nov 29, 2021
Cherry pick from apache#7473.

apache#7473 has fix the `Cleanup already deleted namespace topics` issue, but with apache#8129 involved, changes have been 
changed back.

### Motivation
We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error
```
17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
```

### Modification
Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.
lhotari pushed a commit to datastax/pulsar that referenced this pull request Dec 7, 2021
Cherry pick from apache#7473.

apache#7473 has fix the `Cleanup already deleted namespace topics` issue, but with apache#8129 involved, changes have been
changed back.

### Motivation
We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error
```
17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
```

### Modification
Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.

(cherry picked from commit 6b3fb41)
(cherry picked from commit 434fbcd)
codelipenghui pushed a commit that referenced this pull request Apr 19, 2022
Cherry pick from #7473.

#7473 has fix the `Cleanup already deleted namespace topics` issue, but with #8129 involved, changes have been
changed back.

### Motivation
We are having frequent issues when user removes cluster from the global namespace where broker from removed-cluster fails to unload topic and namespace bundle still loaded with the broker. It happens when broker from removed-cluster receives below error
```
17:38:52.199 [pulsar-io-22-28] ERROR org.apache.pulsar.broker.service.persistent.PersistentReplicator - [persistent://prop/global/ns/tp1][east -> west] Failed to close dispatch rate limiter: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
:
17:38:52.199 [pulsar-io-22-28] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [persistent://prop/global/ns/tp1][east -> west]] Exception: 'org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection' occured while trying to close the producer. retrying again in 0.1 s
:
17:38:52.351 [pulsar-io-22-37] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://prop/global/ns/tp1] Error closing topic
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: Producer was not registered on the connection
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
```

### Modification
Source Broker should return explicit error-code when producer is already closed and dest-broker from removed-cluster should handle this error and clean up the replicator and topic gracefully.

(cherry picked from commit 6b3fb41)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants