Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][broker] GetPartitionMetadata fail also can produce messages #23050

Conversation

congbobo184
Copy link
Contributor

@congbobo184 congbobo184 commented Jul 18, 2024

Motivation

GetPartitionMetadata fail also can produce messages

  • 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

  • if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

  • if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(

public void channelInactive(ChannelHandlerContext ctx) throws Exception {
super.channelInactive(ctx);
connectionsClosedCounter.increment();
lastDisconnectedTimestamp = System.currentTimeMillis();
log.info("{} Disconnected", ctx.channel());
if (!connectionFuture.isDone()) {
connectionFuture.completeExceptionally(new PulsarClientException("Connection already closed"));
}
ConnectException e = new ConnectException(
"Disconnected from server at " + ctx.channel().remoteAddress());
// Fail out all the pending ops
pendingRequests.forEach((key, future) -> {
if (pendingRequests.remove(key, future) && !future.isDone()) {
future.completeExceptionally(e);
}
});
waitingLookupRequests.forEach(pair -> pair.getRight().getRight().completeExceptionally(e));
// Notify all attached producers/consumers so they have a chance to reconnect
producers.forEach((id, producer) -> producer.connectionClosed(this, Optional.empty(), Optional.empty()));
consumers.forEach((id, consumer) -> consumer.connectionClosed(this, Optional.empty(), Optional.empty()));
)

  • this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

Modifications

GetPartitionMetadata return MetadataError when throw MetadataStoreException

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

Copy link

@congbobo184 Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

…isconnect_partition_topic_also_can_send_message
…isconnect_partition_topic_also_can_send_message
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.46%. Comparing base (bbc6224) to head (6a937ce).
Report is 461 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #23050      +/-   ##
============================================
- Coverage     73.57%   73.46%   -0.12%     
- Complexity    32624    33187     +563     
============================================
  Files          1877     1915      +38     
  Lines        139502   143945    +4443     
  Branches      15299    15726     +427     
============================================
+ Hits         102638   105746    +3108     
- Misses        28908    30093    +1185     
- Partials       7956     8106     +150     
Flag Coverage Δ
inttests 27.62% <0.00%> (+3.03%) ⬆️
systests 24.80% <0.00%> (+0.47%) ⬆️
unittests 72.51% <100.00%> (-0.33%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...va/org/apache/pulsar/broker/service/ServerCnx.java 72.50% <100.00%> (+0.35%) ⬆️

... and 511 files with indirect coverage changes

@congbobo184 congbobo184 merged commit 6fa3bcf into apache:master Jul 22, 2024
53 checks passed
Technoboy- pushed a commit that referenced this pull request Jul 25, 2024
…#23050)

### Motivation

GetPartitionMetadata fail also can produce messages

- 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

- if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

- if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(https://github.com/apache/pulsar/blob/5c6602cbb3660a696bf960f2847aac1a2ae037d2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L323-L345)
- this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

### Modifications

GetPartitionMetadata return MetadataError when throw MetadataStoreException
lhotari pushed a commit that referenced this pull request Jul 29, 2024
…#23050)

### Motivation

GetPartitionMetadata fail also can produce messages

- 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

- if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

- if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(https://github.com/apache/pulsar/blob/5c6602cbb3660a696bf960f2847aac1a2ae037d2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L323-L345)
- this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

### Modifications

GetPartitionMetadata return MetadataError when throw MetadataStoreException

(cherry picked from commit 6fa3bcf)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 29, 2024
…apache#23050)

### Motivation

GetPartitionMetadata fail also can produce messages

- 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

- if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

- if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(https://github.com/apache/pulsar/blob/5c6602cbb3660a696bf960f2847aac1a2ae037d2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L323-L345)
- this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

### Modifications

GetPartitionMetadata return MetadataError when throw MetadataStoreException

(cherry picked from commit 6fa3bcf)
(cherry picked from commit 3b3e90b)
lhotari pushed a commit that referenced this pull request Jul 29, 2024
…#23050)

GetPartitionMetadata fail also can produce messages

- 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

- if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

- if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(https://github.com/apache/pulsar/blob/5c6602cbb3660a696bf960f2847aac1a2ae037d2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L323-L345)
- this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

GetPartitionMetadata return MetadataError when throw MetadataStoreException

(cherry picked from commit 6fa3bcf)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 30, 2024
…apache#23050)

### Motivation

GetPartitionMetadata fail also can produce messages

- 'autoUpdatePartitionsInterval' will get partition metadata and will regularly detect partition changes

- if GetPartitionMetadata will return ServiceNotReady, client receive ServiceNotReady will close cnx

- if close the current cnx, all producers and consumers witch use this cnx will close and reconnect

(https://github.com/apache/pulsar/blob/5c6602cbb3660a696bf960f2847aac1a2ae037d2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L323-L345)
- this will influence a lot of producers and consumers and if current time the zk not available and bundle cache not exist the topic's bundle metadata, the client can't send messages to broker because the producer lookup will fail

### Modifications

GetPartitionMetadata return MetadataError when throw MetadataStoreException

(cherry picked from commit 6fa3bcf)
(cherry picked from commit 3b3e90b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants