[improve][broker] Don't use forkjoin pool by default for deleting partitioned topics #22598

lhotari · 2024-04-26T06:59:59Z

Motivation

While investigating recent test failures and Pulsar CI OOME issues (#22588), heap dumps revealed that
the ForkJoin pool's work queue retains a large number of CompletableFuture-related instances.

The heap dump confirms that these originate from the NamespaceResources.PartitionedTopicResources#runWithMarkDeleteAsync method, at this location:

pulsar/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/resources/NamespaceResources.java

Lines 345 to 374 in bbdc173

    
    markPartitionedTopicDeletedAsync(topic).whenCompleteAsync((markResult, markExc) -> {
        final boolean mdFound;
        if (markExc != null) {
            if (markExc.getCause() instanceof MetadataStoreException.NotFoundException) {
                mdFound = false;
            } else {
                log.error("Failed to mark the topic {} as deleted", topic, markExc);
                future.completeExceptionally(markExc);
                return;
            }
        } else {
            mdFound = true;
        }

        supplier.get().whenComplete((deleteResult, deleteExc) -> {
            if (deleteExc != null && mdFound) {
                unmarkPartitionedTopicDeletedAsync(topic)
                        .thenRun(() -> future.completeExceptionally(deleteExc))
                        .exceptionally(ex -> {
                            log.warn("Failed to unmark the topic {} as deleted", topic, ex);
                            future.completeExceptionally(deleteExc);
                            return null;
                        });
            } else if (deleteExc != null) {
                future.completeExceptionally(deleteExc);
            } else {
                future.complete(deleteResult);
            }
        });
    });

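Because whenCompleteAsync is called here without an executor argument, the callback is scheduled on the ForkJoin common pool by default. The assumption is that a graceful shutdown in tests, without memory leaks, is easier to achieve with a provided executor, since that instance isn't shared the way the ForkJoin common pool executor is.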
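To illustrate the assumption, here is a minimal, self-contained sketch rather than Pulsar code. It passes an explicit executor to whenCompleteAsync and then shuts that executor down, which is the lifecycle control the shared common pool does not offer. The class name, the thread name and the helper methods are hypothetical and exist only for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Pulsar.
class DedicatedExecutorSketch {
    // Hypothetical thread name, used only to show where the callback runs.
    static final String THREAD_NAME = "hypothetical-broker-executor";

    // Creates an executor owned by the caller, so the caller can also shut it down.
    static ExecutorService newDedicatedExecutor() {
        return Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, THREAD_NAME);
            thread.setDaemon(true);
            return thread;
        });
    }

    // Runs a whenCompleteAsync callback on the given executor and
    // returns the name of the thread that executed it.
    static String callbackThreadName(Executor executor) {
        CompletableFuture<String> name = new CompletableFuture<>();
        CompletableFuture.completedFuture("marked")
                .whenCompleteAsync(
                        (result, exc) -> name.complete(Thread.currentThread().getName()),
                        executor);
        return name.join();
    }

    // Shuts the executor down and reports whether it terminated in time.
    static boolean shutdownAndAwait(ExecutorService executor) {
        executor.shutdown();
        try {
            return executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = newDedicatedExecutor();
        System.out.println("callback ran on: " + callbackThreadName(executor));
        System.out.println("terminated: " + shutdownAndAwait(executor));
    }
}
```

An executor owned by a single test or service instance can be stopped together with it, so queued callbacks do not outlive the instance. Work queued on the JVM-wide common pool is not tied to any one instance's lifecycle.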

Modifications

Instead of using the ForkJoin common pool by default in NamespaceResources.PartitionedTopicResources#runWithMarkDeleteAsync, use a specific executor. For the Pulsar broker, this is the executor returned by PulsarService#getExecutor().
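The shape of this change can be sketched roughly as follows. This is a simplified stand-in, not the actual patch: the class MarkDeleteSketch, its constructors and the Supplier-based mark and delete parameters are hypothetical, and the mdFound and unmark handling from the original method is omitted for brevity. The no-argument constructor that falls back to the common pool is an assumption about how backwards compatibility with existing unit tests might be kept.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for the resource class; not the Pulsar source.
class MarkDeleteSketch {
    private final Executor executor;

    // Assumed compatibility constructor: keeps the previous default behaviour.
    MarkDeleteSketch() {
        this(ForkJoinPool.commonPool());
    }

    // The owner (for the broker, whatever PulsarService#getExecutor() returns)
    // supplies the executor that runs the callback.
    MarkDeleteSketch(Executor executor) {
        this.executor = executor;
    }

    // 'mark' and 'delete' are hypothetical stand-ins for the mark-deleted
    // metadata update and the delete operation supplied by the caller.
    <T> CompletableFuture<T> runWithMarkDelete(Supplier<CompletableFuture<Void>> mark,
                                               Supplier<CompletableFuture<T>> delete) {
        CompletableFuture<T> future = new CompletableFuture<>();
        mark.get().whenCompleteAsync((markResult, markExc) -> {
            if (markExc != null) {
                future.completeExceptionally(markExc);
                return;
            }
            delete.get().whenComplete((deleteResult, deleteExc) -> {
                if (deleteExc != null) {
                    future.completeExceptionally(deleteExc);
                } else {
                    future.complete(deleteResult);
                }
            });
        }, executor); // explicit executor instead of the implicit common pool
        return future;
    }
}
```

The only behavioural difference from the original snippet is the second argument to whenCompleteAsync. Everything else in the callback chain stays the same.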

Documentation

doc
doc-required
doc-not-needed
doc-complete

heesung-sn

Looks good to me, but this could impact the performance. I would check with others if there are any performance concerns here.

lhotari added type/flaky-tests ready-to-test labels Apr 26, 2024

lhotari added this to the 3.3.0 milestone Apr 26, 2024

lhotari requested review from poorbarcode and heesung-sn April 26, 2024 06:59

lhotari self-assigned this Apr 26, 2024

github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 26, 2024

heesung-sn approved these changes Apr 26, 2024

dao-jun approved these changes Apr 26, 2024

lhotari added 2 commits April 26, 2024 17:26

[improve][broker] Don't use forkjoin pool by default for deleting par…

42476e1

…titioned topics in NamespaceResources.PartitionedTopicResources#runWithMarkDeleteAsync

Add constructor for backwards compatibility with unit tests

490589e

lhotari force-pushed the lh-dont-use-forkjoin-pool-for-deleting branch from 2c56c79 to 490589e Compare April 26, 2024 14:32

lhotari merged commit 8323a3c into apache:master Apr 26, 2024
48 of 50 checks passed

lhotari added the release/3.0.8 label Nov 23, 2024

lhotari added a commit that referenced this pull request Nov 23, 2024

[improve][broker] Don't use forkjoin pool by default for deleting par…

489628d

…titioned topics (#22598) (cherry picked from commit 8323a3c)

lhotari added the cherry-picked/branch-3.0 label Nov 23, 2024

nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 26, 2024

[improve][broker] Don't use forkjoin pool by default for deleting par…

f4b15fa

…titioned topics (apache#22598) (cherry picked from commit 8323a3c) (cherry picked from commit 489628d)

srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 26, 2024

[improve][broker] Don't use forkjoin pool by default for deleting par…

68d5c3b

…titioned topics (apache#22598) (cherry picked from commit 8323a3c) (cherry picked from commit 489628d)
