Skip to content

Conversation

@Ngone51
Copy link
Member

@Ngone51 Ngone51 commented Dec 9, 2022

What changes were proposed in this pull request?

This PR introduces a new layer IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint and changes all the endpoints which currently extend IsolatedRpcEndpoint to extend IsolatedThreadSafeRpcEndpoint instead. Besides, IsolatedThreadSafeRpcEndpoint has overridden def threadCount(): Int = 1 with final to avoid mistakenly having thread count > 1 for the thread-safe endpoint.

Why are the changes needed?

Busy endpoints like DriverEndpoint, BlockManagerMasterEndpoint were known to be thread-safe when they extended ThreadSafeRpcEndpoint. After we introduce IsolatedRpcEndpoint in SPARK-29398, all these busy endpoints have been changed to extend IsolatedRpcEndpoint. These busy endpoints are still thread-safe since IsolatedRpcEndpoint by default only uses 1 thread. But that's not explicit to developers now compared to ThreadSafeRpcEndpoint (I actually find some people became to be confused about whether the endpoint is thread-safe or not). From time to time, IsolatedRpcEndpoint#threadCount might be mistakenly overridden with threadCount > 1 leading to thread-safety being broken. So this PR refactor here a bit to avoid the potential issue.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

N/A

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming IsolatedRpcEndpoint to IsolatedThreadSafeRpcEndpoint simply? It's hard for me to imagine IsolatedButNotThreadSafeRpcEndpoint.

@Ngone51
Copy link
Member Author

Ngone51 commented Dec 9, 2022

What about renaming IsolatedRpcEndpoint to IsolatedThreadSafeRpcEndpoint simply?

I think this would breach the original design by the author argued at #26059 (comment). I don't want to break it, though I also don't have a use case of IsolatedButNotThreadSafeRpcEndpoint in my mind.

/**
* Limit the threadCount to 1 so that messages are ensured to be handled in a thread-safe way.
*/
final def threadCount(): Int = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the PR description, this final def is the main contribution, @Ngone51 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and also the naming of IsolatedThreadSafeRpcEndpoint.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-41460][CORE] Introduce IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint [SPARK-41460][CORE] Introduce IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint Dec 10, 2022
Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
With this change, IsolatedRpcEndpoint is essentially a noop - other than as a marker interface, right ? Wondering if we want to remove it entirely

@dongjoon-hyun
Copy link
Member

According to @Ngone51 's comment, he seems to keep it as the original design concept. Otherwise, this PR should be a simple renaming like my comment.

@dongjoon-hyun
Copy link
Member

Let me merge this to Apache Spark 3.4.0 because this PR is conservative and we can remove the unused one later, @Ngone51 and @mridulm .

@Ngone51
Copy link
Member Author

Ngone51 commented Dec 12, 2022

Thanks @dongjoon-hyun @mridulm

beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
…nd `IsolatedRpcEndpoint`

### What changes were proposed in this pull request?

This PR introduces a new layer `IsolatedThreadSafeRpcEndpoint` to extend `IsolatedRpcEndpoint` and changes all the endpoints which currently extend `IsolatedRpcEndpoint` to extend `IsolatedThreadSafeRpcEndpoint` instead. Besides, `IsolatedThreadSafeRpcEndpoint` has overridden `def threadCount(): Int = 1` with `final` to avoid mistakenly having thread count > 1 for the thread-safe endpoint.

### Why are the changes needed?

Busy endpoints like `DriverEndpoint`, `BlockManagerMasterEndpoint` were known to be thread-safe when they extended `ThreadSafeRpcEndpoint`. After we introduce `IsolatedRpcEndpoint` in [SPARK-29398](https://issues.apache.org/jira/browse/SPARK-29398),  all these busy endpoints have been changed to extend `IsolatedRpcEndpoint`. These busy endpoints are still thread-safe since `IsolatedRpcEndpoint` by default only uses 1 thread. But that's not explicit to developers now compared to `ThreadSafeRpcEndpoint` (I actually find some people became to be confused about whether the endpoint is thread-safe or not). From time to time, `IsolatedRpcEndpoint#threadCount` might be mistakenly overridden with threadCount > 1 leading to thread-safety being broken. So this PR refactor here a bit to avoid the potential issue.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

N/A

Closes apache#38995 from Ngone51/isolated.

Authored-by: Yi Wu <yi.wu@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants