@gaogaotiantian
Contributor

What changes were proposed in this pull request?

Always pass `runnerConf` to the Python worker, even if it is not used.

Why are the changes needed?

This is part of the effort to consolidate the protocol from the JVM to the Python worker. The runner conf is currently passed in several different ways, and in some cases it is not passed at all. That makes the worker-side code messy, because the worker has to decide whether to read the conf based on the eval type. Reading an empty conf is very cheap, so we can simply do it unconditionally.
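To illustrate why an unconditional read is cheap, here is a minimal sketch of a count-prefixed conf exchange. It is not the actual Spark wire format or worker code: the function names `write_runner_conf` and `read_runner_conf`, the big-endian int32 length prefixes, and the key `example.key` are all assumptions made for the example.

```python
import io
import struct
from typing import BinaryIO, Dict


def _write_str(stream: BinaryIO, text: str) -> None:
    # Hypothetical framing: 4-byte big-endian length, then UTF-8 bytes.
    data = text.encode("utf-8")
    stream.write(struct.pack(">i", len(data)))
    stream.write(data)


def _read_exact(stream: BinaryIO, n: int) -> bytes:
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"expected {n} bytes, got {len(data)}")
    return data


def _read_str(stream: BinaryIO) -> str:
    (length,) = struct.unpack(">i", _read_exact(stream, 4))
    return _read_exact(stream, length).decode("utf-8")


def write_runner_conf(stream: BinaryIO, conf: Dict[str, str]) -> None:
    """Sender side (hypothetical): always write the conf, even when empty."""
    stream.write(struct.pack(">i", len(conf)))
    for key, value in conf.items():
        _write_str(stream, key)
        _write_str(stream, value)


def read_runner_conf(stream: BinaryIO) -> Dict[str, str]:
    """Worker side (hypothetical): always read the conf, whatever the eval type."""
    (count,) = struct.unpack(">i", _read_exact(stream, 4))
    conf: Dict[str, str] = {}
    for _ in range(count):
        key = _read_str(stream)
        conf[key] = _read_str(stream)
    return conf


if __name__ == "__main__":
    buf = io.BytesIO()
    write_runner_conf(buf, {})  # an empty conf is a single int32 on the wire
    buf.seek(0)
    print(read_runner_conf(buf))
```

Under these assumptions an empty conf costs four bytes and one read, so the worker does not need a branch on eval type to decide whether a conf follows.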

With this infrastructure in place, vanilla Python UDFs will also be able to pass runner conf in the future. It also lets us refactor our JVM worker code later.

Does this PR introduce any user-facing change?

No

How was this patch tested?

pyspark-sql passed locally. Running the rest on CI.

Was this patch authored or co-authored using generative AI tooling?

No

@gaogaotiantian gaogaotiantian marked this pull request as ready for review December 6, 2025 04:00
@HyukjinKwon HyukjinKwon changed the title from "[SPARK-54615] Always pass runner_conf to python worker" to "[SPARK-54615][PYTHON] Always pass runner_conf to python worker" Dec 7, 2025
@HyukjinKwon
Member

Merged to master.

xu20160924 pushed a commit to xu20160924/spark that referenced this pull request Dec 9, 2025

Closes apache#53353 from gaogaotiantian/always-pass-runnerconf.

Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@gaogaotiantian gaogaotiantian deleted the always-pass-runnerconf branch December 19, 2025 00:13