[Spark Core][MINOR] fix "default partitioner cannot partition array keys" error message in PairRDDfunctions #15045
There are 5 instances of this check in the file -- they should all be handled the same way. I'm not sure this is accurate either, because some code paths lead to these methods when HashPartitioner is used as a default. Just say that HashPartitioner can't be used? Maybe refactor this into one check method?
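The "one check method" idea could be sketched roughly as follows. This is only an illustration, not the code that was merged: `ArrayKeyCheck`, `requireHashableKeys`, and `OtherPartitioner` are hypothetical names, the message wording follows the reviewer's suggestion rather than the final patch, and the `Partitioner`, `HashPartitioner`, and `SparkException` types here are simplified stand-ins so the snippet runs without Spark on the classpath.

```scala
// Simplified stand-ins so the sketch compiles without Spark;
// these are NOT Spark's real classes.
trait Partitioner { def numPartitions: Int }
final class HashPartitioner(val numPartitions: Int) extends Partitioner
final class OtherPartitioner(val numPartitions: Int) extends Partitioner
final class SparkException(message: String) extends Exception(message)

object ArrayKeyCheck {
  // Hypothetical helper: a single home for the check that the thread
  // says appears five times in the file.
  def requireHashableKeys(keyClass: Class[_], partitioner: Partitioner): Unit = {
    if (keyClass.isArray && partitioner.isInstanceOf[HashPartitioner]) {
      // Names the partitioner type instead of guessing whether it was
      // "specified" or "default".
      throw new SparkException("HashPartitioner cannot partition array keys.")
    }
  }
}
```

Each call site would then reduce to a single call such as `ArrayKeyCheck.requireHashableKeys(keyClass, partitioner)`, so all five places report the same message.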
oh, there are 5 similar messages.
Test build #65206 has finished for PR 15045 at commit
Test build #65207 has finished for PR 15045 at commit
Why bother saying 'specified' or 'default' at all though? It's probably even more informative to state that HashPartitioner doesn't work, no matter what the source. If the user specified HashPartitioner, that's clear. If they didn't, they'll still recognize that the other half of the message is relevant: something doesn't like their array keys.
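For background on why array keys and hash partitioning do not mix: JVM arrays inherit identity-based `hashCode` and reference `equals`, so two arrays with the same contents are generally treated as different keys. The sketch below shows this without Spark. `ArrayKeyHashDemo` and `partitionFor` are hypothetical names, and the modulo mapping is an assumption about how a hash partitioner typically turns a hash code into a partition index, not a copy of Spark's implementation.

```scala
object ArrayKeyHashDemo {
  // Non-negative modulo: maps any Int hash code into [0, mod).
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Assumed shape of hash partitioning: hash the key, then take the modulo.
  def partitionFor(key: Any, numPartitions: Int): Int =
    nonNegativeMod(key.hashCode, numPartitions)

  def main(args: Array[String]): Unit = {
    val a = Array(1, 2, 3)
    val b = Array(1, 2, 3)

    // Same contents, but arrays compare by reference.
    println(s"sameElements: ${a.sameElements(b)}")
    println(s"a == b:       ${a == b}")

    // Identity hash codes typically differ, so equal-content arrays may
    // be sent to different partitions.
    println(s"array partitions: ${partitionFor(a, 8)} vs ${partitionFor(b, 8)}")

    // Seq hashes by content, so equal keys always agree on a partition.
    println(s"seq partitions:   ${partitionFor(a.toSeq, 8)} vs ${partitionFor(b.toSeq, 8)}")
  }
}
```

Converting array keys to a content-hashed collection such as `Seq` before partitioning is one common way around the problem.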
  def partitionBy(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
    if (keyClass.isArray && partitioner.isInstanceOf[HashPartitioner]) {
-     throw new SparkException("Default partitioner cannot partition array keys.")
+     throw new SparkException("Specified partitioner cannot partition array keys.")
Even this method is called by other Spark code with HashPartitioner, which might lead to this error telling the user the "specified" partitioner doesn't work when the user code didn't specify a partitioner.
Test build #65229 has finished for PR 15045 at commit
Test build #65227 has finished for PR 15045 at commit
jenkins test please
Jenkins, test this please.
Test build #65236 has finished for PR 15045 at commit
OK, can't hurt to be clear and specific about this. Merged to master.
…eys" error message in PairRDDfunctions

## What changes were proposed in this pull request?

In order to avoid confusing user, error message in `PairRDDfunctions` `Default partitioner cannot partition array keys.` is updated, the one in `partitionBy` is replaced with `Specified partitioner cannot partition array keys.` other is replaced with `Specified or default partitioner cannot partition array keys.`

## How was this patch tested?

N/A

Author: WeichenXu <WeichenXu123@outlook.com>

Closes apache#15045 from WeichenXu123/fix_partitionBy_error_message.
What changes were proposed in this pull request?
To avoid confusing users, the error message `Default partitioner cannot partition array keys.` in `PairRDDfunctions` is updated: the one in `partitionBy` is replaced with `Specified partitioner cannot partition array keys.`, and the others are replaced with `Specified or default partitioner cannot partition array keys.`
How was this patch tested?
N/A