[SPARK-17521] Error when I use sparkContext.makeRDD(Seq()) #15077
  assertNotStopped()
  val indexToPrefs = seq.zipWithIndex.map(t => (t._2, t._1._2)).toMap
- new ParallelCollectionRDD[T](this, seq.map(_._1), seq.size, indexToPrefs)
+ new ParallelCollectionRDD[T](this, seq.map(_._1), math.max(seq.size, defaultParallelism), indexToPrefs)
I would say math.max(seq.size, 1). Really this method would normally just use the provided partition count (called "numSlices" in this old API) but this one doesn't have that parameter, which is more reason it's an odd man out. Still I think the most reasonable behavior is to use at least 1 partition.
To stay consistent with sc.parallelize, I think defaultParallelism is reasonable. That way the two calls below behave the same:
val rdd = sc.makeRDD(Seq())
val rdd = sc.parallelize(Seq())
Which one to use is your decision.
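The two candidate expressions discussed in this thread can be compared with a small sketch. The object and method names here are hypothetical and exist only for illustration; they are not part of Spark, and `defaultParallelism` is passed in as a plain integer.

```scala
object SliceChoice {
  // Conservative option: never fewer than one slice (hypothetical helper).
  def atLeastOne(seqSize: Int): Int = math.max(seqSize, 1)

  // Option from the patch: never fewer slices than the default parallelism
  // (hypothetical helper; defaultParallelism is supplied by the caller).
  def atLeastDefault(seqSize: Int, defaultParallelism: Int): Int =
    math.max(seqSize, defaultParallelism)
}
```

For an empty sequence both options avoid a slice count of zero. They differ for non-empty input that is shorter than the default parallelism: with three elements and an assumed default of 8, `atLeastOne(3)` keeps 3 slices while `atLeastDefault(3, 8)` raises the count to 8.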
The problem is that the default is OK because it's changeable, but here someone has no way to change it. I think it might be better to stay conservative.
Really this is such a corner case that it doesn't matter much. It only showed up for you because you specified no type on your Seq. If you had, it would have chosen the other overload which works fine.
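The overload behavior described here can be sketched without Spark. The object below is a hypothetical stand-in whose two methods only mimic the shape of the two `makeRDD` signatures and return a label instead of an RDD; the default of `2` is an arbitrary placeholder for `defaultParallelism`.

```scala
object OverloadDemo {
  // Stand-in for the overload that takes a plain sequence and a slice count.
  def makeRDD[T](seq: Seq[T], numSlices: Int = 2): String = "plain"

  // Stand-in for the overload that takes (value, preferred locations) pairs.
  def makeRDD[T](seq: Seq[(T, Seq[String])]): String = "with-location-prefs"
}

object OverloadDemoApp {
  def main(args: Array[String]): Unit = {
    // Untyped empty Seq: both overloads apply, and the one that needs no
    // default argument is preferred.
    println(OverloadDemo.makeRDD(Seq()))
    // Typed empty Seq: only the plain overload applies.
    println(OverloadDemo.makeRDD(Seq[Int]()))
  }
}
```

An untyped `Seq()` has element type `Nothing`, so it fits either signature, and Scala's overload resolution prefers the alternative that does not rely on a default argument. Giving the sequence an element type such as `Seq[Int]()` rules out the pair-based overload.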
OK, thanks for your explanation.
Jenkins test this please
Test build #65362 has finished for PR 15077 at commit
Merged to master/2.0
## What changes were proposed in this pull request?
When I use sc.makeRDD as shown below:
```
val data3 = sc.makeRDD(Seq())
println(data3.partitions.length)
```
I got an error:
Exception in thread "main" java.lang.IllegalArgumentException: Positive number of slices required
We can fix this bug by modifying the last line to check seq.size:
```
def makeRDD[T: ClassTag](seq: Seq[(T, Seq[String])]): RDD[T] = withScope {
  assertNotStopped()
  val indexToPrefs = seq.zipWithIndex.map(t => (t._2, t._1._2)).toMap
  new ParallelCollectionRDD[T](this, seq.map(_._1), math.max(seq.size, defaultParallelism), indexToPrefs)
}
```
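To illustrate why a slice count of zero fails, here is a minimal sketch of a slicing routine with the same guard. It is a simplified stand-in written for this example, not Spark's actual `ParallelCollectionRDD` code; `SliceSketch` and its `slice` method are hypothetical names.

```scala
object SliceSketch {
  // Splits seq into numSlices contiguous chunks of near-equal length.
  def slice[T](seq: Seq[T], numSlices: Int): Seq[Seq[T]] = {
    // Guard mirroring the error reported above.
    if (numSlices < 1) {
      throw new IllegalArgumentException("Positive number of slices required")
    }
    (0 until numSlices).map { i =>
      // Long arithmetic avoids overflow for large sequences.
      val start = ((i * seq.length.toLong) / numSlices).toInt
      val end = (((i + 1) * seq.length.toLong) / numSlices).toInt
      seq.slice(start, end)
    }
  }
}
```

With an empty sequence, passing `seq.size` as the slice count means passing 0, which trips the guard. Any lower bound of at least 1 yields one or more empty slices instead.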
## How was this patch tested?
manual tests
Author: codlife <1004910847@qq.com>
Author: codlife <wangjianfei15@otcaix.iscas.ac.cn>
Closes #15077 from codlife/master.
(cherry picked from commit 647ee05)
Signed-off-by: Sean Owen <sowen@cloudera.com>