Commit ec11300

Replace numTasks with numPartitions in programming guide
1 parent 6550086 commit ec11300

1 file changed (+8 -8 lines changed)

docs/rdd-programming-guide.md

Lines changed: 8 additions & 8 deletions
@@ -978,40 +978,40 @@ for details.
 <td> Return a new RDD that contains the intersection of elements in the source dataset and the argument. </td>
 </tr>
 <tr>
-<td> <b>distinct</b>([<i>numTasks</i>])) </td>
+<td> <b>distinct</b>([<i>numPartitions</i>])) </td>
 <td> Return a new dataset that contains the distinct elements of the source dataset.</td>
 </tr>
 <tr>
-<td> <b>groupByKey</b>([<i>numTasks</i>]) <a name="GroupByLink"></a> </td>
+<td> <b>groupByKey</b>([<i>numPartitions</i>]) <a name="GroupByLink"></a> </td>
 <td> When called on a dataset of (K, V) pairs, returns a dataset of (K, Iterable&lt;V&gt;) pairs. <br />
 <b>Note:</b> If you are grouping in order to perform an aggregation (such as a sum or
 average) over each key, using <code>reduceByKey</code> or <code>aggregateByKey</code> will yield much better
 performance.
 <br />
 <b>Note:</b> By default, the level of parallelism in the output depends on the number of partitions of the parent RDD.
-You can pass an optional <code>numTasks</code> argument to set a different number of tasks.
+You can pass an optional <code>numPartitions</code> argument to set a different number of tasks.
 </td>
 </tr>
 <tr>
-<td> <b>reduceByKey</b>(<i>func</i>, [<i>numTasks</i>]) <a name="ReduceByLink"></a> </td>
+<td> <b>reduceByKey</b>(<i>func</i>, [<i>numPartitions</i>]) <a name="ReduceByLink"></a> </td>
 <td> When called on a dataset of (K, V) pairs, returns a dataset of (K, V) pairs where the values for each key are aggregated using the given reduce function <i>func</i>, which must be of type (V,V) => V. Like in <code>groupByKey</code>, the number of reduce tasks is configurable through an optional second argument. </td>
 </tr>
 <tr>
-<td> <b>aggregateByKey</b>(<i>zeroValue</i>)(<i>seqOp</i>, <i>combOp</i>, [<i>numTasks</i>]) <a name="AggregateByLink"></a> </td>
+<td> <b>aggregateByKey</b>(<i>zeroValue</i>)(<i>seqOp</i>, <i>combOp</i>, [<i>numPartitions</i>]) <a name="AggregateByLink"></a> </td>
 <td> When called on a dataset of (K, V) pairs, returns a dataset of (K, U) pairs where the values for each key are aggregated using the given combine functions and a neutral "zero" value. Allows an aggregated value type that is different than the input value type, while avoiding unnecessary allocations. Like in <code>groupByKey</code>, the number of reduce tasks is configurable through an optional second argument. </td>
 </tr>
 <tr>
-<td> <b>sortByKey</b>([<i>ascending</i>], [<i>numTasks</i>]) <a name="SortByLink"></a> </td>
+<td> <b>sortByKey</b>([<i>ascending</i>], [<i>numPartitions</i>]) <a name="SortByLink"></a> </td>
 <td> When called on a dataset of (K, V) pairs where K implements Ordered, returns a dataset of (K, V) pairs sorted by keys in ascending or descending order, as specified in the boolean <code>ascending</code> argument.</td>
 </tr>
 <tr>
-<td> <b>join</b>(<i>otherDataset</i>, [<i>numTasks</i>]) <a name="JoinLink"></a> </td>
+<td> <b>join</b>(<i>otherDataset</i>, [<i>numPartitions</i>]) <a name="JoinLink"></a> </td>
 <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.
 Outer joins are supported through <code>leftOuterJoin</code>, <code>rightOuterJoin</code>, and <code>fullOuterJoin</code>.
 </td>
 </tr>
 <tr>
-<td> <b>cogroup</b>(<i>otherDataset</i>, [<i>numTasks</i>]) <a name="CogroupLink"></a> </td>
+<td> <b>cogroup</b>(<i>otherDataset</i>, [<i>numPartitions</i>]) <a name="CogroupLink"></a> </td>
 <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (Iterable&lt;V&gt;, Iterable&lt;W&gt;)) tuples. This operation is also called <code>groupWith</code>. </td>
 </tr>
 <tr>
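
For readers skimming this diff, the renamed argument may be easier to picture with a toy model. The sketch below is plain Python, not Spark code: `reduce_by_key` and its `num_partitions` parameter are hypothetical stand-ins for `reduceByKey(func, [numPartitions])`, and it assumes simple hash partitioning (each key goes to partition `hash(key) % num_partitions`). It only illustrates that the argument fixes how many output partitions, and hence reduce tasks, there are.

```python
def reduce_by_key(pairs, func, num_partitions):
    """Toy stand-in for reduceByKey(func, [numPartitions]); not the Spark API."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    # One dict per output partition. Assumption: hash partitioning, so a given
    # key always lands in the same partition, hash(key) % num_partitions.
    partitions = [dict() for _ in range(num_partitions)]
    for key, value in pairs:
        bucket = partitions[hash(key) % num_partitions]
        # Fold the new value into the running result for this key.
        bucket[key] = func(bucket[key], value) if key in bucket else value
    return partitions


pairs = [(1, 10), (2, 20), (1, 5), (3, 7), (2, 1)]
parts = reduce_by_key(pairs, lambda a, b: a + b, num_partitions=4)

# Flatten the partitions back into one mapping to inspect the totals.
merged = {k: v for part in parts for k, v in part.items()}
```

Here `parts` always has exactly `num_partitions` entries, some of which may be empty, and all values for a key are combined within a single partition. In a real job the corresponding call would be along the lines of `rdd.reduceByKey(func, numPartitions)`.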
