MapReduce is currently implemented in such a way that if you partition a dataset into a number of partitions that does not evenly divide the data, you get partitions of unequal sizes. The current implementation repeats the last data point to fill out the short partition. Would it make more sense to use a random data point instead?
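For illustration, here is a minimal sketch of what random padding could look like. The function name `partition` and its signature are assumptions, not the project's actual API; the point is just the contrast between repeating `data[-1]` and sampling a random point to fill the shortfall.

```python
import random

def partition(data, num_partitions):
    """Split `data` into `num_partitions` equal-sized partitions.

    When len(data) is not divisible by num_partitions, the shortfall
    is padded. This sketch pads with randomly sampled data points
    instead of repeating the last one.
    """
    size = -(-len(data) // num_partitions)  # ceiling division
    shortfall = size * num_partitions - len(data)
    # Proposed behavior: sample padding points uniformly at random
    # (with replacement), rather than repeating data[-1] as the
    # current implementation does.
    padded = data + random.choices(data, k=shortfall)
    return [padded[i * size:(i + 1) * size] for i in range(num_partitions)]

# e.g. partition(list(range(10)), 3) pads 2 random points to make
# three partitions of 4.
```

One possible argument for this: repeating the last point systematically over-weights it in whichever partition receives the padding, while random sampling spreads that bias across the dataset.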
What's the issue?