MapReduce is currently implemented in such a way that if you partition a dataset into a number of partitions that does not evenly divide the data, you get partitions of unequal sizes. The current implementation repeats the last data point to fill out the short partition. Would it make more sense to use a random data point instead?
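For illustration, here is a minimal sketch of what random padding could look like. The function name `partition` and its signature are assumptions, not the project's actual API; the point is just the contrast between repeating `data[-1]` and sampling a random point to fill the shortfall.

```python
import random

def partition(data, num_partitions):
    """Split `data` into `num_partitions` equal-sized partitions.

    When len(data) is not divisible by num_partitions, the shortfall
    is padded. This sketch pads with randomly sampled data points
    instead of repeating the last one.
    """
    size = -(-len(data) // num_partitions)  # ceiling division
    shortfall = size * num_partitions - len(data)
    # Proposed behavior: sample padding points uniformly at random
    # (with replacement), rather than repeating data[-1] as the
    # current implementation does.
    padded = data + random.choices(data, k=shortfall)
    return [padded[i * size:(i + 1) * size] for i in range(num_partitions)]

# e.g. partition(list(range(10)), 3) pads 2 random points to make
# three partitions of 4.
```

One possible argument for this: repeating the last point systematically over-weights it in whichever partition receives the padding, while random sampling spreads that bias across the dataset.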
What's the issue?