Fix sorts/bucket_sort.py implementation #5786
@@ -30,7 +30,7 @@
 from __future__ import annotations


-def bucket_sort(my_list: list) -> list:
+def bucket_sort(my_list: list, bucket_count: int = 10) -> list:
Why is bucket_count set to 10?
It's just a default value; it could be 100 or anything else.
Alternatively, we could set it to None by default and, when it's not set, compute it with a heuristic, e.g. len(my_list) // 2.
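A minimal sketch of the None-default idea from the comment above. The helper name resolve_bucket_count is hypothetical and not part of the PR; the heuristic len(my_list) // 2 is taken from the comment, while the clamp to at least one bucket and the positivity check are assumptions added here.

```python
from typing import Optional


def resolve_bucket_count(my_list: list, bucket_count: Optional[int] = None) -> int:
    """Return the caller's bucket_count, or a heuristic value when it is None.

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    if bucket_count is None:
        # Heuristic from the review comment. max(1, ...) is an added guard so
        # that lists with fewer than two items still get one bucket.
        return max(1, len(my_list) // 2)
    if bucket_count <= 0:
        # Assumed policy: reject non-positive counts instead of guessing.
        raise ValueError("bucket_count must be a positive integer")
    return bucket_count
```

With this shape, bucket_sort could call resolve_bucket_count(my_list, bucket_count) as its first step, so an explicit value from the caller always takes precedence over the heuristic.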
I suggest that, rather than having bucket_count as a parameter, we make it part of the implementation logic and calculate the bucket count from the type of machine: 32 for 32-bit, 64 for 64-bit. That way we eliminate user-driven errors.
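One possible reading of this suggestion, as a sketch only: it assumes "type of machine" means the pointer width of the running Python build, and the function name machine_bucket_count is hypothetical.

```python
import struct


def machine_bucket_count() -> int:
    """Return the pointer width of the running interpreter in bits.

    Hypothetical helper: yields 32 on a 32-bit build and 64 on a 64-bit build.
    """
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8
```

Note that the value returned here depends only on the interpreter build, not on the length or range of the list being sorted.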
Great work! I left a few suggestions in the comments; please let me know if you have any questions.
sorts/bucket_sort.py
Outdated
     buckets: list[list] = [[] for _ in range(bucket_count)]

     for i in range(len(my_list)):
-        buckets[(int(my_list[i] - min_value) // bucket_count)].append(my_list[i])
+        index = min(int((my_list[i] - min_value) / bucket_size), bucket_count - 1)
+        buckets[index].append(my_list[i])

     return [v for bucket in buckets for v in sorted(bucket)]
Can we call bucket_sort recursively rather than relying on the built-in sorted function? That way we eliminate the dependence on built-in Python functions.
''' pseudo-code
return [v for bucket in buckets for v in bucket_sort(bucket)]
'''
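A naive recursive call needs base cases, otherwise a bucket whose values are all equal would recurse forever. The sketch below is one way to do it, not the PR's implementation: the name recursive_bucket_sort, the definition of bucket_size, and the requirement bucket_count >= 2 are assumptions made here. It still uses the built-ins min and max.

```python
def recursive_bucket_sort(my_list: list, bucket_count: int = 10) -> list:
    """Sort numbers by bucketing, then sorting each bucket recursively.

    Illustrative sketch only; names and guards are assumptions.
    """
    if bucket_count < 2:
        # With a single bucket the recursion would never shrink the input.
        raise ValueError("bucket_count must be at least 2")
    if len(my_list) <= 1:
        return list(my_list)

    min_value, max_value = min(my_list), max(my_list)
    if min_value == max_value:
        # All values are equal: already sorted, and bucket_size would be 0.
        return list(my_list)

    # Assumed definition: equal-width buckets spanning [min_value, max_value].
    bucket_size = (max_value - min_value) / bucket_count
    buckets: list = [[] for _ in range(bucket_count)]
    for value in my_list:
        # Clamp so that max_value lands in the last bucket.
        index = min(int((value - min_value) / bucket_size), bucket_count - 1)
        buckets[index].append(value)

    # min_value goes to the first bucket and max_value to the last, so every
    # bucket is strictly smaller than my_list and the recursion terminates.
    return [
        v for bucket in buckets for v in recursive_bucket_sort(bucket, bucket_count)
    ]
```

The trade-off is recursion depth: heavily clustered inputs can take many levels before each bucket shrinks to a single distinct value.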
Very nit-picky / low priority: can we change the variable name to something more descriptive? Maybe list_to_sort instead of my_list?
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I believe the first issue you mentioned has already been fixed in #6005, so there's now a merge conflict with this PR. However, I still think that making bucket_count a function parameter is a good change, so I'll try to fix the merge conflict and merge this PR.
* Fix sorts/bucket_sort.py
* updating DIRECTORY.md
* Remove unused var in bucket_sort.py
* Fix list index in bucket_sort.py

---------

Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>
Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>
Describe your change:
Two issues:
* The current implementation is equivalent to the built-in sorted function, because all the items go into the same bucket.
* The bucket count is equal to the integer difference between min and max; the problem is illustrated by a new test case. In general, we don't know the type of the data, so bucket_count should be set from outside, or it can be replaced by a heuristic such as len(my_list) * 2.

* Add an algorithm?
* Fix a bug or typo in an existing algorithm?
* Documentation change?
Checklist:
Fixes: #{$ISSUE_NO}.
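The first issue in the description can be checked directly by comparing the bucket index computed before and after the change. This is a sketch under stated assumptions: the two function names are hypothetical, the old bucket count is assumed to be int(max_value - min_value) + 1, and bucket_size is assumed to be (max_value - min_value) / bucket_count, since neither definition appears in the diff shown in this thread.

```python
def old_bucket_indices(my_list: list) -> list:
    """Bucket index of each item under the pre-PR indexing expression."""
    min_value, max_value = min(my_list), max(my_list)
    # Assumption: the old count was the integer range plus one.
    bucket_count = int(max_value - min_value) + 1
    # int(v - min_value) is always smaller than bucket_count, so the floor
    # division yields 0 for every item.
    return [int(v - min_value) // bucket_count for v in my_list]


def new_bucket_indices(my_list: list, bucket_count: int = 10) -> list:
    """Bucket index of each item under the indexing expression from the PR."""
    min_value, max_value = min(my_list), max(my_list)
    # Assumption: equal-width buckets over the value range.
    bucket_size = (max_value - min_value) / bucket_count
    if bucket_size == 0:
        # Added guard: all values equal, so everything shares bucket 0.
        return [0 for _ in my_list]
    return [
        min(int((v - min_value) / bucket_size), bucket_count - 1) for v in my_list
    ]
```

For an input such as list(range(10)), every index from old_bucket_indices is 0, so the final sorted call does all the work, while new_bucket_indices spreads the items across several buckets.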