Fix sorts/bucket_sort.py implementation #5786

Merged
merged 5 commits into from
Aug 18, 2023

drinkertea
Contributor

Describe your change:

Two issues:

  • The current implementation is equivalent to calling sorted, because all the items go into the same bucket.

  • The bucket count is equal to the integer difference between min and max; the problem is illustrated by a new test case. In general, we don't know the type of the data, so bucket_count should be set from outside, or it can be replaced by a heuristic such as len(my_list) * 2.

  • Add an algorithm?

  • Fix a bug or typo in an existing algorithm?

  • Documentation change?
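
To make the two issues in the description concrete, here is a minimal sketch of the old index computation. It assumes the pre-change code derived the bucket count as int(max_value - min_value) + 1, which is inferred from the description and not shown in this thread; the helper names old_bucket_count and old_bucket_indices are hypothetical.

```python
from __future__ import annotations


def old_bucket_count(my_list: list) -> int:
    """Bucket count derived from the value range (assumed old behaviour)."""
    # Assumption: the old code sized the bucket list from max - min.
    return int(max(my_list) - min(my_list)) + 1


def old_bucket_indices(my_list: list) -> list[int]:
    """Return the bucket index the old expression assigns to each item."""
    min_value = min(my_list)
    bucket_count = old_bucket_count(my_list)
    # Old index expression from the diff: floor-divide the integer offset
    # by the bucket count.
    return [int(item - min_value) // bucket_count for item in my_list]


print(old_bucket_indices([0.4, 1.2, 0.1, 0.2, -0.9]))  # [0, 0, 0, 0, 0]
print(old_bucket_count([0.1, 0.5, 0.9]))  # 1
print(old_bucket_count([0, 1_000_000]))  # 1000001
```

Under that assumption, int(item - min_value) never exceeds bucket_count - 1, so the floor division yields 0 for every item and the result comes entirely from the final sorted call on a single bucket. The count also depends only on the value range, not on the number of items: three floats in [0, 1) get one bucket, while two integers a million apart get a million and one.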

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms have a URL in its comments that points to Wikipedia or other similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

ghost added the "awaiting reviews" and "enhancement" labels Nov 6, 2021
@@ -30,7 +30,7 @@
from __future__ import annotations


-def bucket_sort(my_list: list) -> list:
+def bucket_sort(my_list: list, bucket_count: int = 10) -> list:
Member

Why is the bucket count set to 10?

Contributor Author

It's just a default value. It could be 100 or anything else.
Or we can set it to None by default and calculate it with a heuristic if it's not set, e.g. len(my_list) // 2.
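
The None-default idea in this reply could look roughly like the sketch below. The helper name resolve_bucket_count is hypothetical, and the max(1, ...) floor and the ValueError are assumptions added so that very short lists and non-positive counts do not produce an empty bucket list; neither is discussed in the thread.

```python
from __future__ import annotations


def resolve_bucket_count(my_list: list, bucket_count: int | None = None) -> int:
    """Pick a bucket count: the caller's value, or a size-based heuristic."""
    if bucket_count is None:
        # Heuristic from the reply above; max(1, ...) is an added guard so a
        # one-element list still gets a bucket.
        bucket_count = max(1, len(my_list) // 2)
    if bucket_count <= 0:
        # Added assumption: reject counts that cannot hold any item.
        raise ValueError("bucket_count must be positive")
    return bucket_count


print(resolve_bucket_count(list(range(10))))  # 5
print(resolve_bucket_count([1, 2, 3], 7))  # 7
```

Keeping the parameter while defaulting to None lets callers who know their data override the heuristic, which is the trade-off this reply points at.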

I suggest that rather than having bucket_count as a parameter, we make it part of the implementation logic and calculate the bucket count based on the type of machine: 32 for 32-bit, 64 for 64-bit. That way we eliminate user-driven errors.
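
One possible reading of this suggestion, sketched under the assumption that "type of machine" means the pointer width of the running interpreter. The function name machine_bucket_count is hypothetical; struct.calcsize("P") is the standard-library call for the size of a C pointer in bytes.

```python
import struct


def machine_bucket_count() -> int:
    """Return the pointer width in bits, used here as the bucket count."""
    # struct.calcsize("P") gives the size of a C `void *` in bytes for the
    # current interpreter build; multiply by 8 to get bits.
    return struct.calcsize("P") * 8


print(machine_bucket_count())
```

This removes the parameter entirely, so the count no longer depends on either the data or the caller.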

@sidoknowia sidoknowia left a comment

Great work! I left a few suggestions in the comments; please let me know if you have any questions.

     buckets: list[list] = [[] for _ in range(bucket_count)]

     for i in range(len(my_list)):
-        buckets[(int(my_list[i] - min_value) // bucket_count)].append(my_list[i])
+        index = min(int((my_list[i] - min_value) / bucket_size), bucket_count - 1)
+        buckets[index].append(my_list[i])

     return [v for bucket in buckets for v in sorted(bucket)]
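
For context, the changed lines above can be assembled into a complete function along the following lines. This is a sketch, not the merged file: the definition of bucket_size is not shown in this thread and is assumed here to be the value range divided by bucket_count, the function is renamed bucket_sort_sketch to mark it as illustrative, and the equal-values guard and the ValueError are additions.

```python
from __future__ import annotations


def bucket_sort_sketch(my_list: list, bucket_count: int = 10) -> list:
    """Sort numbers by scattering them into equal-width buckets."""
    if bucket_count <= 0:
        # Added assumption: a non-positive count is treated as a caller error.
        raise ValueError("bucket_count must be positive")
    if len(my_list) == 0:
        return []

    min_value, max_value = min(my_list), max(my_list)
    if min_value == max_value:
        # Added guard (not from the diff): a zero-width range would make
        # bucket_size zero and the division below fail.
        return list(my_list)

    # Assumption: bucket_size is the value range split evenly across buckets.
    bucket_size = (max_value - min_value) / bucket_count
    buckets: list[list] = [[] for _ in range(bucket_count)]

    for item in my_list:
        # Clamp so that max_value lands in the last bucket instead of one
        # past the end of the list.
        index = min(int((item - min_value) / bucket_size), bucket_count - 1)
        buckets[index].append(item)

    # Each bucket covers a disjoint, increasing value range, so sorting the
    # buckets individually and concatenating yields a sorted list.
    return [v for bucket in buckets for v in sorted(bucket)]


print(bucket_sort_sketch([0.4, 1.2, 0.1, 0.2, -0.9]))  # [-0.9, 0.1, 0.2, 0.4, 1.2]
```

Dividing by a bucket width derived from the range, instead of floor-dividing by the count, is what spreads items across buckets; the min(..., bucket_count - 1) clamp handles the one item equal to max_value.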

Can we call bucket sort recursively rather than relying on the sorted function? That way we eliminate the dependence on built-in Python functions.

''' pseudo-code
return [v for bucket in buckets for v in bucket_sort(bucket)]
'''
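
The one-line pseudo-code above needs base cases to terminate. A minimal sketch of a recursive variant follows, with several assumptions: the name recursive_bucket_sort is hypothetical, at least two buckets are required so that every recursive call receives strictly fewer items, and the index is computed from the ratio (item - lo) / (hi - lo) rather than from a separate bucket_size variable.

```python
from __future__ import annotations


def recursive_bucket_sort(values: list, bucket_count: int = 10) -> list:
    """Bucket sort that sorts each bucket by recursing instead of sorted()."""
    if bucket_count < 2:
        # With a single bucket every item would land in it again and the
        # recursion would never shrink the problem.
        raise ValueError("bucket_count must be at least 2")
    if len(values) <= 1:
        return list(values)

    lo, hi = min(values), max(values)
    if lo == hi:
        # All items are equal, so the list is already sorted.
        return list(values)

    buckets: list[list] = [[] for _ in range(bucket_count)]
    for item in values:
        # Scale the item's position in [lo, hi] to a bucket index and clamp
        # so that hi falls in the last bucket.
        index = min(int((item - lo) / (hi - lo) * bucket_count), bucket_count - 1)
        buckets[index].append(item)

    # lo goes to the first bucket and hi to the last, so no bucket holds the
    # whole input and each recursive call gets a strictly smaller list.
    return [
        v for bucket in buckets for v in recursive_bucket_sort(bucket, bucket_count)
    ]


print(recursive_bucket_sort([3, 1, 3, 1, 2]))  # [1, 1, 2, 3, 3]
```

The two early returns are what the pseudo-code leaves out: without them, a bucket of identical values would divide by a zero range, and an empty bucket would call min on an empty list.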

@@ -30,7 +30,7 @@
from __future__ import annotations


-def bucket_sort(my_list: list) -> list:
+def bucket_sort(my_list: list, bucket_count: int = 10) -> list:

Very nit-picky / low priority:
Can we change the variable name to something more descriptive? Maybe list_to_sort instead of my_list?

@stale
stale bot commented Apr 19, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the "stale" label Apr 19, 2022
Contributor

@tianyizheng02 tianyizheng02 left a comment

I believe the first issue you mentioned has already been fixed in #6005, so there's now a merge conflict with this PR. However, I still think that making bucket_count a function parameter is a good change, so I'll try to fix the merge conflict and merge this PR.

stale bot removed the "stale" label Aug 18, 2023
algorithms-keeper bot added the "tests are failing" label Aug 18, 2023
algorithms-keeper bot removed the "tests are failing" label Aug 18, 2023
tianyizheng02 merged commit 72c7b05 into TheAlgorithms:master Aug 18, 2023
algorithms-keeper bot removed the "awaiting reviews" label Aug 18, 2023
sedatguzelsemme pushed a commit to sedatguzelsemme/Python that referenced this pull request Sep 15, 2024
* Fix sorts/bucket_sort.py

* updating DIRECTORY.md

* Remove unused var in bucket_sort.py

* Fix list index in bucket_sort.py

---------

Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>
Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>
@isidroas isidroas mentioned this pull request Jan 25, 2025
14 tasks
Labels
enhancement This PR modified some existing files
5 participants