refactor(strings): Improve and correct KMP implementation #13694
Describe your change:
This pull request refactors and corrects the existing Knuth-Morris-Pratt (KMP) algorithm implementation. The primary goals are to make the algorithm's behavior more robust and to align the code more closely with the project's contribution guidelines.
Key improvements include:
Corrected Functionality: The algorithm is updated to find all occurrences of the pattern and return a list[int], instead of only finding the first occurrence and returning an int. This fixes the core behavior to be more general-purpose and useful in a library context.
Standardized Testing: All assert statements from the if __name__ == "__main__" block have been converted into proper doctests. This ensures all test cases are automatically run by the CI pipeline as required by the contribution guidelines.
Improved Documentation: The docstrings for both the main search function and the LPS helper function have been enhanced to provide clearer explanations of the logic, parameters, and return values.
Enhanced Readability: The helper function get_failure_array has been renamed to the more standard and descriptive name _compute_lps_array to improve code clarity.
[ ] Add an algorithm?
[x] Fix a bug or typo in an existing algorithm?
[x] Add or change doctests?
[x] Documentation change?
Checklist:
[x] I have read CONTRIBUTING.md.
[x] This pull request is all my own work -- I have not plagiarized.
[x] I know that pull requests will not be merged if they fail the automated tests.
[x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
[x] All new Python files are placed inside an existing directory.
[x] All filenames are in all lowercase characters with no spaces or dashes.
[x] All functions and variable names follow Python naming conventions.
[x] All function parameters and return values are annotated with Python type hints.
[x] All functions have doctests that pass the automated testing.
[x] All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
[ ] If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".