-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
♻️ refactor encoding utilities #12
Conversation
WalkthroughThe changes update the encoding logic in the Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant EncodeUtils
Client->>EncodeUtils: escape(input_string)
EncodeUtils->>EncodeUtils: For each character in input_string
alt Character is safe
EncodeUtils->>EncodeUtils: Append character to output buffer
else Character is unsafe
EncodeUtils->>EncodeUtils: Lookup HEX_TABLE and append formatted hex code
end
EncodeUtils-->>Client: Return escaped string
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
⏰ Context from checks skipped due to timeout of 90000ms (1)
🔇 Additional comments (10)
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy stopped sending the deprecated coverage status on June 5th, 2024. Learn more |
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy stopped sending the deprecated coverage status on June 5th, 2024. Learn more |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #12 +/- ##
==========================================
+ Coverage 96.61% 96.71% +0.09%
==========================================
Files 16 16
Lines 739 760 +21
==========================================
+ Hits 714 735 +21
Misses 25 25 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/qs_codec/utils/encode_utils.py (1)
30-37
: Consider making the safe set a class constant.The safe set is currently recreated on each method call. Moving it to a class constant would improve performance by avoiding repeated set creation.
class EncodeUtils: HEX_TABLE: t.Tuple[str, ...] = tuple(f"%{i:02X}" for i in range(256)) + # Safe code points for URL encoding + SAFE_POINTS: t.Set[int] = ( + set(range(0x30, 0x3A)) # 0-9 + | set(range(0x41, 0x5B)) # A-Z + | set(range(0x61, 0x7B)) # a-z + | {0x40, 0x2A, 0x5F, 0x2D, 0x2B, 0x2E, 0x2F} # @, *, _, -, +, ., / + ) + # Additional safe points for RFC1738 + RFC1738_SAFE_POINTS: t.Set[int] = SAFE_POINTS | {0x28, 0x29} # ( ) @classmethod def escape( cls, string: str, format: t.Optional[Format] = Format.RFC3986, ) -> str: - # Build a set of "safe" code points. - safe: t.Set[int] = set(range(0x30, 0x3A)) | set(range(0x41, 0x5B)) | set(range(0x61, 0x7B)) - safe |= {0x40, 0x2A, 0x5F, 0x2D, 0x2B, 0x2E, 0x2F} # @, *, _, -, +, ., / - - # For RFC1738, add the ASCII codes for ( and ) - if format == Format.RFC1738: - safe |= {0x28, 0x29} + safe = cls.RFC1738_SAFE_POINTS if format == Format.RFC1738 else cls.SAFE_POINTS
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/qs_codec/utils/encode_utils.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (3)
src/qs_codec/utils/encode_utils.py (3)
17-17
: Great optimization of HEX_TABLE initialization!The new implementation using f-string formatting is more concise and efficient while maintaining the same functionality.
30-37
: Well-structured safe character definition!The use of set operations and explicit ranges makes the code more readable and maintainable.
40-51
: Clean and efficient character processing!The changes improve both readability and performance:
- Direct character iteration is more Pythonic
- Set membership testing is efficient
- Clear separation of encoding paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/qs_codec/utils/encode_utils.py
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (3)
src/qs_codec/utils/encode_utils.py (3)
17-17
: LGTM! Clean and efficient HEX_TABLE initialization.The simplified initialization using f-strings improves readability while maintaining functionality.
40-60
: LGTM! Well-structured escape method with efficient set-based lookups.The refactored code is more maintainable and performs better by:
- Using precomputed sets for efficient character validation
- Employing consistent string formatting
- Clearly separating the logic for different character types
149-163
: LGTM! Consistent variable naming.The changes maintain consistency with other methods by using 'buffer' instead of 'result'.
This pull request includes significant changes to the
encode_utils.py
file to improve the encoding utilities and refactor the code for better readability and maintainability. Additionally, it updates the unit tests to reflect these changes.Improvements to encoding utilities:
src/qs_codec/utils/encode_utils.py
: Refactored theescape
method to use predefined sets of safe characters, improving readability and maintainability.src/qs_codec/utils/encode_utils.py
: Introduced new methods_convert_value_to_string
,_encode_string
,_is_safe_char
, and_encode_char
to modularize the encoding logic, making the code easier to understand and extend.src/qs_codec/utils/encode_utils.py
: Modified theHEX_TABLE
initialization for simplicity and added several sets of safe characters for different encoding formats.Refactoring and renaming:
src/qs_codec/utils/encode_utils.py
: Renamedto_surrogates
to_to_surrogates
and updated its usage throughout the file to reflect this change. [1] [2]Updates to unit tests:
tests/unit/utils_test.py
: Updated tests to use type annotations and reflect changes in method names and signatures. Added new tests for the_is_safe_char
method to ensure correctness. [1] [2] [3]Summary by CodeRabbit
Refactor
Tests