dev: add definitions.json generation script #772
Walkthrough
The changes include the addition of a new script,
📜 Recent review details
Configuration used: .coderabbit.yaml
📒 Files selected for processing (1)
🔇 Additional comments (4)
tools/generate_tx_models.py (4)
The source file for SFields has been updated to
The inclusion of
The regex pattern for extracting TxFormats has been updated to match the new structure in Run the following script to test the regex pattern against
The regex pattern for extracting SFields has been modified to accommodate the new format in Run the following script to test the regex pattern against
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (5)
tools/generate_definitions.py (4)
6-15: Document the CAPITALIZATION_EXCEPTIONS dictionary
The dictionary's purpose, and when and how its exceptions are applied, should be documented for maintainability.
Add a comment above the dictionary explaining its purpose (a string literal placed inside the braces would be concatenated with the first key rather than act as a docstring):
+# Mapping of special-case strings to their proper capitalization in the XRPL.
+#
+# These exceptions override the default word capitalization rules when processing
+# field names and types from the rippled source code.
 CAPITALIZATION_EXCEPTIONS = {
     "NFTOKEN": "NFToken",
17-19: Enhance error handling with descriptive message
The error message could be more informative about what the rippled path should contain.
 if len(sys.argv) != 2:
-    print("Usage: python " + sys.argv[0] + " path/to/rippled")
+    print(f"Usage: {sys.argv[0]} PATH_TO_RIPPLED\n"
+          f"PATH_TO_RIPPLED should point to the root of the rippled source code containing the 'include' directory")
     sys.exit(1)
87-93: Document and simplify complex regex patterns
The regex patterns are complex and would benefit from documentation and named groups.
+# Pattern to match STYPE definitions in two possible formats:
+# 1. STYPE(STI_NAME, NUMBER)
+# 2. STI_NAME = NUMBER
+TYPE_PATTERN = r"""
+    ^[ ]*                  # Start of line with optional spaces
+    (?:STYPE\(STI_         # First format: STYPE(STI_
+    (?P<name1>[^ ]*?)      # Capture name
+    [ ]*,[ ]*              # Comma separator
+    (?P<num1>[0-9-]+)      # Capture number
+    [ ]*\)                 # Closing parenthesis
+    |                      # OR
+    STI_                   # Second format: STI_
+    (?P<name2>[^ ]*?)      # Capture name
+    [ ]*=[ ]*              # Equals sign
+    (?P<num2>[0-9-]+)      # Capture number
+    )
+    [ ]*,?[ ]*$            # Optional comma and end of line
+"""
+
 type_hits = re.findall(
-    r"^ *STYPE\(STI_([^ ]*?) *, *([0-9-]+) *\) *\\?$", sfield_h, re.MULTILINE
+    TYPE_PATTERN, sfield_h, re.MULTILINE | re.VERBOSE
 )
1-326: Add unit tests for the script
The script performs critical data processing but lacks tests to verify its correctness.
Would you like me to help create a test suite for this script? The tests would cover:
- File reading and error handling
- String translation logic
- Regex pattern matching
- Data validation
- JSON output formatting
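As one illustration of the "Regex pattern matching" item, here is a minimal pytest-style sketch. The regex is copied from the transaction-type section of the script; the helper name `parse_tx_types` and the sample macro text are assumptions made for the example, and the real transactions.macro content may differ.

```python
import re

# Regex copied from the transaction-type section of the script under review.
TX_PATTERN = r"^ *TRANSACTION\(tt[A-Z_]+ *,* ([0-9]+) *, *([A-Za-z]+).*$"

# Hypothetical sample input; the real transactions.macro may look different.
SAMPLE_MACRO = """\
TRANSACTION(ttPAYMENT, 0, Payment, ({
    {sfDestination, soeREQUIRED},
}))
TRANSACTION(ttOFFER_CREATE, 7, OfferCreate, ({
"""


def parse_tx_types(text: str) -> dict:
    """Return a {transaction name: numeric type} mapping (hypothetical helper)."""
    hits = re.findall(TX_PATTERN, text, re.MULTILINE)
    return {name: int(num) for num, name in hits}


def test_extracts_name_and_number():
    assert parse_tx_types(SAMPLE_MACRO) == {"Payment": 0, "OfferCreate": 7}


def test_ignores_lines_without_macro():
    assert parse_tx_types("    {sfDestination, soeREQUIRED},") == {}
```

Wrapping the regex in a small function like this is what makes it testable; the script as posted runs everything at module level, so some restructuring would be needed first.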
xrpl/core/binarycodec/definitions/definitions.json (1)
Line range hint 1-1068: Well-structured definitions file with clear organization
The file maintains a clean and consistent structure with:
- Clear separation of concerns between different type definitions
- Consistent formatting and indentation
- Valid JSON syntax
- Logical grouping of related entries
This organization makes the file easy to maintain and extend.
Consider adding a schema file to formally validate the structure and prevent accidental malformation during future updates.
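A lightweight version of this check can be sketched with the standard library alone. The section names and container types below are assumptions about the layout of definitions.json ("FIELDS" in particular does not appear in the excerpt above), and `check_definitions` is a hypothetical helper, not part of the PR.

```python
import json
from typing import List

# Assumed top-level sections and the JSON container each one should hold.
EXPECTED_SECTIONS = {
    "TYPES": dict,
    "LEDGER_ENTRY_TYPES": dict,
    "FIELDS": list,
    "TRANSACTION_RESULTS": dict,
    "TRANSACTION_TYPES": dict,
}


def check_definitions(text: str) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    data = json.loads(text)
    problems = []
    for key, kind in EXPECTED_SECTIONS.items():
        if key not in data:
            problems.append(f"missing section: {key}")
        elif not isinstance(data[key], kind):
            problems.append(f"{key} should be a {kind.__name__}")
    return problems
```

A full JSON Schema would also constrain the shape of individual entries; this sketch only guards against a missing or mistyped top-level section.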
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
tools/generate_definitions.py (1 hunks)
xrpl/core/binarycodec/definitions/definitions.json (1 hunks)
🔇 Additional comments (1)
xrpl/core/binarycodec/definitions/definitions.json (1)
33-57: LGTM: LEDGER_ENTRY_TYPES changes are well-structured
The new ledger entry types are properly organized with:
- Consistent numerical ordering of type IDs
- No duplicate IDs
- Clear semantic naming that reflects their purpose
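The "no duplicate IDs" property can be checked mechanically rather than by eye. The following is a small sketch; `find_duplicate_ids` is a hypothetical helper, and the entry names and numbers in the usage note are illustrative only, not taken from the file.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_ids(entries: Dict[str, int]) -> Dict[int, List[str]]:
    """Group entry names by numeric ID and keep only IDs used more than once."""
    names_by_id = defaultdict(list)
    for name, type_id in entries.items():
        names_by_id[type_id].append(name)
    return {i: names for i, names in names_by_id.items() if len(names) > 1}
```

For example, `find_duplicate_ids({"A": 1, "B": 2, "C": 1})` reports that ID 1 is shared by "A" and "C", while a mapping with unique values yields an empty dict.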
    return "false"


def _is_serialized(t: str) -> str:
    if t == "LEDGERENTRY" or t == "TRANSACTION" or t == "VALIDATION" or t == "METADATA":
        return "false"
    return "true"


def _is_signing_field(t: str, not_signing_field: str) -> str:
    if not_signing_field == "notSigning":
        return "false"
    if t == "LEDGERENTRY" or t == "TRANSACTION" or t == "VALIDATION" or t == "METADATA":
        return "false"
    return "true"

# Parse sfields.macro for all the SFields and their serialization info
sfield_hits = re.findall(
    r"^ *[A-Z]*TYPED_SFIELD *\( *sf([^,\n]*),[ \n]*([^, \n]+)[ \n]*,[ \n]*"
    r"([0-9]+)(,.*?(notSigning))?",
    sfield_macro_file,
    re.MULTILINE,
)
for x in range(len(sfield_hits)):
    print(" [")
    print(' "' + sfield_hits[x][0] + '",')
    print(" {")
    print(' "nth": ' + sfield_hits[x][2] + ",")
    print(' "isVLEncoded": ' + _is_vl_encoded(sfield_hits[x][1]) + ",")
    print(' "isSerialized": ' + _is_serialized(sfield_hits[x][1]) + ",")
    print(
        ' "isSigningField": '
        + _is_signing_field(sfield_hits[x][1], sfield_hits[x][4])
        + ","
    )
    print(' "type": "' + _translate(sfield_hits[x][1]) + '"')
    print(" }")
    print(" ]" + ("," if x < len(sfield_hits) - 1 else ""))

print(" ],")

########################################################################
# TER code processing
########################################################################
print(' "TRANSACTION_RESULTS": {')
ter_h = str(ter_h).replace("[[maybe_unused]]", "")

ter_code_hits = re.findall(
    r"^ *((tel|tem|tef|ter|tes|tec)[A-Z_]+)( *= *([0-9-]+))? *,? *(\/\/[^\n]*)?$",
    ter_h,
    re.MULTILINE,
)
upto = -1
last = ""

for x in range(len(ter_code_hits)):
    # An explicit "= N" resets the counter; otherwise continue from the previous code
    if ter_code_hits[x][3] != "":
        upto = int(ter_code_hits[x][3])

    # Print a blank line between groups with different prefixes (tel, tem, ...)
    current = ter_code_hits[x][1]
    if current != last and last != "":
        print("")
    last = current

    print(
        ' "'
        + ter_code_hits[x][0]
        + '": '
        + str(upto)
        + ("," if x < len(ter_code_hits) - 1 else "")
    )

    upto += 1

print(" },")

########################################################################
# Transaction type processing
########################################################################
print(' "TRANSACTION_TYPES": {')
print(' "Invalid": -1,')

tx_hits = re.findall(
    r"^ *TRANSACTION\(tt[A-Z_]+ *,* ([0-9]+) *, *([A-Za-z]+).*$",
    transactions_file,
    re.MULTILINE,
)
for x in range(len(tx_hits)):
    print(
        ' "'
        + tx_hits[x][1]
        + '": '
        + tx_hits[x][0]
        + ("," if x < len(tx_hits) - 1 else "")
    )

print(" }")
print("}")
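The TER section above relies on C++ enum numbering: a code with an explicit `= N` resets the counter, and each following code without a value takes the previous value plus one. The sketch below isolates that logic in a function and builds a dict instead of printing, so that `json.dumps` handles commas and quoting. The regex is copied from the script; the function name `number_ter_codes` and the sample header text are assumptions for illustration.

```python
import json
import re

# Regex copied from the TER section of the script under review.
TER_PATTERN = (
    r"^ *((tel|tem|tef|ter|tes|tec)[A-Z_]+)( *= *([0-9-]+))? *,? *(\/\/[^\n]*)?$"
)

# Hypothetical excerpt in the style of an enum body; not the real TER.h.
SAMPLE_HEADER = """\
    telLOCAL_ERROR = -399,
    telBAD_DOMAIN,
    telBAD_PATH_COUNT,  // trailing comment
    tesSUCCESS = 0,
    tecCLAIM = 100,
    tecPATH_PARTIAL = 101,
"""


def number_ter_codes(text: str) -> dict:
    """Assign each result code its numeric value, mimicking enum auto-increment."""
    results = {}
    upto = -1
    for name, _prefix, _assign, value, _comment in re.findall(
        TER_PATTERN, text, re.MULTILINE
    ):
        if value != "":
            upto = int(value)  # explicit value resets the counter
        results[name] = upto
        upto += 1  # next implicit code is previous + 1
    return results


codes = number_ter_codes(SAMPLE_HEADER)
print(json.dumps({"TRANSACTION_RESULTS": codes}, indent=2))
```

Building the whole structure in memory and serializing once would also remove the need for the manual trailing-comma bookkeeping seen in the print calls, at the cost of losing the blank lines between code groups.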
🛠️ Refactor suggestion
Add logging and validation throughout the script
The script would benefit from proper logging and validation of parsed data.
Add logging and validation:
 import re
 import sys
+import logging
+from typing import List, Tuple, Dict
+
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(levelname)s - %(message)s'
+)
+
+def validate_parsed_data(section_name: str, data: List[Tuple[str, ...]]) -> None:
+    """Validate parsed data for completeness and correctness.
+
+    Args:
+        section_name: Name of the section being validated
+        data: List of parsed tuples
+
+    Raises:
+        SystemExit: If validation fails
+    """
+    if not data:
+        logging.error(f"No data parsed for section: {section_name}")
+        sys.exit(1)
+    logging.info(f"Successfully parsed {len(data)} entries for {section_name}")
Then use throughout the script:
type_hits = re.findall(...)
+validate_parsed_data("TYPES", type_hits)
+logging.info("Processing type definitions")
Committable suggestion skipped: line range outside the PR's diff.
def _read_file(filename: str) -> str:
    with open(filename, "r") as f:
        return f.read()

Add error handling for file operations
The file reading function should handle potential IOErrors and provide meaningful error messages.
 def _read_file(filename: str) -> str:
+    """Read and return the contents of a file.
+
+    Args:
+        filename: Path to the file to read
+
+    Returns:
+        The contents of the file as a string
+
+    Raises:
+        SystemExit: If the file cannot be read
+    """
+    try:
-    with open(filename, "r") as f:
-        return f.read()
+        with open(filename, "r") as f:
+            return f.read()
+    except IOError as e:
+        print(f"Error reading {filename}: {e}", file=sys.stderr)
+        sys.exit(1)
Committable suggestion skipped: line range outside the PR's diff.
def _translate(inp: str) -> str:
    if re.match(r"^UINT", inp):
        if re.search(r"256|160|128", inp):
            return inp.replace("UINT", "Hash")
        else:
            return inp.replace("UINT", "UInt")
    if inp == "OBJECT" or inp == "ARRAY":
        return "ST" + inp[0:1].upper() + inp[1:].lower()
    if inp == "ACCOUNT":
        return "AccountID"
    if inp == "LEDGERENTRY":
        return "LedgerEntry"
    if inp == "NOTPRESENT":
        return "NotPresent"
    if inp == "PATHSET":
        return "PathSet"
    if inp == "VL":
        return "Blob"
    if inp == "DIR_NODE":
        return "DirectoryNode"
    if inp == "PAYCHAN":
        return "PayChannel"

    parts = inp.split("_")
    result = ""
    for part in parts:
        if part in CAPITALIZATION_EXCEPTIONS:
            result += CAPITALIZATION_EXCEPTIONS[part]
        else:
            result += part[0:1].upper() + part[1:].lower()
    return result

🛠️ Refactor suggestion
Improve _translate function with documentation and type hints
The translation function is complex and would benefit from better documentation and structure.
 def _translate(inp: str) -> str:
+    """Translate rippled type names to binary codec format.
+
+    Args:
+        inp: Input string in rippled format (e.g., 'UINT256', 'ACCOUNT')
+
+    Returns:
+        Translated string in binary codec format (e.g., 'Hash256', 'AccountID')
+    """
+    # Define translation mapping for direct conversions
+    DIRECT_TRANSLATIONS = {
+        "OBJECT": "STObject",
+        "ARRAY": "STArray",
+        "ACCOUNT": "AccountID",
+        "LEDGERENTRY": "LedgerEntry",
+        "NOTPRESENT": "NotPresent",
+        "PATHSET": "PathSet",
+        "VL": "Blob",
+        "DIR_NODE": "DirectoryNode",
+        "PAYCHAN": "PayChannel",
+    }
+
+    # Handle direct translations first
+    if inp in DIRECT_TRANSLATIONS:
+        return DIRECT_TRANSLATIONS[inp]
+
+    # Handle UINT types
     if re.match(r"^UINT", inp):
         if re.search(r"256|160|128", inp):
             return inp.replace("UINT", "Hash")
         else:
             return inp.replace("UINT", "UInt")
-    if inp == "OBJECT" or inp == "ARRAY":
-        return "ST" + inp[0:1].upper() + inp[1:].lower()
-    if inp == "ACCOUNT":
-        return "AccountID"
-    if inp == "LEDGERENTRY":
-        return "LedgerEntry"
-    # ... rest of the conditions
Committable suggestion skipped: line range outside the PR's diff.
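Because the suggestion elides the tail of the function, a complete table-driven version may be easier to evaluate. This is a sketch under stated assumptions: the function is named `translate` here to mark it as illustrative, and `CAPITALIZATION_EXCEPTIONS` contains only the one entry visible in this review, whereas the real dictionary has more.

```python
import re

# Only the entry visible in the review; the real dictionary is larger.
CAPITALIZATION_EXCEPTIONS = {"NFTOKEN": "NFToken"}

# One-to-one renames taken from the if-chain in the original function.
DIRECT_TRANSLATIONS = {
    "OBJECT": "STObject",
    "ARRAY": "STArray",
    "ACCOUNT": "AccountID",
    "LEDGERENTRY": "LedgerEntry",
    "NOTPRESENT": "NotPresent",
    "PATHSET": "PathSet",
    "VL": "Blob",
    "DIR_NODE": "DirectoryNode",
    "PAYCHAN": "PayChannel",
}


def translate(inp: str) -> str:
    """Translate a rippled type name into its binary codec name (sketch)."""
    if inp in DIRECT_TRANSLATIONS:
        return DIRECT_TRANSLATIONS[inp]
    if inp.startswith("UINT"):
        # 128-, 160- and 256-bit integers are treated as hashes.
        if re.search(r"256|160|128", inp):
            return inp.replace("UINT", "Hash")
        return inp.replace("UINT", "UInt")
    # Fallback: capitalize each underscore-separated part, honoring exceptions.
    return "".join(
        CAPITALIZATION_EXCEPTIONS.get(part, part[0:1].upper() + part[1:].lower())
        for part in inp.split("_")
    )
```

Checking the table before the UINT branch does not change behavior, since none of the table keys start with "UINT".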
sfield_h_fn = sys.argv[1] + "/include/xrpl/protocol/SField.h"
sfield_macro_fn = sys.argv[1] + "/include/xrpl/protocol/detail/sfields.macro"
ledger_entries_macro_fn = (
    sys.argv[1] + "/include/xrpl/protocol/detail/ledger_entries.macro"
)
ter_h_fn = sys.argv[1] + "/include/xrpl/protocol/TER.h"
transactions_macro_fn = sys.argv[1] + "/include/xrpl/protocol/detail/transactions.macro"

🛠️ Refactor suggestion
Use pathlib for robust path handling
Using string concatenation for paths is error-prone. Consider using pathlib for more robust path handling.
+from pathlib import Path
+
+rippled_path = Path(sys.argv[1])
+include_path = rippled_path / "include" / "xrpl" / "protocol"
+
-sfield_h_fn = sys.argv[1] + "/include/xrpl/protocol/SField.h"
-sfield_macro_fn = sys.argv[1] + "/include/xrpl/protocol/detail/sfields.macro"
+sfield_h_fn = include_path / "SField.h"
+sfield_macro_fn = include_path / "detail" / "sfields.macro"
Committable suggestion skipped: line range outside the PR's diff.
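The suggestion covers only two of the five paths. A fuller sketch, covering all of the files the script reads, could look like the following. The relative paths come from the script; the helper names `source_paths` and `missing_files` and the dictionary keys are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def source_paths(rippled_root: str) -> Dict[str, Path]:
    """Build the paths of the rippled source files the script reads."""
    protocol = Path(rippled_root) / "include" / "xrpl" / "protocol"
    detail = protocol / "detail"
    return {
        "sfield_h": protocol / "SField.h",
        "sfield_macro": detail / "sfields.macro",
        "ledger_entries_macro": detail / "ledger_entries.macro",
        "ter_h": protocol / "TER.h",
        "transactions_macro": detail / "transactions.macro",
    }


def missing_files(paths: Dict[str, Path]) -> List[str]:
    """Return the paths that do not exist, so the script can fail early."""
    return [str(p) for p in paths.values() if not p.is_file()]
```

Checking `missing_files(source_paths(sys.argv[1]))` before parsing would let the script report every missing input at once instead of stopping at the first failed read.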
High Level Overview of Change
This PR adds a script to generate the definitions.json file from rippled source code.
Context of Change
Copied (and modified) from https://github.com/RichardAH/xrpl-codec-gen. It makes more sense to store this script in the library repo now.
Type of Change
Did you update HISTORY.md?
Test Plan
Works locally.