-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse enum descriptions and value descriptions #2208
Conversation
Currently introduces a syntax error in outputs with an unescaped backslash. |
Now also properly escapes all strings. I hope you don't mind the |
@eutro Hey! That's a great improvement! It was on my TODO list! |
parser/raylib_parser.c
Outdated
@@ -804,6 +826,44 @@ static char *TextReplace(char *text, const char *replace, const char *by) | |||
} | |||
*/ | |||
|
|||
// Like fprintf, but supports only %i and %S. %S prints an escaped JSON-like string. | |||
static void FPrintfEscapes(FILE *file, const char *format, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to avoid this function if possible and just escape them directly on fprintf(). What are the benefits of creating an additional function to do that? It also add an additional library dependency that I'd prefer to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Escaping the string requires the insertion of the extra backslashes (or possibly other characters for XML escapes), so it can't be done in-place without awkward shifting of bytes that may overflow buffers. Since they're being output directly, that's unnecessary and we can just insert the extra characters as we output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand the problem, using backslashes is the expected way to print those characters. About the buffers size, it's a specific parser for a specific file (raylib.h
), buffers could be sized for that with a secure margin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To write the raw string of value:
foo \ bar \ baz
in JSON,
"foo \\ bar \\ baz"
must be written.
These characters must be inserted into the string at specific indeces, which are the extra backslashes.
Naively, this means shifting the rest of the buffer to the right by a byte every time. Alternatively, counting backslashes ahead of time means shifting each byte at most once. Both of these are slightly more complicated than just outputting the extra backslashes with this function, but not the end of the world.
Additionally, inserting the extra backslashes into the string is destructive, meaning the original strings would be modified. This would be undesirable if we want to output the unescaped strings again. Say, if the program is modified to be able to output JSON and XML without a restart, outputting the JSON would modify the strings before the XML gets output. Again, not the end of the world, but it may be an unintuitive bug.
An alternative function would be one that just outputs an escaped string, then we can split the printf
calls on those.
Which would be preferrable?
- Using
FPrintfEscapes
(current) - A
PrintEscaped
function that prints a single escaped string - An
EscapeString
function that escapes a string in-place and destructively - Something else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry again but I can't see the issue. As per my understanding foo
, bar
and baz
are independently provided so they can just be properly concatenated with required backslashes when writing the required output. Also XML, JSON and others are generated in an independent way, each one with its formatting requirements, directly generated from tokenized input data.
Please, could you provide a specific examples from raylib.h
parsing where that problem happens? Sorry again for not seeing the issue, english is not my first language and I may be missing the point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I want to output the description from this comment here
Line 567 in 6342cf1
KEY_BACKSLASH = 92, // Key: '\' |
then the string has the raw value of
Key: '\'
. In JSON, I should output this as "Key: '\\'"
, though, which is a character longer and requires the shifting of the \'
to the right by a character.
This is how this would be done:
Unescaped string from description:
K e y : ' \ '
If we want to modify the string in-place to have the escapes we must shift everything after the backslash to the right:
K e y : ' _ \ '
and then we can insert the extra backslash:
K e y : ' \ \ '
That's all, that's the issue I'm talking about.
What is currently being done (in this PR) is that we iterate over the string once, but output an extra backslash before anything that needs to be escaped:
K e y : ' \ '
^ output K
K e y : ' \ '
^ output e
...
K e y : ' \ '
^ output '
K e y : ' \ '
^ output two \s
K e y : ' \ '
^ output '
So, which should the approach be? Should we escape the string in-place (shifting the string and inserting backslashes, as in the first description) or should we just output an extra backslash as we output the string manually (like the second description)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now I understand the issue. Checking raylib.h
there are only 4 cases where that's a problem (always descriptions):
Line 567: KEY_BACKSLASH = 92, // Key: '\'
Line 1037: RLAPI char *LoadFileText(const char *fileName); // Load text data from file (read), returns a '\0' terminated string
Line 1039: RLAPI bool SaveFileText(const char *fileName, char *text); // Save text data to file (write), string must be '\0' terminated, returns true on success
Line 1355: RLAPI unsigned int TextLength(const char *text); // Get text length, checks for '\0' ending
Current solution just replaces it by an empty space:
CharReplace(funcs[i].desc, '\\', ' ')
That function could be improved to replace the char by another string, just using an internal static buffer.
By the way, thank you very much for the detailed explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced it with an EscapeBackslashes
function now. A general TextReplace
function would be overkill, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it looks good to me.
@eutro Thanks for the improvement! I'm merging and reviewing this PR. |
This PR updates the API parser to parse the descriptions of enums and enum values. These were marked with TODOs before. Enum value descriptions especially have quite a lot of content that was missing from the parses.