Add String.strip_bbcode() and String.bbcode_escape() BBCode methods #78310
base: master
@@ -430,6 +431,7 @@ class String {
String c_escape_multiline() const;
String c_unescape() const;
String json_escape() const;
String bbcode_escape() const;
Suggested change:
- String bbcode_escape() const;
+ String escape_bbcode() const;
Out of curiosity, why rename this method? We have json_escape() above already.
You're right, I didn't notice that. Just verb + noun looks more standard. Given c_escape() and json_escape(), it is logical to name this method bbcode_escape(). However, should the second one then be named bbcode_strip() to be consistent with the first, or strip_bbcode() to be consistent with strip_edges() and strip_escapes()?
I'd prefer keeping strip_bbcode() as well to be consistent with strip_edges() and strip_escapes().
@Calinou Just tested and using the same format as your example it seems like
@markdibarry See godotengine/godot-proposals#5056 (comment):
It removes all tags, whether they exist or not. It's a
@dalexeev Ah, I see now. I think it would be misleading to call it
I think it is fine as is.
I disagree, but you're the decision makers. Just pointing out that this method conveys that it removes BBCode, but instead, it removes BBCode, BBCode-similar tags, and any content that happens to be within brackets. I, and two other devs I know who have established dialog languages for Godot, use tags for our language and also support BBCode. The problem, as we've discussed, is that in order to assign text, the current workflow is:
This interested me (and I'm sure others), because I assumed it actually stripped BBCode, preventing us from having to assign the text field twice. It doesn't, so it's not relevant to our cases, but since it also removes all tags, even if they're not in BBCode syntax format, I don't see how it would be reliable to use in any scenario.
I would heavily appreciate these methods right now, as they would prevent users from writing BBCode in a chat system I'm currently experimenting with. Fortunately, writing a workaround isn't exactly difficult, but it would be nice.
Force-pushed from aa9f05c to b216ff1.
@Calinou Any chance you can name it
We already use It isn't technically 100% correct, but just because there's no true BBCode standard doesn't mean we can't call it BBCode. It's just like INI 🙂
@Calinou I know, I know. It sounds totally pedantic, I just knew I'd regret it if I didn't throw my reasoning out there before it's too late, so I want to be annoying and express my concerns while I still can:

You're absolutely right that there's no official standard for BBCode. Almost every program out there that uses the term BBCode uses their own implementation/flavor, but there is a current established implementation that we use in Godot, and this method doesn't support it. I'm not invested in the word "tag" at all, it just seems like the closest word to what it actually does without being misleading, and I couldn't think of something better. It makes more sense in my head to use a generic word for this method that works in a generic way, rather than use a word that means something else specific in Godot, since that can cause a name conflict in the near future.

More selfishly: My projects (and a few others I know) use dialog systems that get bogged down with heavy string manipulation required to strip

If Godot's implementation of BBCode and the BBCode that this method supports are incompatible, then that raises a lot of questions. In the case of a method to strip Godot's implementation of BBCode was eventually exposed, what would happen with these methods? I'd think it'd be confusing for many to have two methods that take strings that are named the same thing, but do something different. Would we rename the
If you want a specialized and reasonably performant workaround, then I'd say just use regex. As long as the documentation clarifies that it isn't selective, I don't see any reason to bog this down or rename it. Naming it anything else would be confusing IMO; the only other name I'd say is non-confusing would be something like
That's all a fair point to be made. Perhaps it is best to delegate this method to RichTextLabel itself to have more control over tags. However, there have also been occasions where the editor may need to strip generic BBCode from a text before displaying it, too, without creating a whole RichTextLabel for it. Perhaps there should be a static method for it, too...
@markdibarry Formally, If you are dealing with user input, then you should use If we want to achieve more accurate (but more verbose) naming, then I would suggest the
<method name="bbcode_escape" qualifiers="const">
<return type="String" />
<description>
Returns the string with BBCode tags escaped to [code][lb][/code] and [code][rb][/code], which makes them ineffective when used in [RichTextLabel] with [member RichTextLabel.bbcode_enabled] set to [code]true[/code]. This is useful for handling user input to prevent BBCode injection. See also [method strip_bbcode].
Suggested change:
- Returns the string with BBCode tags escaped to [code][lb][/code] and [code][rb][/code], which makes them ineffective when used in [RichTextLabel] with [member RichTextLabel.bbcode_enabled] set to [code]true[/code]. This is useful for handling user input to prevent BBCode injection. See also [method strip_bbcode].
+ Returns the string with BBCode tags escaped with [code][lb][/code], which makes them ineffective when used in [RichTextLabel] with [member RichTextLabel.bbcode_enabled] set to [code]true[/code]. This is useful for handling user input to prevent BBCode injection. See also [method strip_bbcode].
The same below.
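As a rough illustration of the suggested wording, here is a minimal standalone C++ sketch of escaping only the opening bracket. It is not the code from this PR: it uses `std::string` instead of Godot's `String`, and the function name `escape_open_brackets` is hypothetical. The assumption is that once every `[` becomes `[lb]`, no tag can start, so the closing brackets can stay as they are.

```cpp
#include <string>

// Hypothetical helper for illustration only (not Godot's implementation).
// Replaces every '[' with the "[lb]" escape so that no BBCode tag can begin.
// Closing brackets are left untouched, following the suggested doc wording.
std::string escape_open_brackets(const std::string &text) {
    std::string out;
    out.reserve(text.size());
    for (const char c : text) {
        if (c == '[') {
            out += "[lb]";
        } else {
            out += c;
        }
    }
    return out;
}
```

With this sketch, user input such as `[b]bold[/b]` would become `[lb]b]bold[lb]/b]`, which a RichTextLabel with BBCode enabled should show as literal text.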
I agree that there is good use for this method, and I don't want to hold it up too much, I just wanted to share my thoughts, since I can already see problems in the future, but only with name conflicts.

Some background: In In most of the dialog systems for Godot (including mine), the built in dialog code is between brackets ( The only current avenue to get the text with valid Godot BBCode tags removed, is to assign it to the

    # in RichTextLabel node
    text = fullText
    var parsedText = get_parsed_text()
    # for dialog parsing
    var dialogTags = get_dialog_tags(fullText)
    text = reconstruct_text(fullText, parsedText, dialogTags)

This proposed method would remove valid Godot BBCode, valid dialog code, and any intentional non-code bracket surrounded text. So, of course, it can't be used for anything that needs awareness of Godot's implementation of BBCode, since it works on a very general level of anything that is between brackets. However, it'd be very helpful to someday have a static method that parses Godot's implementation of BBCode from a string without needing to update the
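To make the "anything between brackets" behavior concrete, here is a minimal standalone C++ sketch of a generic stripper. It is not the implementation from this PR: it uses `std::string` rather than Godot's `String`, the name `strip_bracket_tags` is hypothetical, and keeping an unmatched `[` verbatim is an assumption made for the example.

```cpp
#include <string>

// Hypothetical helper for illustration only (not Godot's implementation).
// Drops every "[...]" span without checking whether it is a known BBCode tag.
std::string strip_bracket_tags(const std::string &text) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t open = text.find('[', pos);
        if (open == std::string::npos) {
            // No more brackets: keep the rest of the text.
            out.append(text, pos, std::string::npos);
            break;
        }
        // Keep the text before the opening bracket.
        out.append(text, pos, open - pos);
        const std::size_t close = text.find(']', open + 1);
        if (close == std::string::npos) {
            // Assumption: an unmatched '[' is kept as-is.
            out.append(text, open, std::string::npos);
            break;
        }
        // Skip the whole bracketed span, whatever it contains.
        pos = close + 1;
    }
    return out;
}
```

Under these assumptions, `[center]Text Here` becomes `Text Here`, but so does any non-BBCode bracketed text: `array[0] = 1` turns into `array = 1`, which is the lack of selectivity raised in this comment.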
Note that you are not required to use
As for
Thanks for your concerns about potential naming conflicts! I don't have a clear opinion on this issue other than the suggestion above. But I'm not sure if clarifying tags vs markup really helps. To be completely precise, the method should be named Also, see Finally, if we add the
Just to reiterate, I have no need for the word "tag" to be used, it's just the closest general word I could think up that doesn't have a double meaning in the code base. I'm more on the "not bbcode" team than the "yes tag" team. 😂
I doubt this would be a widely used method, but I also totally agree about keeping verbosity to a minimum.
I also like to think my code is self-documenting, but I don't think most users know that I think if it's really important that they're named the same, some detailed documentation on the method's limitations may help alleviate some potential confusion among the community. I think at least just a note that says "Does not consider any specific BBCode implementation when parsing." or something similar would help.
I can't speak for the other dialog system maintainers, but I personally think it's definitely a double-edged sword! You're right that it does make it tricky to parse, but the benefits make it worth it. Using a unified syntax can be very helpful to users looking to adopt, without having them needing to memorize a ton out of the gate. Especially when those who are most likely to use the dialog system aren't programmers. Plus, the obvious, it wouldn't be a great solution to completely rewrite the system and make all users upgrade and relearn a new syntax.
Okay, this is something new to me! You've got me very interested! How do you access the internal tag list
This is not relevant to this PR, but I made a gist (if you have any questions, please leave comments there, not in this PR). It is not possible to obtain the tag list via Godot API, see source code instead. However, you probably won't need to get the full list, you can only process custom pseudo-tags and individual tags, such as
Ah that's a shame. Your gist looks like what I came up with a few years ago, but this would require constant maintenance, so it wouldn't be of any use. Thanks anyway! I appreciate you trying! Seeing as though I'm the minority on the naming ambiguity, as long as there's a small note in the method description about its limitations or intended use, I won't push it any further. It's enough to point users to properly read the docs if they're confused. 😅
Hey, is this going to be added? Would be useful to have a neat built-in function.
Force-pushed from b216ff1 to 949949b.
4.3 is in feature freeze now, so this will probably not be merged in that version. That said, it should be straightforward to implement the functions from this PR in GDScript so you can use them in your project.
`String.strip_bbcode()` can be used to display text from a RichTextLabel in places that don't support rich formatting. `String.bbcode_escape()` can be used to prevent BBCode injection in RichTextLabels that display user input.
Force-pushed from 949949b to 4f0b32b.
See this comment from the original proposal if you want a work-around in the meantime.
With time thinking about it, I do think this method belongs inside RichTextLabel. Whether it should be static or directly tied to a RichTextLabel object (hence it'd be able to distinguish valid BBCode tags), I am not sure. My reasoning is also... Well, the entirety of String's method list has turned into a movie to scroll through. It's a relatively niche method, I'm not sure if it deserves to be there. To be honest, it's fine. I will still sleep at night if this PR is merged as is.
Here are code examples of both implementations.

RichTextLabel:

    # node reference
    @onready var label = $RichTextLabel
    var stripped_text = label.strip_bbcode("[center]Text Here")
    # or, class reference
    var stripped_text = RichTextLabel.strip_bbcode("[center]Text Here")

String:

    var stripped_text = "[center]Text Here".strip_bbcode()

Personally I prefer how the code reads when it is under the String class. It also functions like other immutable String methods, so the behavior is clearer from the outset. Implementing it under RichTextLabel would imply (to me at least) that the method would render or otherwise modify the text in some way, when all it does is return a modified String.
`String.strip_bbcode()` can be used to display text from a RichTextLabel in places that don't support rich formatting.
`String.bbcode_escape()` can be used to prevent BBCode injection in RichTextLabels that display user input.
`String.strip_bbcode()` implemented.
Thanks @dalexeev for providing the `String.strip_bbcode()` implementation 🙂
Preview
From top to bottom: unescaped, stripped, escaped