Add filters for htmlEscape and htmlUnescape #1061

ad8lmondy · 2022-12-06T04:16:04Z

This PR adds the filters htmlEncode and htmlDecode, inspired by the urlEncode and urlDecode filters.

This allows content like: "a > b && a < c" to be encoded to "a &gt; b &amp;&amp; a &lt; c" and back again.

I've added the filter to existing tests, and all are passing.

Regarding the info here: #1060 (comment) (thanks for that, btw!)

Smaller PRs: I've split the PR into 4 commits of relatively logical blocks, but I'm not entirely sure if it's appropriate to make each a PR. I'm not sure the tests would pass without some of the code, so a certain size is going to be required, afaik. Please let me know if you'd prefer something else, happy to do as recommended.
Adding a new crate: This PR adds a new crate, html-escape. I took inspiration from the existing code for the urlEncode filters, which uses the percent-encoding crate. I don't say this as justification for adding another crate, just that I'm not sure my skills in Rust are up to the task of implementing it myself 😅

Thanks!

jcamiel · 2022-12-06T07:05:10Z

Thanks for the PR @ad8lmondy

regarding using an external crate: fair enough, I didn't check closely what we did on urlEncode. I think we're still going to remove this particular crate (need to check with @fabricereix) because, if I'm not missing anything, that's not a particularly difficult task.
For instance, in the Tera code, the following function escapes HTML:

pub fn escape_html(input: &str) -> String {
    let mut output = String::with_capacity(input.len() * 2);
    for c in input.chars() {
        match c {
            '&' => output.push_str("&amp;"),
            '<' => output.push_str("&lt;"),
            '>' => output.push_str("&gt;"),
            '"' => output.push_str("&quot;"),
            '\'' => output.push_str("&#x27;"),
            '/' => output.push_str("&#x2F;"),
            _ => output.push(c),
        }
    }

    // Not using shrink_to_fit() on purpose
    output
}

It's small code and easy to understand. I much prefer having this code in our own source code than using an (even small) dependency. Unescaping is a "bit more" difficult but it is also not a "hard" task: see, for instance, HTML escape and unescape in the Python standard lib. That said, we can merge this PR and implement escaping / unescaping later.

regarding the naming: I think I prefer htmlEscape and htmlUnescape. I understand this is not symmetric with urlEncode / urlDecode but, for instance, in Python you have urlEncode and htmlEscape/Unescape. The Tera function is also named escape_html. What do you think? Did you name it that way because of the existing functions urlEncode and urlDecode, or for another reason?
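To make the "unescaping is a bit more difficult" point concrete, here is a minimal sketch of what an in-house unescape could look like. It is an illustration only, not the code merged in this PR (which relies on the html-escape crate) and not Tera's or Python's implementation. The names `unescape_html` and `decode_entity` are hypothetical. It assumes a single pass over the input, and it only handles the five named entities listed in the code plus decimal and hexadecimal numeric references; the full HTML named-entity table and the spec's special cases for invalid code points are deliberately left out.

```rust
/// Hypothetical helper: single-pass HTML unescape covering a small subset
/// of entities. Unrecognized sequences are copied through unchanged.
pub fn unescape_html(input: &str) -> String {
    let mut output = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(amp) = rest.find('&') {
        // Copy everything before the '&' unchanged.
        output.push_str(&rest[..amp]);
        rest = &rest[amp..];
        // Try to decode the text between '&' and the next ';'.
        let decoded = rest
            .find(';')
            .and_then(|semi| decode_entity(&rest[1..semi]).map(|c| (c, semi + 1)));
        match decoded {
            Some((c, consumed)) => {
                output.push(c);
                rest = &rest[consumed..];
            }
            None => {
                // Not a recognized entity: keep the '&' literally.
                output.push('&');
                rest = &rest[1..];
            }
        }
    }
    output.push_str(rest);
    output
}

/// Hypothetical helper: maps an entity body (without '&' and ';') to a char.
fn decode_entity(name: &str) -> Option<char> {
    match name {
        "amp" => Some('&'),
        "lt" => Some('<'),
        "gt" => Some('>'),
        "quot" => Some('"'),
        "apos" => Some('\''),
        _ => {
            // Numeric character reference: "#65" (decimal) or "#x41" (hex).
            let digits = name.strip_prefix('#')?;
            let (body, radix) = match digits.strip_prefix(|c: char| c == 'x' || c == 'X') {
                Some(hex) => (hex, 16),
                None => (digits, 10),
            };
            // Reject empty bodies and anything that is not a plain digit
            // in the chosen radix (this also rejects a leading '+').
            if body.is_empty() || !body.chars().all(|c| c.is_digit(radix)) {
                return None;
            }
            // Overflow, surrogates and out-of-range values yield None.
            char::from_u32(u32::from_str_radix(body, radix).ok()?)
        }
    }
}
```

With this sketch, `unescape_html("a &gt; b &amp;&amp; a &lt; c")` gives back `a > b && a < c`, and the `&#x27;` and `&#x2F;` sequences emitted by the Tera function above decode to `'` and `/`. The nuances it skips (the long list of named entities, entities without a trailing `;`, replacement of invalid code points) are where a dedicated crate or a port of the Python standard library logic would earn its keep.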

ad8lmondy · 2022-12-06T10:44:57Z

Yes, I suppose you're right about the difficulty - not too bad, particularly for the encoding side. That said, abstracting any potential nuances to a separate crate is appealing too 😄

In regard to the naming - I agree, it should be htmlEscape and htmlUnescape. I think I named it encode/decode because I was literally copying chunks of code. Only when I found the html-escape crate did I realise that escape/unescape was a better name. I meant to go back and rename it, but forgot.

I'll update the PR tomorrow with the new and better name.

jcamiel · 2022-12-06T14:33:06Z

Great, just don't forget to rebase your commits to appease our CI and everything should be ok!

ad8lmondy · 2022-12-06T21:12:25Z

Filter names updated, all rebased, and a quick test on my end looks to be working.

These filters use the html_escape crate (inspired by the use of the percent_encoding crate) to escape and unescape html.

ad8lmondy · 2022-12-06T23:39:01Z

Oh yes, do let me know if there are any docs I should be updating. I wasn't sure if the grammar section on the website was generated or not - if not, I can update it 👍

jcamiel · 2022-12-07T01:33:37Z

/accept

hurl-bot · 2022-12-07T01:33:57Z

🕗 /accept is running, please wait for completion.

hurl-bot · 2022-12-07T01:38:04Z

✅ Pull request accepted and closed by jcamiel with fast forward merge.

# List of commits merged from ad8lmondy/hurl/htmldecode branch into Orange-OpenSource/hurl/master branch:

43487b0 Add integration tests for htmlEscape and htmlUnescape filters
e6028f3 Add support for htmlEscape and htmlUnescape in hurlfmt
db486f9 Add htmlEscape and htmlUnescape filters
7289930 Add grammar to support htmlEscape and htmlUnescape

jcamiel · 2022-12-07T01:38:48Z

Everything is good, once again thanks for the PR!

ad8lmondy force-pushed the htmldecode branch from d5d9e76 to 743207b Compare December 6, 2022 21:11

ad8lmondy force-pushed the htmldecode branch from 743207b to 1aee343 Compare December 6, 2022 21:15

ad8lmondy added 4 commits December 7, 2022 08:17

Add grammar to support htmlEscape and htmlUnescape

7289930

Add htmlEscape and htmlUnescape filters

db486f9

These filters use the html_escape crate (inspired by the use of the percent_encoding crate) to escape and unescape html.

Add support for htmlEscape and htmlUnescape in hurlfmt

e6028f3

Add integration tests for htmlEscape and htmlUnescape filters

43487b0

ad8lmondy force-pushed the htmldecode branch from 1aee343 to 43487b0 Compare December 6, 2022 21:17

github-actions bot merged commit 43487b0 into Orange-OpenSource:master Dec 7, 2022

fabricereix linked an issue Dec 7, 2022 that may be closed by this pull request

String manipulation and/or HTML Character Entities encode/decode support? #1038

Closed

fabricereix mentioned this pull request Dec 7, 2022

String manipulation and/or HTML Character Entities encode/decode support? #1038

Closed

jcamiel changed the title ~~Add filters for htmlEncode and htmlDecode~~ Add filters for htmlEscape and htmlUnescape Dec 7, 2022
