XML.toString - Escape nightmare #285

Closed
eduveks opened this issue Sep 18, 2016 · 11 comments

Comments

@eduveks

eduveks commented Sep 18, 2016

XML.escape is very basic, and when working with Latin accented characters I cannot get a good result.

It does not escape accented characters, and if I escape them beforehand, the result is &amp;atilde; (madness).

I suggest using org.apache.commons.lang3.StringEscapeUtils.escapeHtml4.

It is the better escaping solution and, in my experience, works like a charm with all characters.

Or add these methods to bypass the default escaping:

  • XML.toString(Object object, String tagName, boolean escape)
  • XML.toString(Object object, boolean escape)

Thank you!

@stleary
Owner

stleary commented Sep 18, 2016

Can you provide a code and JSON text fragment that demonstrates the problem, and what you would prefer to see in the XML.toString() output?

There are no plans, now or in the future, to include 3rd party libs in the project. See https://github.com/stleary/JSON-java/wiki/FAQ

@eduveks
Author

eduveks commented Sep 18, 2016

OK! I understand your position on not using external libs. Then please provide a way to not escape the &.

A little sample:

try {
    JSONObject json = new JSONObject("{ \"amount\": \"10,00 €\", \"description\": \"Ação Válida\" }");
    System.out.println(XML.toString(json));
} catch (JSONException e) {
    e.printStackTrace();
}

Output:

<amount>10,00 €</amount><description>Ação Válida</description>

But when this XML output is sent to another service, it is converted to:

<amount>10,00 􀀁</amount><description>A􀀁􀀁o V􀀁lida</description>

Because the special characters need to be escaped!

There are many special characters in languages such as Portuguese, French, Spanish, Arabic, Chinese, and German, and accented and other special characters need to be escaped correctly so that they survive encoding changes.

If I pre-escape the JSON:

try {
    JSONObject json = new JSONObject("{ \"amount\": \"10,00 &euro;\", \"description\": \"A&ccedil;&atilde;o V&aacute;lida\" }");
    System.out.println(XML.toString(json));
} catch (JSONException e) {
    e.printStackTrace();
}

The output is:

<amount>10,00 &amp;euro;</amount><description>A&amp;ccedil;&amp;atilde;o V&amp;aacute;lida</description>

And the correct output should be:

<amount>10,00 &euro;</amount><description>A&ccedil;&atilde;o V&aacute;lida</description>

There is no workaround, because & is always converted to &amp; in XML.toString.
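
One possible stopgap, sketched here only as an assumption and not as anything the library provides, is to post-process the string returned by XML.toString and turn double-escaped entity references back into single ones. The class name EntityUnescaper and the method restoreEntities are hypothetical. Two caveats: text that really was meant to read "&euro;" literally will be changed too, and named entities such as &euro; are not predefined in XML, so the receiving service has to understand them.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of JSON-java: undoes the extra "&amp;"
// that escaping adds in front of an already-escaped entity reference.
public final class EntityUnescaper {
    // Matches "&amp;" followed by a named, decimal, or hex entity body and ";".
    private static final Pattern DOUBLE_ESCAPED =
            Pattern.compile("&amp;(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    private EntityUnescaper() {}

    /** Rewrites "&amp;name;" as "&name;" everywhere in the given XML text. */
    public static String restoreEntities(String xml) {
        return DOUBLE_ESCAPED.matcher(xml).replaceAll("&$1;");
    }

    public static void main(String[] args) {
        String escapedTwice = "<amount>10,00 &amp;euro;</amount>";
        System.out.println(restoreEntities(escapedTwice));
    }
}
```

A bare ampersand such as "a &amp; b" is left alone, because no entity body follows it.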

@johnjaylward
Contributor

I think XML and JSON are both supposed to support Unicode properly. Maybe
check that the encoding in your XML header is set to UTF-8 instead
of ASCII or some other encoding.
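
A minimal sketch of that check, using only the Java standard library: the XML declaration names UTF-8 and the bytes are produced with the same charset. The class name Utf8XmlDemo and the wrapping root element are assumptions made for illustration; the thread does not say how the XML is actually sent.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: shows that the declared encoding and the byte encoding
// must agree, and what happens when a reader decodes with a different charset.
public final class Utf8XmlDemo {
    static final String DECLARATION = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

    private Utf8XmlDemo() {}

    /** Wraps the fragment in a root element and encodes the document as UTF-8 bytes. */
    public static byte[] toUtf8Document(String xmlFragment) {
        String document = DECLARATION + "<root>" + xmlFragment + "</root>";
        return document.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u20AC is the euro sign, written as an escape so the source file stays ASCII.
        byte[] wire = toUtf8Document("<amount>10,00 \u20AC</amount>");

        String decodedAsUtf8 = new String(wire, StandardCharsets.UTF_8);
        String decodedAsLatin1 = new String(wire, StandardCharsets.ISO_8859_1);

        System.out.println(decodedAsUtf8.contains("\u20AC"));   // euro sign survives
        System.out.println(decodedAsLatin1.contains("\u20AC")); // euro sign is lost
    }
}
```

If any hop in the chain decodes the bytes with a charset other than the declared one, the accented characters are damaged no matter what the header says.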

@eduveks
Author

eduveks commented Sep 18, 2016

johnjaylward, that is not true! When you send your bytes through web services and reverse proxies, and the information passes through 3 or 4 web servers, each of which converts it to a string, many things can happen...

I just need to control the escaping, because the default escaping is very basic. That's all! A flag to disable escaping.

@johnjaylward
Contributor

I think the right solution would probably be to place the correct Unicode
in the JSONObject, but then have XML.toString escape Unicode chars as
hex escapes, similar to what happens in JSONObject.toString.
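
As a rough sketch of that idea, and not the library's actual implementation, an escaper could emit the five predefined XML entities and write every code point above 0x7E as a hexadecimal character reference. The class name NumericXmlEscaper is hypothetical, and control characters that XML 1.0 forbids are not handled here.

```java
// Hypothetical escaper, independent of JSON-java: keeps the output pure ASCII
// so it survives hops that mishandle the byte encoding.
public final class NumericXmlEscaper {
    private NumericXmlEscaper() {}

    /** Escapes XML metacharacters and rewrites code points above 0x7E as &#xHEX;. */
    public static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            // Work on code points so characters outside the BMP stay in one reference.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (cp > 0x7E) {
                        out.append("&#x")
                           .append(Integer.toHexString(cp).toUpperCase())
                           .append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("10,00 \u20AC"));               // 10,00 &#x20AC;
        System.out.println(escape("A\u00E7\u00E3o V\u00E1lida")); // A&#xE7;&#xE3;o V&#xE1;lida
    }
}
```

Numeric character references need no DTD, unlike named entities such as &euro;, so any conforming XML parser can read them back.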

@eduveks
Author

eduveks commented Sep 18, 2016

Like this?

try {
    JSONObject json = new JSONObject("{\"amount\": \"10,00 \\u20AC\", \"description\": \"A\\u00E7\\u00E3o V\\u00E1lida\"}");
    System.out.println(XML.toString(json));
} catch (JSONException e) {
    e.printStackTrace();
}

The final result must be:

<amount>10,00 &#x20AC;</amount><description>A&#xE7;&#xE3;o V&#xE1;lida</description>

Sounds perfect! But I think it will be hard to accomplish.

@johnjaylward
Contributor

Yes, that was what I was picturing.

@stleary stleary closed this as completed Sep 22, 2016
@eduveks
Author

eduveks commented Sep 22, 2016

What does closed mean? Will this not be done?

Just so I know whether to look for another solution...

@stleary
Owner

stleary commented Sep 22, 2016

No objection if you want to create a pull request that solves your problem without causing problems for existing applications.

@eduveks
Author

eduveks commented Sep 22, 2016

Ok! ;)

@johnjaylward
Contributor

I believe I have a PR that will fix this. Just writing up the documentation on it now.
