Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(4.0.x) Response encoding not stored in the ext context session (possible spec violation?) #5543

Closed
TomasHofman opened this issue Jan 16, 2025 · 9 comments
Milestone

Comments

@TomasHofman
Copy link
Contributor

TomasHofman commented Jan 16, 2025

Describe the bug

Since Mojarra 4.0.6 (probably after this PR #5385), Mojarra doesn't store the response encoding into the external context session. Originally, this was performed by this line: https://github.com/eclipse-ee4j/mojarra/pull/5385/files#diff-b3a9aacd3be9aefe5ab7fe6e92c87862f999bff1c8523313a2c76451f5136422L1010

I think storing the response encoding is mandated by the spec "2.5.2.2. Determining the Character Encoding" (https://jakarta.ee/specifications/faces/4.0/jakarta-faces-4.0#determining-the-character-encoding), which mentions:

At the end of the render-response phase, the ViewHandler must store the response character encoding used by the underlying response object (e.g., the Jakarta Servlet or Portlet response) in the session (if and only if a session already exists) under a well known, implementation-dependent key.

The result is that accented characters may not be correctly processed by Mojarra. It comes to a situation that in a scope of single session, even if previous responses were being generated and marked as UTF-8, subsequent request may still not be processed as UTF-8 and the request contents are mangled.

To Reproduce

Steps to reproduce the behavior:

  1. Checkout & build reproducer project https://github.com/jbaesner/mojarra-response-encoding-issue/tree/main.
  2. Download & unzip Wildfly 35 (https://github.com/wildfly/wildfly/releases/download/35.0.0.Final/wildfly-35.0.0.Final.zip).
  3. Deploy the reproducer war (copy to wildlfy/standalone/deployments/)
  4. Start Wildfly (./wildfly/bin/standalone.sh)
  5. Open http://localhost:8080/mojarra-response-encoding-issue, you will see a form with an accented character "ä" in an input field.
  6. Click the button to send the form.
  7. The character in the input field gets mangled.

Expected behavior

The accented character is displayed correctly after form is submitted.

Screenshots

Image

Image

Desktop (please complete the following information):

  • OS: Fedora 35
  • Browser: Firefox
  • Version: 134
@TomasHofman
Copy link
Contributor Author

I created a proposal PR here: #5542

@schluedo
Copy link

https://www.w3schools.com/tags/ref_urlencode.asp :
"Your browser will encode input, according to the character-set used in your page."
-> I would like to suggest a small but useful change:
Store the encoding in the ViewState, so the encoding is an "attribute" of the rendered Page/View. This fits the JSF-state-on-server-paradigm perfectly and also matches the JSF-spec "store it in the session" (but once per view).
This also avoids concurrency issues if you have multiple tabs, with different encodings.

having a look at:
<f:view locale="xy" encoding="UTF-8"> ...
the encoding already looks like an attribute of the view but it does not behave like it.

@TomasHofman
Copy link
Contributor Author

@schluedo thinking about this more, I'm wondering though what practical effect this suggestion would have.

As the traveling data are URL-encoded, the encoding used by the client becomes irrelevant. (URL-encode should be like "encode to UTF-8 and then escape reserved characters", so the result should be the same no matter what the original encoding was. That is why the "application/x-www-form-urlencoded" format doesn't have a "charset" parameter.)

Also I think that modern browsers don't choose the encoding arbitrarily or based on user locale. They default to UTF-8, unless the web page specifies something different. If you had different tabs using different encodings, the application would have to cause it itself.

Do you see such a problem in practice?

@schluedo
Copy link

@TomasHofman as far as I read it UTF-8 is a recommendation for URL-Encoding, but not a standard.
I did the test: if I change to:
<f:view encoding="ISO-8859-1">
the browser sends other value: "%E4" (ISO-8859-1) instead of "%C3%A4" (UTF-8)
-> So the browser uses the (initial: default to utf-8) encoding from the server -> the server must store the encoding used (ideally: per view)

I could image a scenario where applications provide different encodings in different views (e.g. different charsets for different user locales: /indexDE.xhtml and /indexCN.xhtml)

BalusC added a commit that referenced this issue Jan 18, 2025
@BalusC
Copy link
Contributor

BalusC commented Jan 18, 2025

Your reproducer is actually a no-repro in GlassFish 7 nor Tomcat 10. It appears this is because these servers by default use UTF-8 as request body encoding while WildFly by default uses ISO-8859-1.

But you're right, this slipped through during the #5385 work.

I've fixed it.

Thank you for reporting the issue and the PR attempt!

@schluedo
Copy link

@BalusC what do you think about the encoding-per-view approach?

@TomasHofman
Copy link
Contributor Author

@BalusC yes, I can confirm that setting the default encoding to UTF-8 is a working workaround in Wildfly. Thanks for fixing this the right way! :)

@TomasHofman
Copy link
Contributor Author

@BalusC I guess it should also be up-ported to the 4.1 and main branches..?

BalusC added a commit that referenced this issue Feb 1, 2025
Fix Response encoding not stored in the ext context session #5543
BalusC added a commit that referenced this issue Feb 1, 2025
@BalusC BalusC closed this as completed in a30bdf8 Feb 1, 2025
BalusC added a commit that referenced this issue Feb 1, 2025
BalusC added a commit to jakartaee/faces that referenced this issue Feb 1, 2025
@BalusC
Copy link
Contributor

BalusC commented Feb 1, 2025

@BalusC I guess it should also be up-ported to the 4.1 and main branches..?

Was just matter of upmerging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants