Add support for non-ASCII header values #26

mmanes · 2024-06-06T23:16:38Z

Add support for non-ASCII header values.
Update deps.
Switch from 'master' branch to 'main'

Fixes:

account for non ASCII values in the headers or preamble #25


robotdan · 2024-06-07T00:39:55Z

src/main/java/io/fusionauth/http/util/HTTPTools.java

  public static boolean isValueCharacter(byte ch) {
-    return isURICharacter(ch) || ch == ' ' || ch == '\t' || ch == '\n';
+    int intVal = ch & 0xFF;  // Convert the value into an integer without extending the sign bit.
+    return isURICharacter(ch) || intVal == ' ' || intVal == '\t' || intVal == '\n' || intVal >= 0x80;


In addition to header values, this method is used when parsing the response status message. Does the RFC allow for this type of character there as well? Or is there any risk to using this same logic for both of these use cases?

That part isn't covered by RFC 9110, so RFC 7230 still applies: the status reason can (unfortunately) still contain obs-text.

The reason-phrase element exists for the sole purpose of providing a
textual description associated with the numeric status code, mostly
out of deference to earlier Internet application protocols that were
more frequently used with interactive text clients. A client SHOULD
ignore the reason-phrase content.

reason-phrase = *( HTAB / SP / VCHAR / obs-text )
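To make the grammar above concrete, here is a minimal sketch of a per-byte check for `reason-phrase`, using the same `& 0xFF` masking as the diff. The class and method names (`ReasonPhraseSketch`, `isReasonPhraseByte`, `isReasonPhrase`) are hypothetical and not part of `HTTPTools`. The ranges for `VCHAR` (0x21-0x7E) and `obs-text` (0x80-0xFF) follow the RFC 7230 ABNF.

```java
// Hypothetical sketch; names are illustrative and not taken from HTTPTools.
final class ReasonPhraseSketch {
  // True if the byte matches HTAB / SP / VCHAR / obs-text.
  static boolean isReasonPhraseByte(byte ch) {
    // Java bytes are signed: (byte) 0xE9 widens to -23, so mask to get 0..255.
    int intVal = ch & 0xFF;
    return intVal == '\t'                      // HTAB
        || intVal == ' '                       // SP
        || (intVal >= 0x21 && intVal <= 0x7E)  // VCHAR
        || intVal >= 0x80;                     // obs-text (0x80-0xFF)
  }

  // True if every byte is allowed; an empty phrase also passes because of the `*` repetition.
  static boolean isReasonPhrase(byte[] bytes) {
    for (byte b : bytes) {
      if (!isReasonPhraseByte(b)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    byte e = (byte) 0xE9; // 'é' in ISO-8859-1
    System.out.println(e >= 0x80);             // false: the byte is sign-extended to -23
    System.out.println(isReasonPhraseByte(e)); // true: masked value is 233
  }
}
```

Unlike `isValueCharacter` in the diff, this sketch rejects `'\n'`, since the quoted grammar does not include LF.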

robotdan · 2024-06-07T00:45:23Z

src/main/java/io/fusionauth/http/server/HTTPRequestProcessor.java

-  // TODO : Should this be sized with a configuration parameter?
-  private final StringBuilder builder = new StringBuilder();
+  // Allocate a 4k buffer for starters, it will grow as needed.
+  private final ByteArrayOutputStream byteBuffer = new ByteArrayOutputStream(4096);


I suppose this should be OK? We will allocate at least 4k on each request, quite a bit more than the previous StringBuilder did, though that likely had to re-allocate its underlying array right away.

Perhaps it is very common for the HTTP request preamble to exceed 4k? Any performance concern on this change?

I presume the previous approach did a lot of re-allocations and copying while parsing the preamble. 4k feels large enough to avoid re-allocation in most cases without going overboard.
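For context on why buffering bytes matters for non-ASCII values, here is a minimal sketch comparing two ways of building a string from incoming bytes. It assumes UTF-8 as the charset, which this thread does not state. The per-byte `StringBuilder` variant is a naive illustration, not a claim about what the previous code did. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the HTTPRequestProcessor implementation.
final class PreambleBufferSketch {
  // Collect raw bytes first, then decode the whole sequence once.
  static String decodeViaByteBuffer(byte[] input) {
    // Same initial capacity as in the diff; the buffer grows if needed.
    ByteArrayOutputStream byteBuffer = new ByteArrayOutputStream(4096);
    for (byte b : input) {
      byteBuffer.write(b);
    }
    // Assumption: the bytes are UTF-8.
    return new String(byteBuffer.toByteArray(), StandardCharsets.UTF_8);
  }

  // Naive variant: cast each byte to a char, which breaks multi-byte sequences.
  static String decodePerByte(byte[] input) {
    StringBuilder builder = new StringBuilder();
    for (byte b : input) {
      builder.append((char) b); // a negative byte becomes a char in the 0xFF80-0xFFFF range
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8); // 'é' encodes as two bytes
    System.out.println(decodeViaByteBuffer(bytes).equals("caf\u00e9")); // true
    System.out.println(decodePerByte(bytes).length());                  // 5, one char per byte
  }
}
```

Decoding once at the end keeps multi-byte sequences intact, at the cost of holding the raw bytes until the value is complete.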

mmanes added 7 commits June 5, 2024 10:38

Update deps and bump version

c0b92ab

Switch master branch to main

Switch to arm builder for test runs

17ed5a1

Fix action arch

a7a3a0d

Revert back to x86 runners

3a50dce

Modify HTTPRequestParser to use a byte buffer before converting to …

dfa0134

…a String Update value character validator Add test

Update test, format

4531f81

Update tests for https

fe15c58

mmanes marked this pull request as ready for review June 7, 2024 00:26

Update copyright

4ea1105

robotdan approved these changes Jun 7, 2024


mmanes merged commit 08f9b71 into main Jun 7, 2024

mmanes deleted the mmanes/header-fix branch June 7, 2024 17:25
