-
Notifications
You must be signed in to change notification settings - Fork 848
TS-4312: Adding config to parse urls according to RFC #541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I worked with @shenzhang920 on this and it looks good to me. 👍 from me. @zwoop , mind taking a look? |
|
This is actually how haproxy and nginx behave by default, this config option is disabled by default for backwards compatibility. |
proxy/hdrs/URL.cc
Outdated
| } | ||
|
|
||
| static char non_encoded_char[256]; | ||
| static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fixed initialization. Let's just hardcode it rather than generating it at startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoding would look like as below, it is error prone. Should I change it?
static const char non_encoded_char[256] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // ! # $ % & ' ( ) * + , - . /
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, // 0-9 : ; = ?
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // @ A-O
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, // P-Z [ ] _
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // a-o
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0 // p-z ~
// values of indices from 128 to 255 are all 0
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I see your point. I think that this properly belongs in ParseRules.cc, though all the parse bits are taken. Maybe we could reclaim something esoteric like is_wildmat_BIT?
|
@jpeach I looked into the idea of using ParseRules.cc and CompileParseRules with @shenzhang920 but it does't look like there are any free bits at the moment, should we just bump it up to 64 bit so we can add another bit for it? Thoughts? Regarding reclaiming an existing bit, I'm not sure I think we're better off just going to 64bits. |
|
I definitely think that |
|
👍 This looks good to me, one concern would be the bit name is_uri, can we find a more descriptive name for this bit? Also, @shenzhang920 you also confirmed there is no longer any usage of the is_wildmat_BIT? |
|
Thanks @shenzhang920 |
proxy/hdrs/URL.cc
Outdated
| {"}", false}, | ||
| {"é", false}}; | ||
|
|
||
| REGRESSION_TEST(STRICT_URI_PARSING)(RegressionTest *t, int /* level ATS_UNUSED */, int *pstatus) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use all caps STRICT_URI_PARSING. How about ParseRules_strict_URI or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap the test and it's data in #ifdef TS_HAS_TESTS.
|
I'm ok with Please take a look at this article about git commit messages. I think we should squash this down into a single commit with a good summary. |
|
Hey @shenzhang920 why did you close the pull request? If it's to squash I can help you squash it down and still retain this pull request. Let's sync offline. |
* Send EOS bit with DATA frame * Another fix identified during ASF PR review
* Improve performance of finding SNI Actions * Store and find servername in lower case * Make emplace result more readable Co-authored-by: Masaori Koshiba <masaori@apache.org>
* Revert "Gauge to Counter metrics for CDN Harmony (apache#544)" This reverts commit 1a83589. * Revert "[9.2.x] Improve performance of finding SNI Actions (apache#541)" This reverts commit 21cbfc0. * Revert "TSan: Make Thread::cur_time thread local (apache#9184) (apache#537)" This reverts commit d93ac0e. * Revert "Fix a crash in FetchSM (apache#546)" This reverts commit e03b829. * Revert "Remove dependency on OpenSSL's OCSP API (apache#538)" This reverts commit 00ad803. * Revert "Fix crashes on OCSP requests (apache#543)" This reverts commit bbb41b9. * Revert "Add OTHER to cipher TLS cipher metrics (apache#542)" This reverts commit 74dbd97. * Revert "rio: use bazinga-boringssl 19.x for boringssl build (apache#540)" This reverts commit 9c57330. * Revert "Use FetchSM for OCSP HTTP requests (apache#535)" This reverts commit dbd0ec7. * Do not schedule reconfigure event on schedule_imm_local Co-authored-by: Mo Chen <mochen@apache.org> Co-authored-by: Serris Lew <lserris@apple.com>
Adding a config option "proxy.config.http.strict_uri_parsing" to sends http status code 400 back to client if the URL includes non-RFC 3986 compliant character