Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: Use GURL as HTTP URL parser utility #10599

Closed
wants to merge 25 commits into from
Closed

http: Use GURL as HTTP URL parser utility #10599

wants to merge 25 commits into from

Conversation

dio
Copy link
Member

@dio dio commented Apr 1, 2020

This replaces http_parser's URL parser with GURL.

Risk Level: Medium
Testing: Unit
Docs Changes: N/A
Release Notes: N/A

dio added 2 commits April 1, 2020 02:01
Use GURL as HTTP URL parser utility.

Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio dio changed the title http: Use GURL as HTTP URL parser utility. http: Use GURL as HTTP URL parser utility Apr 1, 2020
dio added 3 commits April 1, 2020 10:32
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio
Copy link
Member Author

dio commented Apr 1, 2020

gurl on On windows:

Only clang-cl is supported on Windows, see https://crbug.com/988071

external/com_googlesource_googleurl\base/compiler_specific.h(11): fatal error C1189: #error:  "Only clang-cl is supported on Windows, see https://crbug.com/988071"

Will do #ifdef-ing.

dio added 4 commits April 2, 2020 07:10
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio dio marked this pull request as ready for review April 2, 2020 22:12
urls = ["https://storage.googleapis.com/quiche-envoy-integration/googleurl_dbf5ad147f60afc125e99db7549402af49a5eae8.tar.gz"],
sha256 = "5bdbecce22d522f03ac86c6a694d5ff76780a16d6d682910e87ac156b0a25b21",
strip_prefix = "googleurl-dbf5ad1",
urls = ["https://github.com/dio/googleurl/archive/dbf5ad1.tar.gz"],
Copy link
Member Author

@dio dio Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A temporary fix, since it has a top-level BUILD file and a build directory (which is problematic when extracting on mac and windows). I renamed the BUILD file to BUILD.bazel, I think removing it is OK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume can just patch?

Copy link
Member Author

@dio dio Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we can't, since it is failing when http_archive extracting the tarball, it is stopped when trying to extract BUILD when it saw build dir is already there. I'm not sure how we can have a patch on the downloaded tarball before extracting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably @danzh2010 could shed some light here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A temporary fix, since it has a top-level BUILD file and a build directory (which is problematic when extracting on mac and windows). I renamed the BUILD file to BUILD.bazel, I think removing it is OK.

The top level BUILD file is a dummy but needed for the build_config.bzl to setup default copts. But they probably can be moved to a subdirectory instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to build directory, it is from chromium, so nothing I can do there. What's the exact issue of having it under root directory?

Copy link
Member Author

@dio dio Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danzh2010 with the original archive, http_archive tries to extract BUILD file while the build directory is already extracted (the build directory is extracted before the BUILD file from the tarball) and http_archive throws error. It happens only for systems with case-insensitive filesystem like Windows and macOS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I move the top level BUILD file and build_config.bzl to some sub-directory, would it help your case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks @danzh2010!

Copy link
Member Author

@dio dio Jun 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danzh2010 sorry. Can we still pick this up? (moving the top-level BUILD file sub-directory)

@htuch htuch self-assigned this Apr 3, 2020
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally rad. This gets us most of the way to resolving #6588 (once we fold the path normalization into the Url class).

"url_util.h",
],
copts = build_config.default_copts,
- linkopts = ["-licuuc"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this if we now have icuuc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we depend on "@org_unicode_icuuc//:icuuc" directly.

constexpr char HTTPS[] = "https";
#endif

bool Url::initialize(absl::string_view absolute_url) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point, can we get rid of https://github.com/envoyproxy/envoy/tree/master/source/common/chromium_url by doing the normalization in this class via GURL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so yeah.

tags = ["requires-rtti"],
visibility = ["//visibility:private"],
deps = [":headers"],
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an autoconf dependency, can we just use foreign_cc to avoid having to hand-maintain a bespoke BUILD file for the project?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing it, but it seems the build time is significantly increasing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I wonder if it's doing something like building tests or additional libraries. Might be possible to control with some autoconf flags...

Copy link
Member Author

@dio dio Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I have turned off everything (that makes sense, like examples, tests, even tools). And it is tricky to force using msvc (when using configure it tends to use mingw and it is failed at configure stage, and I don't think we want to mix mingw and msvc).

urls = ["https://storage.googleapis.com/quiche-envoy-integration/googleurl_dbf5ad147f60afc125e99db7549402af49a5eae8.tar.gz"],
sha256 = "5bdbecce22d522f03ac86c6a694d5ff76780a16d6d682910e87ac156b0a25b21",
strip_prefix = "googleurl-dbf5ad1",
urls = ["https://github.com/dio/googleurl/archive/dbf5ad1.tar.gz"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume can just patch?

source/common/http/url_utility.cc Outdated Show resolved Hide resolved
dio added 5 commits April 6, 2020 04:47
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio
Copy link
Member Author

dio commented Apr 6, 2020

/azp run envoy-presubmit

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

dio added 6 commits April 6, 2020 21:52
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@htuch htuch mentioned this pull request Apr 6, 2020
"string_piece.h",
"string_piece_forward.h",
"string_util.h",
- "string_util_posix.h",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for this patch to grow larger?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for accommodating windows with msvc compiler that we have.

tags = ["requires-rtti"],
visibility = ["//visibility:private"],
deps = [":headers"],
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I wonder if it's doing something like building tests or additional libraries. Might be possible to control with some autoconf flags...

@dio
Copy link
Member Author

dio commented Apr 6, 2020

/azp run envoy-presubmit

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines +20 to +23
-#if defined(__clang__) && __has_attribute(uninitialized)
+#if defined(__clang__)
+#if defined(__has_attribute)
+#if __has_attribute(uninitialized)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we put /Za when compiling on Windows, MSVC's C4067 warning is treated as an error.

cc_library(
name = "strings",
srcs = [
- "string16.cc",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string16 doesn't make sense to use on 2-byte wchar_t systems (e.g. Windows).

Comment on lines -107 to -109
downstream_headers.setScheme(absolute_url.scheme());
downstream_headers.setHost(absolute_url.host_and_port());
downstream_headers.setPath(absolute_url.path_and_query_params());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caught by clang-tidy.

Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio
Copy link
Member Author

dio commented Apr 7, 2020

Hum, the coverage is changed:

Code coverage 96.52794344082757 is lower than limit of 97.0

Will check.

dio added 2 commits April 7, 2020 10:02
Signed-off-by: Dhi Aurrahman <dio@tetrate.io>
@dio
Copy link
Member Author

dio commented Apr 7, 2020

/azp run envoy-presubmit

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dio
Copy link
Member Author

dio commented Apr 8, 2020

😔 The macOS job was canceled.

@dio
Copy link
Member Author

dio commented Apr 16, 2020

@lizan this is me having a bad luck or it is a fact that the macOS build is too long and often cancelled or failed to call RBE?

e.g.

Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: Unable to resolve host remotebuildexecution.googleapis.com

@stale
Copy link

stale bot commented Apr 23, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Apr 23, 2020
@stale
Copy link

stale bot commented Apr 30, 2020

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Apr 30, 2020
@htuch
Copy link
Member

htuch commented Jun 19, 2020

@dio looks like this one went stale, do you recall what the major blockers here were? I'm interested in URI parsing for the URI encoding in https://docs.google.com/document/d/1zZav-IYxMO0mP2A7y5XHBa9v0eXyirw1CxrLLSUqWR8/edit.

@dio
Copy link
Member Author

dio commented Jun 21, 2020

@htuch I have moved this PR to #11670. I think the decision of doing bespoke "BUILD" (for building google URL and Unicode libs) is the main blocker. While I think, since I have accommodated this from tensorflow, I hope it is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants