Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration to disable Expect Header being set by default #303

Closed
tailrecur opened this issue Feb 18, 2021 · 9 comments · Fixed by #340
Closed

Configuration to disable Expect Header being set by default #303

tailrecur opened this issue Feb 18, 2021 · 9 comments · Fixed by #340
Labels
feature A new feature!
Milestone

Comments

@tailrecur
Copy link

Thanks a lot for this library!
I'm learning many Rust idioms from reading the code.

libcurl sets the Expect: 100-continue header by default for POST requests whose body exceeds a certain size. This leads to two round-trips to the server and so worse performance. This blog post describes the issue and a possible solution.

I see that curl-rust sets the CURLOPT_HTTPHEADER but I can't figure out how to invoke the http_headers method with parameters in the shape that libcurl expects ie: Expect:

I tried setting an empty Expect header .header("Expect", "") but this does not seem to deactivate the curl behaviour and leads to the server rejecting the request with a 417 Expectation Failed

I tried setting the header as .header("Expect:", "") but this leads to a http::Error(InvalidHeaderName)

Any suggestions on how to set this header to disable the libcurl behaviour?

Thanks in advance!

@tailrecur
Copy link
Author

On further digging, I see that the code has specific handling for this case.

However, shouldn't that line be pushing : character instead of ; into the header string?

If this is correct, then I can create a PR with a fix.

@tailrecur
Copy link
Author

I see that using ; was an intentional choice to fix #209

I guess then a different extension function to disable default curl behaviour would make sense.

@sagebind
Copy link
Owner

sagebind commented Feb 19, 2021

Thank you for filing an issue! This does sound like something we should let you configure. Since we try to abstract away some of curl's unusual APIs, we don't expose this directly and right now there is no way to say, "Don't send this header". Most users would expect a call like .header("Expect", "") to send a header named Expect with an empty string as the value, so that is exactly what we do.

Curl's CURLOPT_HTTPHEADER accepts some unusual syntax that we want to avoid exposing directly. The ; is a curl-ism that means to send a header with a blank value. From the libcurl manual:

To add a header with no content (nothing to the right side of the colon), use the form 'MyHeader;' (note the ending semicolon).

Unsetting a header is a weird operation that you probably don't normally need to do, since most automatically populated headers are usually done so for a good reason (of course, you can remove previously added user-defined headers via remove).

I guess then a different extension function to disable default curl behaviour would make sense.

This is the direction I would go: offer an explicit function to be able to control the behavior. That way we can avoid curl's "stringly-typed" way of doing it and prefer a more Rusty API. This is probably a good opportunity to allow you to control CURLOPT_EXPECT_100_TIMEOUT_MS as well. Here's a quick sketch of how I might design such an API:

pub trait Configurable {
    // ...

    fn expect_100<T>(self, expect: T) -> Self
    where
        T: Into<Expect100>;
}

pub struct Expect100 {
    timeout: Option<Duration>,
}

impl Expect100 {
    pub const fn disabled() -> Self {
        Self { timeout: None }
    }

    pub const fn enabled() -> Self {
        Self {
            timeout: Some(Duration::from_secs(1)), // default timeout
        }
    }

    pub fn with_timeout<T>(timeout: T) -> Self
    where
        T: Into<Duration>
    {
        Self {
            timeout: Some(timeout.into())
        }
    }
}

impl Default for Expect100 {
    pub fn default() -> Self {
        Self::enabled()
    }
}

impl From<bool> for Expect100 {
    fn from(value: bool) -> Self {
        if value {
            Self::enabled()
        } else {
            Self::disabled()
        }
    }
}

Using it might look like:

Request::post("/")
    .expect_100(true) // default
    .expect_100(false) // don't want it
    .expect_100(Expect100::with_timeout(Duration::from_millis(100))) // keep it, but don't wait so long
    .body(bytes)?
    .send()?

@sagebind sagebind added the feature A new feature! label Feb 19, 2021
@tailrecur
Copy link
Author

tailrecur commented Feb 19, 2021

Thanks for the quick reply!

My early thoughts around this (like in this commit) are roughly along the same line of what you proposed.

Counter-intuitively, I'm finding that turning off the Expect header actually makes the POST request slower so that the total latency with or without the Expect header is roughly similar.

I'm actually building a http latency analysis tool similar to httpstat but to track behaviour across multiple requests with connection re-use. I set it up to make multiple requests with connection re-use and alternate between enabling and disabling the Expect: header. I see the following output with timing information:

Without-Expect: Status: 200 OK   DNS: 63ms   TCP:  94ms   TLS: 276ms   Start:   0ms   Request: 242ms   Total:  676ms
With-Expect   : Status: 200 OK   DNS:  0ms   TCP:   0ms   TLS:   0ms   Start:  79ms   Request:  92ms   Total:  171ms
Without-Expect: Status: 200 OK   DNS:  0ms   TCP:   0ms   TLS:   0ms   Start:   0ms   Request: 195ms   Total:  195ms
With-Expect   : Status: 200 OK   DNS:  0ms   TCP:   0ms   TLS:   0ms   Start:  70ms   Request:  89ms   Total:  159ms
Without-Expect: Status: 200 OK   DNS:  0ms   TCP:   0ms   TLS:   0ms   Start:   0ms   Request: 279ms   Total:  279ms
With-Expect   : Status: 200 OK   DNS:  0ms   TCP:   0ms   TLS:   0ms   Start: 137ms   Request:  67ms   Total:  204ms

where Start includes the Expect roundtrip and Request is the time to make the actual POST request. This behaviour seems to vary by OS (Debian vs OSX vs Oracle Linux) so I'm wondering if the underlying libcurl behaviour has something to do with this.

Thanks for the tip with CURLOPT_EXPECT_100_TIMEOUT_MS! Any other tips on this will be greatly appreciated.

I'll create a PR once I have this odd behaviour figured out.

@tailrecur
Copy link
Author

@sagebind Forgive the naive questions since I'm still learning about the Rust ecosystem.

I'd like to use the new method introduced in curl-rust in the PR that I will create for isahc.

Do I need to wait for cur-rust to cut a new release which includes these changes before I can use it here or is there another way to do this?

@sagebind
Copy link
Owner

Forgive the naive questions since I'm still learning about the Rust ecosystem.

No problem, I like questions. And welcome to Rust!

You'll need to wait for a new release since Isahc uses only released versions of dependencies. Indeed, Crates.io does not allow you to publish crates that depend on anything other than other published crates on Crates.io. It's a quality control thing, among other reasons.

I also happen to be a maintainer of curl-rust 😉 so I can probably have a new release published this week to be used here. For now, you can also work on developing this feature now by changing Cargo.toml to point to the Git master branch of curl-rust like this:

 [dependencies]
 crossbeam-utils = "0.8"
-curl = "0.4.34"
-curl-sys = "0.4.37"
+curl = { git = "https://github.com/alexcrichton/curl-rust" }
+curl-sys = { git = "https://github.com/alexcrichton/curl-rust" }
 futures-lite = "1.11"
 http = "0.2.1"

Of course we can't merge your PR in this state, but at least it can be worked on.

@tailrecur
Copy link
Author

tailrecur commented Feb 24, 2021

@sagebind Thanks for the reply.

I had already switched to using local path and then git references to unblock myself. Everything seems to work fine when compiling with gcc. I need a static binary for ease of sharing and am trying to use musl for this (my understanding is that musl does not dynamically link to other libraries like gcc)
However, the final binary segfaults immediately with with signal SIGSEGV and I'm trying to figure out how to debug this.

Anyways, I'll plod along and get back to this topic when a new upstream version becomes available.

@sagebind
Copy link
Owner

sagebind commented Mar 9, 2021

Both support for expect_100_timeout and fixes for TCP nodelay have now been included in the curl 0.4.35 release.

tailrecur added a commit to tailrecur/isahc that referenced this issue Mar 11, 2021
`curl 0.4.35` added the option to configure the timeout value that `curl` will wait for when using an `Expect` header before sending the request body.

This commit adds the ability to configure this timeout via the `Configurable` trait within `isahc`.

Fixes sagebind#303
tailrecur added a commit to tailrecur/isahc that referenced this issue Mar 11, 2021
`curl 0.4.35` added the option to configure the timeout value that `curl` will wait for when using an `Expect` header before sending the request body.

This commit adds the ability to configure this timeout via the `Configurable` trait within `isahc`.

Fixes sagebind#303
@sagebind sagebind added this to the 1.5.0 milestone Jul 28, 2021
sagebind added a commit that referenced this issue Aug 22, 2021
Add a configuration option for changing how POST requests behave under HTTP/1.1 and the `Expect: 100-continue` request behavior. Allow the user to change the timeout to a different duration or to disable the header entirely.

Replaces #311.

Fixes #303.
sagebind added a commit that referenced this issue Aug 22, 2021
Add a configuration option for changing how POST requests behave under HTTP/1.1 and the `Expect: 100-continue` request behavior. Allow the user to change the timeout to a different duration or to disable the header entirely.

Replaces #311.

Fixes #303.
@sagebind
Copy link
Owner

This can now be configured in the 1.5.0 release!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new feature!
Projects
None yet
2 participants