Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use url::Url instead of http::Uri #382

Open
troelsarvin opened this issue Mar 7, 2022 · 6 comments
Open

Use url::Url instead of http::Uri #382

troelsarvin opened this issue Mar 7, 2022 · 6 comments
Labels
breaking Requires breaking backwards compatibility

Comments

@troelsarvin
Copy link

isahc's reliance on http::Uri makes it cumbersome to work with URLs which have hostnames with non-ASCII characters.
I suggest that isahc switch to using url::Url instead of http::Uri.

Motivations:

  • In 2022, it should be straight forward to work with URLs which are international.
  • Also, sematically, I claim it is more clear for an HTTP client to work with URLs rather than URIs. It's hard to imagine isahc being used with non-URL URIs.

The following code shows the current state of affairs in some crates:

/*
    In Carto.toml:

[dependencies]
url = "2"
http = "0.2"
reqwest = "0.11"

*/

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url_str = "https://www.fc-højvang.dk/";

                                                    // Crate: url
    let url = url::Url::parse(url_str)?;            //
    println!("url: {}", url);                       // Prints URL with IDNA

                                                    // Crate: reqwest
    let reqwest_url = reqwest::Url::parse(url_str)?;//
    println!("reqwest_uri: {}", reqwest_url);       // Prints URL with IDNA

                                                    // Crate: http
    let http_uri = url_str.parse::<http::Uri>()?;   //
    println!("http_uri: {}", http_uri);             // FAILS
    /*
      Error message:
      InvalidUri(InvalidUriChar)
    */

    Ok(())
}
@sagebind
Copy link
Owner

sagebind commented Mar 8, 2022

I appreciate your concern on this, and I agree that Isahc ought to handle this better. However, this would be a breaking change and would require a 2.0 release at the very least. The problem is that the http crate is part of Isahc's public API -- isahc::Request is just a re-export of http::Request, so with the current design we can't choose whether or not http::Uri is used. Only the http crate can make that determination.

For an example that demonstrates this better:

use isahc::Request; // remember this is just `http::Request`

fn main() {
    let request = Request::get("https://sejs-svejbækbiler.dk/")
        .body(())
        .unwrap(); // panics here with `http::Error(InvalidUri(InvalidUriChar))`
                   // before any Isahc code even runs

    isahc::send(request).unwrap();
}

I see only two possible ways that we could handle this:

  1. The http crate is updated upstream to handle IDNA, either by default or with a feature flag, which we would enable.
  2. We remove the http crate from the public API and replace it with our own types, which among many things use the url crate internally.

If we could push for the first approach to get resolved then that would be most ideal, since we could fix this without breaking any existing code out there using Isahc. The second approach may have its own benefits, but also some drawbacks as well as requiring a significant API redesign.

When the http crate was first introduced, it was under the hope of being a common "interface crate" that would be generic enough that both HTTP clients and servers could adopt it and allowing the use of interoperable types between otherwise independent crates. This theoretically allows for things such as middleware to be written in an entirely generic way that would allow it to be used with any HTTP client or server instead of being tied to a specific implementation. This sounded like a noble goal, and Isahc was an early adopter those years ago when it was still young. In practice, I am not sure how well this vision materialized and may be worth some investigation.


As far as the URI/URL distinction, last time I looked that was a crazy rabbit hole as to which is more technically correct per the HTTP spec. In syntax, URLs are strictly speaking, always absolute, but HTTP allows relative URLs (which is a misnomer), so it is a bit looser than just URLs, but it is stricter than URIs which are very broad indeed. Then there's a mostly semantic debate as to what we call these "URIs as allowed by the HTTP spec", which some opt to call URIs (technically correct but not very precise) and others call URLs (more colloquially used, but not strictly correct).

Personally I don't really care about the semantic debate, you can call them URLs or URIs, as long as you're following the HTTP spec. Which http::Uri does as far as I know, in the most strict reading of the spec.

@sagebind sagebind added the breaking Requires breaking backwards compatibility label Mar 8, 2022
@troelsarvin
Copy link
Author

I've updated hyperium/http#259 pointing here.
Meanwhile, I wonder if it were possible to add some From/into functions to isahc, so that a url::Url could be used seamlessly with isahc through automatic conversoins between url:Url and http::Uri?

@sagebind
Copy link
Owner

sagebind commented Mar 8, 2022

Strictly speaking, it is not possible to implement conversions between url::Url and http::Uri from within Isahc due to the orphan rule; only the crate that defines a type can implement common traits for the type. So either the url crate or the http crate would have to implement a conversion from one to the other, though I'd doubt either crate would be willing to do that.

I'm not sure that would even help much in this scenario anyway, since Uri is still part of the public API and we can't change the http types to also accept a Url.

@Xuanwo
Copy link

Xuanwo commented Aug 19, 2022

First of all, replace http::Uri with url::Url is not acceptable.

The best solution for us is to address it from upstream. I submit PR hyperium/http#565 for that.

@Xuanwo Xuanwo moved this to 📋 Backlog in Xuanwo's Work Aug 19, 2022
@Xuanwo Xuanwo moved this from 📋 Backlog to 🔨 In Progress in Xuanwo's Work Aug 19, 2022
@Xuanwo Xuanwo moved this from 🔨 In Progress to 📬 Cancel in Xuanwo's Work Sep 2, 2022
@troelsarvin
Copy link
Author

Xuanwo's attempt at having a fix accepted in the http crate was not successfull, unfortunately :-(

@sagebind, could it be time for a version 2 of isahc, allowing for the API change?

@sagebind
Copy link
Owner

@troelsarvin I'm not currently planning on making this breaking change, even though technically yes, it could be done as part of version 2. This change would not be insignificant as it would require removing the http crate from the Isahc public API, which currently forms the backbone of Isahc's API. I'm just not sure that the juice is worth the squeeze to upend the main crate API in order to resolve this issue right now, or if the effort required is available.

All I can say is that the changes for version 2 are already planned, and this isn't in that list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Requires breaking backwards compatibility
Projects
None yet
Development

No branches or pull requests

3 participants